Starting with a working Scrapy spider that yields to a CSV, I modify the code, add pipelines.py, configure a MySQL database, and troubleshoot user permissions and Scrapy files until we have a working spider sending data to MySQL.
The MySQL server is "remote", so I highlight the difference between:
GRANT ALL PRIVILEGES ON newz.* TO 'user1'@'localhost';
and
GRANT ALL PRIVILEGES ON newz.* TO 'user1'@'%';
(the first only permits connections from the MySQL server itself; the second lets 'user1' connect from any host, which is what a remote Scrapy box needs)
There is still more to do on this, but if you can follow and use my code, you will be able to add more spiders easily, as they will use the same items.py file and the same columns in the database.
Pipelines.py will also be set up and ready; the only adjustment to make is to stop it deleting the table each time.
(Comment out line 39 in pipelines.py)
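For orientation, here is a minimal sketch of what a Scrapy-to-MySQL item pipeline like this typically looks like. The host, credentials, table name, and column names are illustrative placeholders, not necessarily those in the Scrapy14 repo, and it assumes the pymysql driver:

```python
class MySQLPipeline:
    """Sketch of a Scrapy item pipeline writing items to a MySQL table."""

    create_sql = (
        "CREATE TABLE IF NOT EXISTS newz ("
        "id INT AUTO_INCREMENT PRIMARY KEY, "
        "title VARCHAR(255), "
        "url VARCHAR(1000))"  # generous VARCHAR so long URLs are not truncated
    )
    insert_sql = "INSERT INTO newz (title, url) VALUES (%s, %s)"

    def open_spider(self, spider):
        import pymysql  # imported lazily so the module loads even without the driver
        self.conn = pymysql.connect(
            host="192.168.1.10",   # hypothetical remote MySQL host
            user="user1",
            password="secret",     # placeholder credentials
            database="newz",
        )
        self.cur = self.conn.cursor()
        # self.cur.execute("DROP TABLE IF EXISTS newz")  # comment out to keep old rows
        self.cur.execute(self.create_sql)

    def process_item(self, item, spider):
        # One parameterised INSERT per scraped item
        self.cur.execute(self.insert_sql, (item.get("title"), item.get("url")))
        self.conn.commit()
        return item

    def close_spider(self, spider):
        self.conn.close()
```

The commented-out DROP TABLE line plays the same role as line 39 mentioned above: leave it active while testing, comment it out once you want the data to accumulate.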
All of the code is at:
https://github.com/RGGH/Scrapy14
I am purposefully documenting this project in depth, as hopefully you can use it for reference when making your own Scrapy/MySQL spiders.
## Chapters ##
0:00 Intro - me talking
3:12 Showing GitHub - Scrapy14
4:29 "ModuleNotFoundError" - a solution
5:42 Python script to format headers into a headers dictionary
5:58 Running Scrapy Spider as a Script
11:11 Empty database!
17:19 Comparing against a previous successful MySQL project
20:22 Pipelines now working
24:16 Allow larger VARCHAR for URL
26:00 Working!
Visit redandgreen blog for more Tutorials
=========================================
http://redandgreen.co.uk/about/blog/

Subscribe to the YouTube Channel
=================================
https://www.youtube.com/c/DrPiCode

Follow on Twitter - to get notified of new videos
=================================================
https://twitter.com/RngWeb

Buy Dr Pi a coffee (or Tea)
☕
https://www.buymeacoffee.com/DrPi

Thumbs up yeah? (cos Algos..)
#webscraping #MySQL #python
...
https://www.youtube.com/watch?v=4_bUfzSg0ds