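A Scrapy-based crawler that scrapes blog post data (title, link, hit count, comment count) from user blogs on Hexun (和讯网) and automatically writes it to a MySQL database through the pymysql interface. It sends a browser-like User-Agent to disguise the crawler and follows pagination automatically. The crawl has been run successfully.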
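The storage step described above can be sketched as a Scrapy item pipeline. This is a minimal illustration under stated assumptions, not the repository's actual code: the class name `MySQLBlogPipeline`, the table `posts`, its column names, the connection parameters, and the `connect` hook are all hypothetical. Only `pymysql.connect`, the cursor and commit calls, and the pipeline method names `open_spider`, `process_item`, and `close_spider` come from the real libraries.

```python
class MySQLBlogPipeline:
    """Sketch of a Scrapy item pipeline that writes each blog post to MySQL."""

    # Hypothetical table and column names; adjust to the real schema.
    INSERT_SQL = (
        "INSERT INTO posts (title, link, hits, comments) "
        "VALUES (%s, %s, %s, %s)"
    )

    def __init__(self, connect=None):
        # `connect` is a hypothetical hook: any callable returning a
        # DB-API style connection. It defaults to a pymysql connection.
        self._connect = connect or self._default_connect
        self.conn = None

    @staticmethod
    def _default_connect():
        # Imported lazily so the class can be loaded without pymysql installed.
        import pymysql

        # Assumed connection parameters, for illustration only.
        return pymysql.connect(
            host="127.0.0.1",
            user="root",
            password="",
            database="hexun",
            charset="utf8mb4",
        )

    def open_spider(self, spider):
        # Called by Scrapy once when the spider starts.
        self.conn = self._connect()

    def process_item(self, item, spider):
        # Parameterized query: the driver escapes the values,
        # so titles containing quotes cannot break the statement.
        with self.conn.cursor() as cur:
            cur.execute(
                self.INSERT_SQL,
                (
                    item["title"],
                    item["link"],
                    int(item["hits"]),
                    int(item["comments"]),
                ),
            )
        self.conn.commit()
        return item

    def close_spider(self, spider):
        # Called by Scrapy once when the spider finishes.
        if self.conn is not None:
            self.conn.close()
```

A pipeline like this would be enabled through Scrapy's `ITEM_PIPELINES` setting, and the browser disguise mentioned above is typically configured through the `USER_AGENT` setting. The exact module path and header string depend on the project layout and are not given in the source.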
TrumanH/-hexun-blog
About
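A Scrapy crawler for Hexun (和讯网) blogs that collects post title, link, hit count, comment count, and similar fields. (The data is stored in MySQL through a database interface.)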