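A Scrapy crawler for Hexun (和讯网) blogs that collects post information such as title, link, view count, and comment count. (The data is stored in MySQL through a database interface.)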

TrumanH/-hexun-blog

-hexun-blog

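A crawler built on the Scrapy framework that scrapes blog post data (title, link, view count, comment count) from Hexun users' blogs and automatically writes it to a MySQL database through the pymysql interface. It disguises its requests as coming from a browser and follows pagination automatically; both features are implemented and working.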
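As a rough illustration of the storage step described above, here is a minimal sketch of a Scrapy item pipeline that inserts each scraped post into MySQL with pymysql. It is not the repository's actual code: the table name `hexun_posts`, the column and item field names (`title`, `link`, `hits`, `comments`), the class name `MySQLPipeline`, and the connection settings are all assumptions made for the example.

```python
# Hypothetical sketch: a Scrapy item pipeline that stores posts in MySQL.
# Table, column, and field names below are assumptions, not the project's schema.

INSERT_SQL = (
    "INSERT INTO hexun_posts (title, link, hits, comments) "
    "VALUES (%s, %s, %s, %s)"
)


def to_row(item):
    """Turn a scraped item (dict-like) into the parameter tuple for INSERT_SQL."""
    return (
        str(item["title"]).strip(),
        str(item["link"]).strip(),
        int(item["hits"]),      # view count
        int(item["comments"]),  # comment count
    )


class MySQLPipeline:
    """Opens one connection per crawl and inserts every item it receives."""

    def open_spider(self, spider):
        # Imported lazily so this module can be loaded without the driver installed.
        import pymysql

        # Placeholder credentials; replace with your own settings.
        self.conn = pymysql.connect(
            host="127.0.0.1",
            user="root",
            password="change-me",
            database="hexun",
            charset="utf8mb4",
        )

    def process_item(self, item, spider):
        # Parameterized query: the driver escapes the values.
        with self.conn.cursor() as cursor:
            cursor.execute(INSERT_SQL, to_row(item))
        self.conn.commit()
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        self.conn.close()
```

In a Scrapy project, a pipeline like this would be enabled through the `ITEM_PIPELINES` setting in `settings.py`, and a browser disguise is commonly configured there as well through the `USER_AGENT` setting. Whether this project does it that way is an assumption.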
