
Jack0thy/webscraping-and-analysis-of-medium-articles

 
 


Web Scraping and Analysis of Medium articles

Web scraping automatically extracts data from web pages and presents it in a format you can easily make sense of. We use Python as our scraping language, together with a simple and powerful library, BeautifulSoup.
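As a minimal sketch of what parsing with BeautifulSoup looks like, the snippet below extracts a few fields from an inline HTML string. The markup, the class names (`article`, `author`, `claps`), and the `parse_articles` helper are made up for illustration; Medium's real page structure is different and changes over time, so the selectors would need to be adapted.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<div class="article">
  <h3>Intro to Neural Networks</h3>
  <a class="author" href="/@alice">Alice</a>
  <span class="claps">120</span>
</div>
<div class="article">
  <h3>Gradient Descent Explained</h3>
  <a class="author" href="/@bob">Bob</a>
  <span class="claps">45</span>
</div>
"""


def parse_articles(html):
    """Return one dict per article card found in the HTML."""
    # "html.parser" is Python's built-in parser, so no extra backend is needed.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="article"):
        rows.append({
            "title": card.find("h3").get_text(strip=True),
            "author": card.find("a", class_="author").get_text(strip=True),
            "claps": int(card.find("span", class_="claps").get_text(strip=True)),
        })
    return rows


articles = parse_articles(SAMPLE_HTML)
for article in articles:
    print(article)
```

Returning plain dictionaries keeps the parsing step separate from any later analysis, which can load the rows into whatever structure it prefers.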

To load pages whose content is rendered dynamically, we use Selenium together with ChromeDriver.

Selenium WebDriver is a collection of open-source APIs for automating web browsers. It is mainly used to automate web application testing and verify that an application works as expected. It supports many browsers, including Firefox, Chrome, IE, and Safari.
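One common way to use Selenium for dynamically loaded pages is to scroll until the page stops growing and then hand the rendered HTML to the parser. The sketch below assumes that approach; it is not necessarily what the notebook in this repository does. The helper names `make_headless_chrome` and `load_full_page` and the default values are illustrative.

```python
import time


def make_headless_chrome():
    """Build a headless Chrome driver (needs selenium, Chrome, and ChromeDriver)."""
    # Imported lazily so load_full_page can be exercised without a browser.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def load_full_page(driver, url, max_scrolls=10, pause=1.0):
    """Open url, scroll until the page height stops changing, return the HTML."""
    driver.get(url)
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):
        # Jump to the bottom so the page requests its next batch of content.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the new content time to load
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new appeared, so we have reached the end
        last_height = new_height
    return driver.page_source
```

A typical call would be `html = load_full_page(make_headless_chrome(), url)`, followed by `driver.quit()` once the HTML has been captured. The `max_scrolls` cap keeps the loop bounded on pages that scroll indefinitely.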

Scraping Rules

  • You should check a website’s Terms and Conditions before you scrape it. Be careful to read the statements about legal use of data. Usually, the data you scrape should not be used for commercial purposes.
  • Do not request data from the website too aggressively with your program (also known as spamming), as this may break the website. Make sure your program behaves in a reasonable manner (i.e. acts like a human). One request for one webpage per second is good practice.
  • The layout of a website may change from time to time, so revisit the site regularly and update your code as needed.
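The one-request-per-second guideline above can be enforced in code. The following is a minimal sketch using only the standard library; `fetch_politely` and its parameters are hypothetical names, and `fetch` is any callable that downloads a single URL.

```python
import time


def fetch_politely(urls, fetch, min_interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Call fetch(url) for each URL, starting requests at least min_interval seconds apart."""
    results = {}
    last_request = None
    for url in urls:
        if last_request is not None:
            # Sleep only for whatever part of the interval has not yet elapsed.
            wait = min_interval - (clock() - last_request)
            if wait > 0:
                sleep(wait)
        last_request = clock()
        results[url] = fetch(url)
    return results


# Example with a stand-in fetcher; a real one might wrap urllib.request.urlopen.
pages = fetch_politely(
    ["https://example.com/a", "https://example.com/b"],
    fetch=lambda url: "<html>placeholder for " + url + "</html>",
    min_interval=0.1,
)
print(sorted(pages))
```

The `clock` and `sleep` arguments exist only so the timing logic can be tested without real delays; in normal use the defaults are fine.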

Exploratory analysis of the scraped data is also included.

The analysis breaks the articles down:

  • By author
  • By month
  • By tag, and so on

The resulting visualizations help us better understand data-science-related Medium articles.
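The author, month, and tag breakdowns can be sketched with simple counts. The records below are invented sample data, not results from this project, and the field names (`author`, `published`, `tags`) are assumptions. The standard library is used here to keep the example dependency-free; in a notebook, a pandas `groupby` would be a more typical choice.

```python
from collections import Counter
from datetime import date

# Invented sample records; real rows would come from the scraping step.
articles = [
    {"author": "Alice", "published": date(2019, 1, 14), "tags": ["Machine Learning", "AI"]},
    {"author": "Bob", "published": date(2019, 1, 30), "tags": ["Deep Learning"]},
    {"author": "Alice", "published": date(2019, 2, 3), "tags": ["AI", "Deep Learning"]},
]

# Number of articles per author.
by_author = Counter(a["author"] for a in articles)

# Number of articles per calendar month, keyed as "YYYY-MM".
by_month = Counter(a["published"].strftime("%Y-%m") for a in articles)

# Number of articles per tag; an article with several tags counts once for each.
by_tag = Counter(tag for a in articles for tag in a["tags"])

print(by_author.most_common())
print(sorted(by_month.items()))
print(by_tag.most_common())
```

Each `Counter` maps directly onto a bar chart, with the keys as categories and the counts as bar heights.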

About

Scraping Medium articles tagged ML, DL, and AI, and analyzing the results


Languages

  • Jupyter Notebook 100.0%