This project focuses on web scraping using Python libraries like Requests and BeautifulSoup 🐍 to extract and analyze data from websites 🌐. The extracted data is structured into tables using Pandas 📊, enabling easy analysis and visualization 📈, helping to gather valuable insights from web content. 💡

Bhushan148/Web-scraping

🕵️‍♂️ Web Scraping Projects 📊

Welcome to my Web Scraping projects repository! Here, you'll find various projects demonstrating my ability to extract data from different web sources using Python. Each project showcases specific web scraping techniques and data extraction methods.

🚀 Projects Overview

1. Save HTML Webpage 🖥️

Objective:
To scrape data from a locally saved copy of a complete webpage, so that extraction runs against a fixed snapshot and the results stay accurate and complete.

Technologies Used:

  • Python 🐍
  • Requests library 🌐
  • BeautifulSoup library 🍲
  • Pandas 📊

Project Description:
In this project, I:

  • Saved the entire webpage locally using the Requests library to ensure a static version of the content. 🗂️
  • Extracted data step by step from the saved HTML using BeautifulSoup. 🔍
  • Compiled all extracted data into a structured format using Pandas. 📈
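
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `save_page` and `parse_items`, and the `div.item`, `.name`, and `.price` selectors, are hypothetical placeholders for whatever markup the target page really uses.

```python
from pathlib import Path

import pandas as pd
import requests
from bs4 import BeautifulSoup


def save_page(url: str, path: Path) -> None:
    """Download a page once and keep a static copy on disk."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    path.write_text(response.text, encoding="utf-8")


def parse_items(html: str) -> pd.DataFrame:
    """Build one row per <div class="item"> (hypothetical markup)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.select("div.item"):
        name = item.select_one(".name")
        price = item.select_one(".price")
        rows.append({
            # A missing element becomes None instead of raising an error.
            "name": name.get_text(strip=True) if name else None,
            "price": price.get_text(strip=True) if price else None,
        })
    return pd.DataFrame(rows, columns=["name", "price"])
```

A typical run would call `save_page(url, Path("page.html"))` once and then `parse_items(Path("page.html").read_text(encoding="utf-8"))` as often as needed. Because parsing reads the saved file rather than the live site, the selectors can be adjusted repeatedly without sending new requests.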

Outcome:
Successfully demonstrated web scraping by saving the webpage first, ensuring data accuracy, and extracting and structuring the information into a tabular format. 🏆

See Project Jupyter File Here:
Save HTML Webpage Jupyter File


2. Web Scraping IMDB Webpage 🎬

Objective:
To perform live web scraping on the IMDB website to extract and store movie data.

Technologies Used:

  • Python 🐍
  • Requests library 🌐
  • BeautifulSoup library 🍲
  • Pandas 📊

Project Description:
In this project, I:

  • Fetched live data directly from the IMDB website using the Requests library. 🌍
  • Parsed the HTML content with BeautifulSoup to extract movie information such as titles, ratings, and release dates. 🍿
  • Structured the data into a Pandas DataFrame for easy analysis and storage. 📋
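
A live-scraping flow like the one described above might look like the sketch below. It makes several assumptions: the `li.movie`, `.title`, `.rating`, and `.release_date` selectors are invented for illustration and do not reflect IMDB's real markup, the `HEADERS` value is an arbitrary example, and the helper names are hypothetical.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Assumption: some sites reject the default Requests User-Agent,
# so a descriptive one is sent instead. The value is only an example.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; scraping-demo)"}


def fetch_html(url: str) -> str:
    """Fetch a live page and return its HTML text."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.text


def _text(node, selector: str):
    """Return the stripped text of the first match, or None if absent."""
    found = node.select_one(selector)
    return found.get_text(strip=True) if found else None


def parse_movies(html: str) -> pd.DataFrame:
    """Collect title, rating, and release date per movie entry (hypothetical selectors)."""
    soup = BeautifulSoup(html, "html.parser")
    records = [
        {
            "title": _text(card, ".title"),
            "rating": _text(card, ".rating"),
            "release_date": _text(card, ".release_date"),
        }
        for card in soup.select("li.movie")
    ]
    df = pd.DataFrame(records, columns=["title", "rating", "release_date"])
    # Ratings arrive as text; unparseable values become NaN.
    df["rating"] = pd.to_numeric(df["rating"], errors="coerce")
    return df
```

Keeping `fetch_html` separate from `parse_movies` means the parsing logic can be checked against a small HTML string before it is pointed at the live site, whose markup may change without notice.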

Outcome:
Successfully extracted and stored movie data in a tabular format, demonstrating effective live web scraping techniques. 🚀

See Project Jupyter File Here:
IMDB Web Scraping Jupyter File


3. Web Scraping Wikipedia Tables 📚

Objective:
To scrape table data from a Wikipedia page, targeting the page's 8th table, which contains population data.

Technologies Used:

  • Python 🐍
  • Requests library 🌐
  • BeautifulSoup library 🍲
  • Pandas 📊

Project Description:
In this project, I:

  • Fetched HTML content of a Wikipedia page with multiple tables using Requests. 📜
  • Parsed and extracted data from the 8th table using BeautifulSoup. 📊
  • Converted the data into a Pandas DataFrame for structured storage and analysis. 🗃️
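
Picking one table out of several can be sketched as below. The function name `extract_table` and its 1-based `position` parameter are hypothetical, and the sketch assumes the table's first row holds the column headers. Rows whose cell count differs from the header (for example because of merged cells) are skipped rather than repaired.

```python
import pandas as pd
from bs4 import BeautifulSoup


def extract_table(html: str, position: int) -> pd.DataFrame:
    """Return the table at the given 1-based position as a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    tables = soup.find_all("table")  # document order; nested tables count too
    if position < 1 or position > len(tables):
        raise IndexError(f"page has {len(tables)} tables, asked for #{position}")
    table = tables[position - 1]

    # Read every row as a list of cell texts, header cells included.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    rows = [row for row in rows if row]  # drop rows without cells
    if not rows:
        return pd.DataFrame()

    header, body = rows[0], rows[1:]
    # Keep only rows that line up with the header.
    body = [row for row in body if len(row) == len(header)]
    return pd.DataFrame(body, columns=header)
```

With the page HTML fetched through Requests, `extract_table(html, 8)` would select the 8th table. All values come back as strings, so numeric columns such as population counts still need cleaning (removing thousands separators, then `pd.to_numeric`) before analysis.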

Outcome:
Successfully extracted population data from a specific Wikipedia table, showcasing skills in handling structured data and extracting useful information from web tables. 🎯

See Project Jupyter File Here:
Wikipedia Tables Web Scraping Jupyter File


🔗 Additional Information


🎯 Inspiration

"Every webpage tells a story waiting to be uncovered. Keep scraping, keep discovering!" ✨🔍


👨‍💻 Project Developed By

Bhushan Gawali - Data Analyst

Feel free to explore the repositories for detailed insights and code. If you have any questions or suggestions, don't hesitate to reach out!

Happy scraping! 🌟
