Welcome to my Web Scraping projects repository! Here, you'll find various projects demonstrating my ability to extract data from different web sources using Python. Each project showcases specific web scraping techniques and data extraction methods.
Objective:
To scrape data from a locally saved copy of a complete webpage, so that extraction runs against a fixed snapshot of the content and the results stay accurate and complete.
Technologies Used:
- Python
- Requests library
- BeautifulSoup library
- Pandas
Project Description:
In this project, I:
- Saved the entire webpage locally using the Requests library to ensure a static version of the content.
- Extracted data step by step from the saved HTML using BeautifulSoup.
- Compiled all extracted data into a structured format using Pandas.
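
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names, the file path argument, and the `div.item`, `h2`, and `span.price` selectors are all hypothetical placeholders, since the real selectors depend on the page being scraped.

```python
from pathlib import Path

import pandas as pd
import requests
from bs4 import BeautifulSoup


def save_page(url: str, path: Path) -> Path:
    """Download the page once and store the raw HTML on disk."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    path.write_text(response.text, encoding="utf-8")
    return path


def load_saved_page(path: Path) -> str:
    """Read the saved HTML back, so parsing never touches the network."""
    return path.read_text(encoding="utf-8")


def parse_items(html: str) -> pd.DataFrame:
    """Extract one row per item block (selectors are assumptions)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.item"):  # hypothetical container
        name = card.select_one("h2")  # hypothetical name element
        price = card.select_one("span.price")  # hypothetical price element
        rows.append(
            {
                "name": name.get_text(strip=True) if name else None,
                "price": price.get_text(strip=True) if price else None,
            }
        )
    return pd.DataFrame(rows, columns=["name", "price"])
```

A typical run would call `save_page(url, Path("page.html"))` once, then `parse_items(load_saved_page(Path("page.html")))` as many times as needed while refining the selectors, without requesting the page again.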
Outcome:
Successfully demonstrated web scraping by saving the webpage first, ensuring data accuracy, and extracting and structuring the information into a tabular format.
See Project Jupyter File Here:
Save HTML Webpage Jupyter File
Objective:
To perform live web scraping on the IMDB website to extract and store movie data.
Technologies Used:
- Python
- Requests library
- BeautifulSoup library
- Pandas
Project Description:
In this project, I:
- Fetched live data directly from the IMDB website using the Requests library.
- Parsed the HTML content with BeautifulSoup to extract movie information such as titles, ratings, and release dates.
- Structured the data into a Pandas DataFrame for easy analysis and storage.
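
A minimal sketch of this fetch, parse, and structure flow is shown below. It makes several assumptions: the `li.movie`, `.title`, `.rating`, and `.release-date` selectors are hypothetical stand-ins (IMDB's real markup differs and changes over time), and the `User-Agent` header is an illustrative value, included because some sites reject requests that use the default client identifier.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Illustrative header value, not taken from the project notebook.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}


def fetch_html(url: str) -> str:
    """Fetch a live page and return its HTML text."""
    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.text


def text_or_none(node):
    """Return stripped text for a tag, or None if the tag is missing."""
    return node.get_text(strip=True) if node is not None else None


def parse_movies(html: str) -> pd.DataFrame:
    """Build one row per movie entry (all selectors are assumptions)."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for item in soup.select("li.movie"):  # hypothetical entry container
        records.append(
            {
                "title": text_or_none(item.select_one(".title")),
                "rating": text_or_none(item.select_one(".rating")),
                "release_date": text_or_none(item.select_one(".release-date")),
            }
        )
    return pd.DataFrame(records, columns=["title", "rating", "release_date"])
```

Keeping the network call in `fetch_html` and the extraction in `parse_movies` means the parser can be checked against a small HTML string before it is pointed at the live site, for example `parse_movies(fetch_html(url))` once the selectors match the real page.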
Outcome:
Successfully extracted and stored movie data in a tabular format, demonstrating effective live web scraping techniques.
See Project Jupyter File Here:
IMDB Web Scraping Jupyter File
Objective:
To scrape table data from Wikipedia, focusing on the 8th table of a page with population data.
Technologies Used:
- Python
- Requests library
- BeautifulSoup library
- Pandas
Project Description:
In this project, I:
- Fetched HTML content of a Wikipedia page with multiple tables using Requests.
- Parsed and extracted data from the 8th table using BeautifulSoup.
- Converted the data into a Pandas DataFrame for structured storage and analysis.
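
Selecting one table by position and converting it to a DataFrame might look like the sketch below. The function name `nth_table_to_dataframe` is hypothetical, and the code assumes that "8th table" means the eighth `<table>` element in document order, that the first row holds the column headers, and that the table is a simple rectangle with no `rowspan`, `colspan`, or nested tables.

```python
import pandas as pd
from bs4 import BeautifulSoup


def nth_table_to_dataframe(html: str, n: int) -> pd.DataFrame:
    """Convert the n-th <table> on a page (1-based) into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    tables = soup.find_all("table")
    if n < 1 or n > len(tables):
        raise IndexError(f"page has {len(tables)} tables, asked for table {n}")
    table = tables[n - 1]  # the 8th table is index 7

    # Collect the text of every header and data cell, row by row.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    rows = [row for row in rows if row]  # drop rows without cells
    if not rows:
        return pd.DataFrame()

    header, *body = rows  # assumption: first row is the header
    return pd.DataFrame(body, columns=header)
```

With the page HTML already fetched into a string, `nth_table_to_dataframe(html, 8)` would return the eighth table. Cell values come back as strings, so numeric columns such as population counts still need cleaning (removing thousands separators, converting types) before analysis.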
Outcome:
Successfully extracted population data from a specific Wikipedia table, showcasing skills in handling structured data and extracting useful information from web tables.
See Project Jupyter File Here:
Wikipedia Tables Web Scraping Jupyter File
- Complete Webpage Project: Complete Webpage Here
- IMDB Top 250 Movies: IMDB Top 250
- Wikipedia Demographics of India: Wikipedia Page
"Every webpage tells a story waiting to be uncovered. Keep scraping, keep discovering!"
Bhushan Gawali - Data Analyst
Feel free to explore the repositories for detailed insights and code. If you have any questions or suggestions, don't hesitate to reach out!
Happy scraping!