GitHub - Hassanzamir47/almeera_scraper: This Python script scrapes data from 'https://almeera.online/' and saves it in JSON files. It uses BeautifulSoup for parsing, requests for web requests, and joblib for parallel processing. The AlmeeraScrapper class extracts category, subcategory, and product data, saving it in structured JSON files.

Almeera Web Scraper

This Python script is designed to scrape data from the 'Almeera' website (https://almeera.online/). It can extract information about categories, subcategories, and products, and save the data in JSON files. The script uses libraries like BeautifulSoup, requests, and joblib for efficient web scraping and data processing.

Features

Extracts data from the 'Almeera' website.
Retrieves information about categories, subcategories, and products.
Downloads and stores images associated with categories and products.
Parallel processing for faster data extraction.

Requirements Before running the script, make sure you have the following Python libraries installed:

BeautifulSoup
requests
joblib
tqdm

You can install these libraries using pip: pip install beautifulsoup4 requests joblib tqdm

Usage

Clone or download this repository to your local machine.
Open the script in a Python environment that meets the requirements mentioned above.
Customize the main_page_url, output_jsons_path, and output_images_path variables according to your needs.

Output The script will generate JSON files containing structured data for each category, subcategory, and their associated products. Images will also be downloaded and stored in the specified output directory.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
Code		Code
Data/Category JSONs		Data/Category JSONs
images		images
README.md		README.md
documentation.pdf		documentation.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

Hassanzamir47/almeera_scraper

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages