This repository has been archived by the owner on Jan 31, 2024. It is now read-only.

This Python script scrapes data from 'https://almeera.online/' and saves it in JSON files. It uses BeautifulSoup for parsing, requests for web requests, and joblib for parallel processing. The AlmeeraScrapper class extracts category, subcategory, and product data, saving it in structured JSON files.

Hassanzamir47/almeera_scraper

Almeera Web Scraper

This Python script scrapes data from the Almeera website (https://almeera.online/). It extracts information about categories, subcategories, and products and saves the data in JSON files. It uses BeautifulSoup for HTML parsing, requests for HTTP requests, and joblib for parallel processing.

Features

  • Extracts data from the 'Almeera' website.
  • Retrieves information about categories, subcategories, and products.
  • Downloads and stores images associated with categories and products.
  • Parallel processing for faster data extraction.
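As a rough illustration of the parallel-processing feature, the sketch below fans a per-category function out over several URLs with joblib. The function names `scrape_category` and `scrape_all` and the example URLs are hypothetical and not taken from the repository, and the worker only derives a name from the URL instead of fetching anything. The fallback to `concurrent.futures` is there only so the sketch still runs where joblib is not installed.

```python
from concurrent.futures import ThreadPoolExecutor

try:
    from joblib import Parallel, delayed
except ImportError:  # fallback so the sketch still runs without joblib
    Parallel = None


def scrape_category(url):
    """Hypothetical stand-in for the real per-category scraping work."""
    # A real worker would download and parse the page at `url`.
    return {"url": url, "name": url.rstrip("/").rsplit("/", 1)[-1]}


def scrape_all(urls, n_jobs=4):
    """Run scrape_category over every URL concurrently, preserving input order."""
    if Parallel is not None:
        # Threads suit I/O-bound work such as HTTP requests.
        return Parallel(n_jobs=n_jobs, prefer="threads")(
            delayed(scrape_category)(u) for u in urls
        )
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(scrape_category, urls))


category_urls = [
    "https://almeera.online/example-category-a/",  # hypothetical URLs
    "https://almeera.online/example-category-b/",
]
results = scrape_all(category_urls)
```

Both code paths return results in the same order as the input URLs, so each result can be matched back to its category.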

Requirements

Before running the script, make sure the following Python libraries are installed:

  • BeautifulSoup
  • requests
  • joblib
  • tqdm

You can install these libraries using pip:

    pip install beautifulsoup4 requests joblib tqdm
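To confirm that the libraries are available before running the scraper, a small check along these lines may help. It is a sketch and not part of the repository; `missing_modules` is a hypothetical helper. Note that the `beautifulsoup4` package is imported under the name `bs4`.

```python
import importlib.util

# Import names can differ from pip names: beautifulsoup4 is imported as bs4.
REQUIRED_MODULES = ["bs4", "requests", "joblib", "tqdm"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```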

Usage

  • Clone or download this repository to your local machine.
  • Open the script in a Python environment that meets the requirements mentioned above.
  • Customize the main_page_url, output_jsons_path, and output_images_path variables according to your needs.
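The configuration step might look roughly like the sketch below. The three variable names come from the usage notes above; the values are placeholders, and `prepare_output_dirs` is a hypothetical helper (not from the repository) that creates the output folders before scraping starts.

```python
from pathlib import Path

# Variable names come from the README; the values are placeholders.
main_page_url = "https://almeera.online/"
output_jsons_path = "output/jsons"
output_images_path = "output/images"


def prepare_output_dirs(*paths):
    """Create each output directory (and its parents) if it does not exist yet."""
    created = []
    for path in paths:
        directory = Path(path)
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


if __name__ == "__main__":
    prepare_output_dirs(output_jsons_path, output_images_path)
```

Creating the directories up front avoids failures partway through a long scrape when the first file is written.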

Output

The script will generate JSON files containing structured data for each category, subcategory, and their associated products. Images will also be downloaded and stored in the specified output directory.
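The README does not document the exact JSON schema, so the following is only a sketch of how a nested category record could be written to disk. The field names (`name`, `image`, `subcategories`, `products`, `price`) and the helper `save_category` are assumptions made for illustration.

```python
import json
from pathlib import Path


def save_category(category, output_dir):
    """Write one category record to <output_dir>/<name>.json and return the path."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"{category['name']}.json"
    with out_file.open("w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps non-ASCII product names readable in the file.
        json.dump(category, fh, ensure_ascii=False, indent=2)
    return out_file


# Hypothetical record layout: category -> subcategories -> products.
example_category = {
    "name": "example-category",
    "image": "images/example-category.jpg",
    "subcategories": [
        {
            "name": "example-subcategory",
            "products": [
                {
                    "name": "Example product",
                    "price": "0.00",
                    "image": "images/example-product.jpg",
                }
            ],
        }
    ],
}
```

Calling `save_category(example_category, "output/jsons")` would write `output/jsons/example-category.json`. Storing image paths relative to the image output directory keeps the JSON files portable.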
