Auto-scrape is a platform for building, managing, and remotely deploying web scrapers. It provides the "essential infrastructure" for web scraping while allowing developers to focus on writing Selenium web scraping scripts in a simple and familiar way.
It is built using the Flask framework and uses SQLAlchemy to interface with the SQL database of your choice.
GIF screenshots demonstrating the user interface in action can be found here.
- live progress logging
- database for saving scraped data - no database experience required!
- CSV export
- multiple simultaneous scrapers
- basic resource management
- basic user authentication for remote deployments (see the fea-simple-auth branch)
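A scraper script on a platform like this is typically a plain Selenium routine that collects rows of data. The sketch below is a hypothetical example, not this repository's actual script API: the `parse_titles` and `run_scraper` names and the regex-based extraction are assumptions, and the lazy `selenium` import assumes the package and a matching chromedriver are installed.

```python
import re


def parse_titles(page_source):
    """Pure helper: pull <h2> heading text out of raw HTML.

    Kept free of Selenium so it can be tested without a browser.
    """
    return re.findall(r"<h2[^>]*>(.*?)</h2>", page_source, re.DOTALL)


def run_scraper(url):
    """Hypothetical scraper entry point: fetch a page, return its titles."""
    # Imported lazily so the parsing helper above stays usable without a browser.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumes chromedriver is on PATH or in /autoscrape
    try:
        driver.get(url)
        return parse_titles(driver.page_source)
    finally:
        driver.quit()
```

Keeping the parsing logic separate from the browser-driving code makes the script easier to test and to re-run against saved page sources.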
- Download chromedriver, place it in `/autoscrape`, and rename it to `chromedriver`.
- Install dependencies:
  ```shell
  pip install -r requirements.txt
  ```
- Set environment variables:

  Windows:
  ```powershell
  $env:AUTOSCRAPE_ADMIN_USERNAME="your_admin_username"
  $env:AUTOSCRAPE_ADMIN_PASSWORD="your_admin_password"
  ```
  macOS / Linux:
  ```shell
  export AUTOSCRAPE_ADMIN_USERNAME="your_admin_username"
  export AUTOSCRAPE_ADMIN_PASSWORD="your_admin_password"
  ```
You can also store authentication details this way for scrapers that run behind a paywall.
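On the application side, these variables would be read from the environment at startup. A minimal sketch, assuming a helper like the following (the `get_required_env` name is an illustration, not code from this repository):

```python
import os


def get_required_env(name):
    """Read a required setting from the environment, failing fast if unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# e.g. at app startup:
# admin_user = get_required_env("AUTOSCRAPE_ADMIN_USERNAME")
# admin_pass = get_required_env("AUTOSCRAPE_ADMIN_PASSWORD")
```

Failing fast on a missing variable gives a clearer error than an authentication check silently comparing against `None` later.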
- Start scraping:
  - Windows:
    ```powershell
    ./dev.ps1
    ```
  - macOS / Linux:
    ```shell
    source ./dev.sh
    ```