ScrapeGoat

ScrapeGoat is an advanced web scraping tool that leverages artificial intelligence and browser automation to meet diverse scraping needs. Whether you're comparing prices across shopping websites, tracking Instagram user posts, or automating social media uploads, ScrapeGoat provides a powerful and flexible solution.

Introduction

Scrape the web like a GOAT! ScrapeGoat is the ultimate web scraping tool, powered by AI steroids. This beast of a scraper empowers users to efficiently collect and analyze web data, transforming raw information into actionable insights faster than you can say "baa". By combining AI-driven decision-making with robust browser automation, ScrapeGoat offers a cutting-edge approach to web scraping that's not just powerful and user-friendly, it's downright revolutionary. Get ready to become the GOAT of web scraping!

Features

AI-Powered Scraping: Uses machine learning algorithms to adapt to website changes and optimize scraping strategies.
Browser Automation: Mimics human browsing behavior to navigate websites and extract data seamlessly.
Multi-Purpose Functionality: Suitable for a wide range of applications, including e-commerce price comparison, social media monitoring, and content aggregation.
Customizable Scraping Workflows: Create and save custom scraping recipes for repeated tasks.
Data Export: Export scraped data in various formats (CSV, JSON, XML) for easy integration with other tools and platforms.
Scheduling: Set up automated scraping tasks to run at specified intervals.
Proxy Support: Rotate through proxy servers to avoid IP blocks and maintain anonymity.
CAPTCHA Handling: Advanced CAPTCHA solving capabilities to bypass common anti-bot measures.
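
As an illustration of the Data Export feature listed above, the following is a minimal sketch of writing the same records to CSV, JSON, and XML using only the Python standard library. It does not show ScrapeGoat's own export API, which is not documented here; the function names (`to_csv`, `to_json`, `to_xml`), the sample records, and the XML tag names are all hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical scraped records; the field names are illustrative only.
records = [
    {"product": "Widget", "price": "9.99"},
    {"product": "Gadget", "price": "24.50"},
]


def to_csv(rows):
    """Serialize a list of flat dicts to CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize the rows to an indented JSON array."""
    return json.dumps(rows, indent=2)


def to_xml(rows, root_tag="records", item_tag="record"):
    """Serialize the rows to XML, one child element per field."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, item_tag)
        for key, value in row.items():
            ET.SubElement(item, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_csv(records))
    print(to_json(records))
    print(to_xml(records))
```

The XML helper assumes every field name is a valid XML element name; records with keys containing spaces or leading digits would need to be renamed first.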

Dependencies

ScrapeGoat relies on the following key dependencies:

Python 3.10+
Selenium
Requests
Portable Google Chrome browser 131.0.6724.0+
ChromeDriver 131.0.6724.0+

A complete list of dependencies can be found in the requirements.txt file.
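
The minimum versions above can be checked before running the tool. The sketch below is one possible way to do that, not part of ScrapeGoat itself: the helper names (`parse_version`, `meets_minimum`, `python_is_supported`) are hypothetical, and the version strings in the example are illustrative samples rather than captured output.

```python
import re
import sys

# Minimum versions taken from the dependency list above.
MIN_BROWSER_VERSION = (131, 0, 6724, 0)
MIN_PYTHON_VERSION = (3, 10)


def parse_version(text):
    """Return the first four-part dotted version in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text, minimum=MIN_BROWSER_VERSION):
    """Check whether the version found in text is at least the minimum."""
    return parse_version(text) >= minimum


def python_is_supported(version_info=sys.version_info):
    """Check whether the interpreter's major.minor version is 3.10 or newer."""
    return tuple(version_info[:2]) >= MIN_PYTHON_VERSION


if __name__ == "__main__":
    # Illustrative strings; in practice these would come from the browser
    # and driver binaries (for example, their version output).
    print(meets_minimum("Google Chrome 131.0.6724.0"))
    print(meets_minimum("ChromeDriver 130.0.6723.58"))
    print(python_is_supported())
```

Comparing the versions as integer tuples avoids the pitfalls of string comparison, where "99.0.0.0" would sort after "131.0.6724.0".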

Usage

For a comprehensive guide on how to use ScrapeGoat, please refer to our User Manual.

Contributing

We welcome contributions from the community! If you'd like to contribute to ScrapeGoat, please follow these steps:

Fork the repository
Create a new branch for your feature or bug fix
Make your changes and commit them with clear, descriptive messages
Push your changes to your fork
Submit a pull request to the main repository

Please ensure that your code adheres to our coding standards and includes appropriate tests. For more information, see our Contribution Guidelines.

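To build and run a development container:

    # Install the X11 dependencies if they are not already present
    sudo apt install x11-xserver-utils xorg
    # Allow connections to the X server from any host
    # (needed to open the GUI from within the container)
    xhost +
    # Build the development image from GoatFile
    docker build -t scrapegoat . -f GoatFile
    # Open an interactive shell in the container
    docker run --rm -it scrapegoat:latest /bin/sh
    # Or run the container with access to the host display
    docker run -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix scrapegoat:latest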

License

ScrapeGoat is released under the MIT License. See the LICENSE file for details.

Support

If you encounter any issues or have questions, please file an issue on our GitHub Issues page.

For additional support and community discussions, join our Discord server.

Disclaimer: Please use ScrapeGoat responsibly and in accordance with the terms of service of the websites you are scraping. The developers of ScrapeGoat are not responsible for any misuse of the tool or violations of website policies.
