Web scraping tool designed to effortlessly navigate websites and automatically download all types of files.
WebWorm is a Python script that scrapes and downloads files from a specified website URL. You can configure the crawl depth and restrict scraping to particular file extensions, and an optional flag detects the technologies the site uses.
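At its core, a tool like this combines a depth-limited crawl with an extension filter. The sketch below shows one way to structure that loop using `requests` and `BeautifulSoup`; the function names and link-handling heuristics are illustrative assumptions, not WebWorm's actual internals:

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# Extensions treated as "pages" to recurse into rather than files to save.
PAGE_EXTS = {"", "html", "htm", "php", "asp", "aspx"}

def crawl(url, depth, extensions, seen=None):
    """Visit pages up to `depth` links away, downloading matching files."""
    seen = set() if seen is None else seen
    if url in seen:
        return
    seen.add(url)
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    for tag in soup.find_all(["a", "img", "script"]):
        link = tag.get("href") or tag.get("src")
        if not link:
            continue
        target = urljoin(url, link)
        ext = os.path.splitext(urlparse(target).path)[1].lstrip(".").lower()
        if ext not in PAGE_EXTS and (extensions is None or ext in extensions):
            download(target)                            # a file: save it
        elif ext in PAGE_EXTS and depth > 0:
            crawl(target, depth - 1, extensions, seen)  # a page: recurse

def download(url):
    """Save a file into the working directory under its original name."""
    name = os.path.basename(urlparse(url).path) or "index.html"
    with open(name, "wb") as fh:
        fh.write(requests.get(url, timeout=10).content)

# Example: crawl("https://example.com", depth=1, extensions={"pdf", "jpg"})
```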
Follow these steps to install and run the script:
- Ensure you have Python installed on your system.
- Clone the repository:

  ```
  git clone https://github.com/m0hs1ne/WebWorm.git
  ```

- Install the required packages:

  ```
  pip install -r requirements.txt
  ```

- Run the script:

  ```
  python3 WebWorm.py <url> -d <depth> -e <extensions> [-t]
  ```
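For example, to crawl two levels deep, download only PDF and DOCX files, and report detected technologies (the target URL here is a placeholder):

```
python3 WebWorm.py https://example.com -d 2 -e "pdf,docx" -t
```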
```
usage: WebWorm.py [-h] [-e EXTENSIONS] [-d DEPTH] [-t] url

positional arguments:
  url                   The URL of the website to scrape.

options:
  -h, --help            show this help message and exit
  -e EXTENSIONS, --extensions EXTENSIONS
                        Comma-separated list of file extensions to scrape
                        (e.g., "jpg,png,docx"). If not specified, all files
                        will be scraped.
  -d DEPTH, --depth DEPTH
                        The maximum depth to crawl the website. Default is 1.
  -t, --tech            Detect technologies used on the website.
```
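The help text above maps onto an `argparse` interface along the following lines. This is a reconstruction from the usage output, not the script's actual source:

```python
import argparse

# Reconstructed CLI definition; mirrors the help text shown above.
parser = argparse.ArgumentParser(prog="WebWorm.py")
parser.add_argument("url", help="The URL of the website to scrape.")
parser.add_argument("-e", "--extensions",
                    help='Comma-separated list of file extensions to scrape '
                         '(e.g., "jpg,png,docx"). If not specified, all '
                         'files will be scraped.')
parser.add_argument("-d", "--depth", type=int, default=1,
                    help="The maximum depth to crawl the website. Default is 1.")
parser.add_argument("-t", "--tech", action="store_true",
                    help="Detect technologies used on the website.")

args = parser.parse_args()
# Normalize the extension filter: None means "scrape everything".
extensions = set(args.extensions.split(",")) if args.extensions else None
```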
Using the `-t` flag detects and reports the technologies used by the website.
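One common way to implement such detection is to fingerprint response headers and HTML `meta` tags. The sketch below illustrates that approach; it is an assumed technique (dedicated detectors such as Wappalyzer use far larger rule sets), not necessarily what WebWorm does internally:

```python
import requests
from bs4 import BeautifulSoup

def detect_tech(url):
    """Collect a few common technology fingerprints from one response."""
    resp = requests.get(url, timeout=10)
    findings = {}
    # The server-side stack often leaks through these headers.
    for header in ("Server", "X-Powered-By"):
        if header in resp.headers:
            findings[header] = resp.headers[header]
    # CMSs like WordPress advertise themselves in a generator meta tag.
    meta = BeautifulSoup(resp.text, "html.parser").find(
        "meta", attrs={"name": "generator"})
    if meta and meta.get("content"):
        findings["Generator"] = meta["content"]
    return findings  # e.g. {"Server": "nginx", "Generator": "WordPress 6.4"}
```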
Planned features:

- Add support for scraping multiple websites.
- Send requests with session cookies.
- Enumerate directories.
- Check for possible keys and secrets in JS files (a rough sketch of the idea follows this list).
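For that last item, a simple first pass is to match downloaded JavaScript against known key formats. The patterns below are illustrative examples, not a complete ruleset:

```python
import re

# Toy secret scanner: two well-known key formats plus a generic assignment
# pattern. A production scanner would add many more rules and entropy checks.
SECRET_PATTERNS = {
    "AWS access key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Google API key": re.compile(r"AIza[0-9A-Za-z\-_]{35}"),
    "Generic api_key assignment": re.compile(
        r"""api[_-]?key['"]?\s*[:=]\s*['"][0-9A-Za-z\-_]{16,}['"]""", re.I),
}

def scan_js(text):
    """Return (label, match) pairs for every pattern hit in a JS source."""
    return [(label, m.group(0))
            for label, rx in SECRET_PATTERNS.items()
            for m in rx.finditer(text)]
```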
Your contributions are welcome! Whether you're fixing bugs, adding new features, or improving documentation, we appreciate your help in making WebWorm better.
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
- m0hs1ne - Initial work