A versatile and user-friendly bash script that scrapes and downloads comics from various websites, ultimately packaging them into .cbz files for easy reading.
- You will need to have bash, curl, pup, awk, sed, tr, wget, and zip installed on your system.
- To install pup (a command-line parser for HTML), you can follow the instructions given here.
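Before running the scraper, it can help to confirm that every required tool is on your PATH. The following is a minimal sketch, not part of the repository: the helper name missing_tools is hypothetical, and the tool list is simply the one given above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of comics_scraper.sh):
# print the name of every given tool that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v succeeds only if the shell can resolve the name
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tool list taken from the requirements above.
missing=$(missing_tools bash curl pup awk sed tr wget zip)
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
else
  echo "All required tools are installed."
fi
```

If anything is reported as missing, install it with your system's package manager before continuing.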
- Clone the repository:
git clone https://github.com/webmatze/comics-scraper.git
- Change the permissions of the script to make it executable:
chmod +x comics_scraper.sh
The script can be used with the following command:
./comics_scraper.sh [OPTIONS]
- -u, --url URL: Specify the base URL to scrape comics from (mandatory).
- -p, --post-selector SELECTOR: Specify the CSS selector to extract post URLs (default: #post-area .post > center a attr{href}).
- -i, --image-selector SELECTOR: Specify the CSS selector to extract image URLs (default: center center div img[style*="max-width"] attr{src}).
- -d, --download-path PATH: Specify the base download path (default: data).
- -v, --verbose: Show verbose output.
- -h, --help: Show the help message.
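To illustrate how the options and their defaults fit together, here is a rough sketch of a parser for the flags listed above. It is an assumption about how such parsing could look, not the script's actual source; the function name parse_args and all variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the documented options, not the real implementation.
parse_args() {
  # Defaults as documented in the option list.
  url=""
  post_selector='#post-area .post > center a attr{href}'
  image_selector='center center div img[style*="max-width"] attr{src}'
  download_path="data"
  verbose=0
  help_requested=0

  while [ "$#" -gt 0 ]; do
    # Options that take a value must be followed by one.
    case "$1" in
      -u|--url|-p|--post-selector|-i|--image-selector|-d|--download-path)
        if [ "$#" -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 1
        fi ;;
    esac

    case "$1" in
      -u|--url)            url=$2; shift 2 ;;
      -p|--post-selector)  post_selector=$2; shift 2 ;;
      -i|--image-selector) image_selector=$2; shift 2 ;;
      -d|--download-path)  download_path=$2; shift 2 ;;
      -v|--verbose)        verbose=1; shift ;;
      -h|--help)           help_requested=1; return 0 ;;
      *)                   echo "Unknown option: $1" >&2; return 1 ;;
    esac
  done

  # --url is the only mandatory option.
  if [ -z "$url" ]; then
    echo "Error: --url is required" >&2
    return 1
  fi
}
```

Calling parse_args "$@" at the top of a script would leave the chosen values in the variables above, with any option not given on the command line keeping its documented default.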
When given only a URL, the script uses its built-in default selectors to locate and download the comics:
./comics_scraper.sh -u https://readallfreecomics.com/
You can override these defaults by passing additional options:
./comics_scraper.sh -u https://readallfreecomics.com/ -p "#post-area .post > center a attr{href}" -i "center center div img[style*=\"max-width\"] attr{src}" -d data -v
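For context on the final packaging step: a .cbz file is an ordinary zip archive of page images with a different extension. The sketch below shows one way a directory of downloaded images could be packed with zip. The function name pack_cbz and the directory layout are assumptions for illustration, and the script itself may do this differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pack all files in a directory into a .cbz archive.
# Usage: pack_cbz <image-directory> <output-file.cbz>
pack_cbz() {
  src_dir=$1
  out_file=$2

  # Fail early if the image directory does not exist.
  if [ ! -d "$src_dir" ]; then
    echo "No such directory: $src_dir" >&2
    return 1
  fi

  # -j stores files without their directory path, -q suppresses output.
  zip -j -q "$out_file" "$src_dir"/*
}
```

For example, pack_cbz data/some-issue data/some-issue.cbz would produce an archive that most comic readers can open directly, assuming the images are named so that they sort in reading order.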
This project is for educational purposes only. Please use it only for free comics. I do not support illegal downloads!
If you wish to contribute to this project, please submit an issue or a pull request!