Goodreads Advanced Search

This project is a web-based tool to scrape and filter books from Goodreads. It allows users to:

Scrape book data from different genres on Goodreads.
Apply filters like minimum number of ratings and genre-based selection.
View scraped books with details like title, author, number of ratings, and genre, sorted by average rating.

The backend scrapes Goodreads pages and stores book information in a JSON Lines file. The frontend allows users to filter and view the books using a simple web interface.
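As a rough illustration of the JSON Lines storage described above, the sketch below appends book records to a file and reads them back, one JSON object per line. The function names and the record fields (`title`, `num_ratings`) are assumptions made for this example and may not match what the project actually writes to books_raw.jl.

```python
import json
from pathlib import Path


def append_books(path, books):
    """Append each book dict to the file as one JSON object per line."""
    with Path(path).open("a", encoding="utf-8") as f:
        for book in books:
            f.write(json.dumps(book, ensure_ascii=False) + "\n")


def load_books(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    books = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                books.append(json.loads(line))
    return books
```

Because every record sits on its own line, new scrape results can be appended without rewriting the records already in the file.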

Features:

Scraping: Scrape books from a specified genre on Goodreads (up to 25 pages).
Filtering: Filter books based on genres and minimum ratings.
Progress Bar: Real-time progress bar during scraping.

Demo

The filter feature is available at https://kardeepakkumar.github.io/goodreads-advanced-search

Click the image below to watch the demo video:

Requirements

Docker or Python

Installation

1. Clone this repository:

git clone https://github.com/kardeepakkumar/goodreads-advanced-search.git
cd goodreads-advanced-search

2. Copy the cookie from Goodreads

Open any Goodreads page in your browser, press F12 to open the developer tools, and copy the cookie data. Save the copied cookie data in goodreads-advanced-search/cookie.txt.
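For orientation, here is a minimal sketch of how a scraper could turn the saved cookie into a request header. It assumes cookie.txt holds the raw value of the browser's Cookie request header on a single line; `load_cookie_header` is a hypothetical helper, and the project's own code may read the file differently.

```python
from pathlib import Path


def load_cookie_header(path="cookie.txt"):
    """Return request headers carrying the cookie string saved in `path`.

    Assumes the file contains the raw Cookie header value copied from
    the browser (an assumption, not confirmed by the project).
    """
    cookie = Path(path).read_text(encoding="utf-8").strip()
    if not cookie:
        raise ValueError(f"{path} is empty; paste the Goodreads cookie data into it")
    return {"Cookie": cookie}
```

The returned dict can be passed as the headers of an HTTP request. Raising on an empty file surfaces a missing cookie early instead of letting requests fail later without authentication.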

3. Build and run the Docker container, or run the app directly

Use one of these two methods to run the app locally.

3.1 Build and run using Docker

Make sure Docker is installed on your machine.

docker build -t goodreads-scraper .
docker run -p 5000:5000 -v $(pwd)/books_raw.jl:/app/books_raw.jl -v $(pwd)/cookie.txt:/app/cookie.txt goodreads-scraper

This will mount the books_raw.jl file and the cookie.txt file, allowing the scraper to store data and use the cookies for authentication.

3.2 Run the Flask app directly

Optionally, create a virtual environment first.

pip install -r requirements.txt
python app.py

4. Access the web app

Once the app is running, open your browser and go to http://localhost:5000.

How to use

The books_raw.jl file in the repo already contains metadata for about 20k books. Scraping more genres automatically appends to your local copy of this file.

Filter Books

Use the filter options to narrow down the displayed books based on genres and minimum ratings.
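The filtering described above can be sketched roughly as follows: keep books that match the chosen genres and meet the minimum number of ratings, then sort by average rating. The function `select_books` and the field names `genre`, `num_ratings`, and `avg_rating` are assumptions for illustration, not the project's actual schema.

```python
def select_books(books, genres=None, min_ratings=0):
    """Filter book dicts by genre and rating count, best-rated first.

    `genres` is an optional iterable of genre names (case-insensitive);
    None or an empty value means "any genre".
    """
    wanted = {g.lower() for g in genres} if genres else None
    selected = [
        book
        for book in books
        if book.get("num_ratings", 0) >= min_ratings
        and (wanted is None or book.get("genre", "").lower() in wanted)
    ]
    # Highest average rating first, matching the sort order of the book list.
    return sorted(selected, key=lambda b: b.get("avg_rating", 0.0), reverse=True)
```

A call such as `select_books(books, genres=["Biography"], min_ratings=1000)` would return only biographies with at least 1000 ratings.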

Scrape Genre

Choose a genre (e.g., Biography) and press "Scrape".
The app will start scraping books from that genre.
A progress bar will be shown and updated during the scraping process.
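One way to picture the scrape-with-progress flow is a loop over at most 25 pages that reports a percentage after each page. This is only a sketch under stated assumptions: `scrape_genre`, `fetch_page`, and `on_progress` are hypothetical names, the page fetcher is supplied by the caller, and the real scraper may track progress differently.

```python
def scrape_genre(genre, fetch_page, on_progress, max_pages=25):
    """Collect books page by page, reporting progress after each page.

    fetch_page(genre, page) is a caller-supplied function returning a
    list of book dicts; an empty list is treated as "no more pages".
    on_progress(percent) receives an integer from 0 to 100.
    """
    books = []
    for page in range(1, max_pages + 1):
        page_books = fetch_page(genre, page)
        if not page_books:
            # The genre ran out of pages early, so the job is complete.
            on_progress(100)
            break
        books.extend(page_books)
        on_progress(round(100 * page / max_pages))
    return books
```

Passing the fetcher and the progress callback as arguments keeps the loop testable without network access; a web app could hand in a callback that updates whatever state the progress bar reads.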

License

This project is licensed under the MIT License. See the LICENSE file for more information.

Important Notice: This project scrapes data from Goodreads. While the data is used solely for personal or non-commercial purposes, it is important to acknowledge Goodreads' Terms of Service. Please do not use the scraped data for commercial purposes without prior consent from Goodreads.
