Scrape a web page for pdf files and download them all locally.

scottgriv/python-pdf_web_scraper



Python PDF Web Scraper

A simple Python script that scrapes web pages for PDF files and downloads them to a local directory.


Getting Started

  1. Clone this repository.
  2. Install Python.
  3. Install Pip.
  4. Install the required packages using pip install -r requirements.txt in your terminal.
  5. Set the web page URL and the output folder location in main.py:
# Define your URL
url = "https://yourWebsiteURL"

# By default, the script will download PDF files to the downloads folder.
# You can change the folder location by updating the folder_location variable.
# Example: folder_location = r'/Users/yourname/Documents'

folder_location = r'./downloads'
  6. Run the script: python main.py
  7. PDF files will be downloaded to the folder set in folder_location (./downloads by default).
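
The contents of main.py are not shown here, so the following is only a minimal sketch of how a scraper of this kind might work, not the script itself. It assumes the page lists its PDFs as ordinary <a href="..."> links and uses only the Python standard library, whereas the real script presumably relies on the packages in requirements.txt. The names PdfLinkParser, find_pdf_links, and download_pdfs are hypothetical.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class PdfLinkParser(HTMLParser):
    """Collects absolute URLs of <a> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Check only the path, so query strings such as ?v=2 are ignored.
        is_pdf = urlparse(absolute).path.lower().endswith(".pdf")
        if is_pdf and absolute not in self.links:
            self.links.append(absolute)


def find_pdf_links(html, base_url):
    """Return the de-duplicated PDF links found in an HTML string."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def download_pdfs(url, folder_location):
    """Fetch the page at url and save every linked PDF into folder_location."""
    os.makedirs(folder_location, exist_ok=True)
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for link in find_pdf_links(html, url):
        filename = os.path.basename(urlparse(link).path)
        urlretrieve(link, os.path.join(folder_location, filename))
```

With the url and folder_location values from step 5, a call would look like download_pdfs(url, folder_location). Keeping link extraction separate from downloading makes it possible to check which files would be fetched before anything is written to disk.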

License

This project is released under the terms of The Unlicense, which allows you to use, modify, and distribute the code as you see fit.

  • The Unlicense removes traditional copyright restrictions, giving you the freedom to use the code in any way you choose.
  • For more details, see the LICENSE file in this repository.

Credits

Author: Scott Grivner
Email: scott.grivner@gmail.com
Website: scottgrivner.dev
Reference: Main Branch