upenndigitalscholarship/regulations-gov-comment-scraper

Regulations.gov Scraper

This tool scrapes comment information from Regulations.gov, including the submitter name, organization, and any attachments, by document ID.

This is a very barebones tool! We haven't even provided a CLI. To run, ensure you have the Python requests library installed. (It comes with Anaconda; if you're new to Python, we recommend installing Anaconda and letting it do the heavy lifting for you.)
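If you're not sure whether the requests library is available in your Python environment, a quick check like the following can tell you before you run the scraper. This is only a sketch; the helper name `is_installed` is made up for illustration and is not part of this tool.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found for import, False otherwise."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("requests"):
        print("requests is available")
    else:
        print("requests is missing; install it (for example via Anaconda) first")
```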

You'll also need to apply for an API key as described here.

Once you have Python and the requests library installed, and have received an API key, make the following changes to the script:

  1. API Key:

    In the line that reads

    api_key = '' # insert your api key between quotes
    

    copy your API key and paste it between the single quotation marks:

    api_key = '(THE API KEY THAT YOU COPIED)'
    
  2. Docket ID:

    In the line that reads

    docket_id = '' # insert the docket id between quotes (e.g. VA-2016-VHA-0011)
    

    paste the docket ID between the single quotation marks:

    docket_id = 'ED-2018-OCR-0064' 
    

    The docket ID appears on the page for the set of comments you're scraping. (Screenshot of the docket ID.)

  3. Total number of documents:

    In the line that reads

    total_docs = 217568  # total number of documents, as indicated by the page for the given docket id
    

    paste the number of documents in place of the current value:

    total_docs = 14835
    

    The number of documents also appears on the page for the set of comments you're scraping. (Screenshot of the number of documents.)
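After making the three changes above, it can help to sanity-check the values before starting a long scrape. The sketch below is not part of the scraper itself: the function names `check_settings` and `page_offsets` are hypothetical, and the page size of 250 is an assumed value used only to show how `total_docs` would translate into a number of requests. The checks are deliberately loose, since docket IDs vary in format from agency to agency.

```python
def check_settings(api_key, docket_id, total_docs):
    """Return a list of problems with the three edited values; empty means OK."""
    problems = []
    if not api_key.strip():
        problems.append("api_key is empty")
    if not docket_id.strip():
        problems.append("docket_id is empty")
    elif docket_id != docket_id.strip() or " " in docket_id:
        # A stray space is easy to pick up when copying from the web page.
        problems.append("docket_id contains whitespace")
    if not isinstance(total_docs, int) or total_docs <= 0:
        problems.append("total_docs must be a positive whole number")
    return problems


def page_offsets(total_docs, page_size):
    """Return the starting offset of each page needed to cover total_docs."""
    return list(range(0, total_docs, page_size))


if __name__ == "__main__":
    # Example values taken from the steps above; the API key is a placeholder.
    api_key = "(THE API KEY THAT YOU COPIED)"
    docket_id = "ED-2018-OCR-0064"
    total_docs = 14835

    for problem in check_settings(api_key, docket_id, total_docs):
        print("Problem:", problem)

    # 250 is an assumed page size, not a value taken from the script.
    offsets = page_offsets(total_docs, 250)
    print(len(offsets), "requests would be needed at 250 documents per page")
```

An empty list from `check_settings` means none of these simple checks failed; it does not confirm that the API key is valid or that the docket exists.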
