URS v3.2.1
Release date: March 28, 2021
Summary
- Structured comments export has been upgraded to include comments of all levels.
- Structured comments are now the default export format. Exporting to raw format requires including the --raw flag.
- Tons of metadata has been added to all scrapers. See the Full Changelog section for a full list of attributes that have been added.
- Credentials.py has been deprecated in favor of .env to avoid hard-coding API credentials.
- Added more terminal eye candy: Halo has been implemented to spice up the output.
Full Changelog
Added
- User interface
- Added Halo to spice up the output while maintaining minimalism.
- Source code
- Created a comment Forest and accompanying CommentNode.
- The Forest contains methods for inserting CommentNodes, including a depth-first search algorithm to do so.
- Subreddit.py has been refactored and submission metadata has been added to scrape files:
"author"
"created_utc"
"distinguished"
"edited"
"id"
"is_original_content"
"is_self"
"link_flair_text"
"locked"
"name"
"num_comments"
"nsfw"
"permalink"
"score"
"selftext"
"spoiler"
"stickied"
"title"
"upvote_ratio"
"url"
- Comments.py has been refactored and submission comments now include the following metadata:
"author"
"body"
"body_html"
"created_utc"
"distinguished"
"edited"
"id"
"is_submitter"
"link_id"
"parent_id"
"score"
"stickied"
- Major refactor for Redditor.py on top of adding additional metadata.
- Additional Redditor information has been added to scrape files:
"has_verified_email"
"icon_img"
"subreddit"
"trophies"
- Additional Redditor comment, submission, and multireddit metadata has been added to scrape files:
- subreddit objects are nested within comment and submission objects and contain the following metadata:
"can_assign_link_flair"
"can_assign_user_flair"
"created_utc"
"description"
"description_html"
"display_name"
"id"
"name"
"nsfw"
"public_description"
"spoilers_enabled"
"subscribers"
"user_is_banned"
"user_is_moderator"
"user_is_subscriber"
- comment objects will contain the following metadata:
"type"
"body"
"body_html"
"created_utc"
"distinguished"
"edited"
"id"
"is_submitter"
"link_id"
"parent_id"
"score"
"stickied"
"submission"
- contains additional metadata
"subreddit_id"
- submission objects will contain the following metadata:
"type"
"author"
"created_utc"
"distinguished"
"edited"
"id"
"is_original_content"
"is_self"
"link_flair_text"
"locked"
"name"
"num_comments"
"nsfw"
"permalink"
"score"
"selftext"
"spoiler"
"stickied"
"subreddit"
- contains additional metadata
"title"
"upvote_ratio"
"url"
- multireddit objects will contain the following metadata:
"can_edit"
"copied_from"
"created_utc"
"description_html"
"description_md"
"display_name"
"name"
"nsfw"
"subreddits"
"visibility"
- interactions are now sorted in alphabetical order.
- CLI
- Flags
- --raw: Export comments in raw format instead (structured format is the default).
- Created a new .env file to store API credentials.
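The Forest and CommentNode entries above describe inserting comments with a depth-first search. A minimal sketch of that idea follows. The class layout, field names, and the assumption that IDs are plain strings without Reddit's type prefixes are all illustrative, not URS's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CommentNode:
    """Hypothetical node: one comment plus its nested replies."""
    id: str
    parent_id: str
    body: str
    replies: List["CommentNode"] = field(default_factory=list)


class Forest:
    """Hypothetical comment forest rooted at a submission ID."""

    def __init__(self, submission_id: str) -> None:
        # The root stands in for the submission itself.
        self.root = CommentNode(id=submission_id, parent_id="", body="")

    def _dfs_insert(self, current: CommentNode, new_node: CommentNode) -> bool:
        # Attach the new node if `current` is its parent.
        if current.id == new_node.parent_id:
            current.replies.append(new_node)
            return True
        # Otherwise search each reply subtree depth-first, stopping at the first hit.
        return any(self._dfs_insert(reply, new_node) for reply in current.replies)

    def insert(self, new_node: CommentNode) -> bool:
        """Return True if a parent was found and the node was attached."""
        return self._dfs_insert(self.root, new_node)


forest = Forest("s1")
forest.insert(CommentNode("c1", "s1", "top-level comment"))
forest.insert(CommentNode("c2", "c1", "reply to c1"))
```

Inserting parents before their replies keeps every lookup successful; a node whose parent is missing is simply reported as not inserted.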
- README
- Added new bullet point for The Forest Markdown file.
- Tests
- Added a new test for the Status class in Global.py.
- Repository documents
- Added "The Forest".
- This Markdown file is just a place where I describe how I implemented the Forest.
Changed
- User interface
- Submission comments scraping parameters have changed due to the improvements made in this pull request.
- Structured comments are now the default format.
- Users will have to include the new --raw flag to export to raw format.
- Both structured and raw formats can now scrape all comments from a submission.
- Source code
- The submission comments JSON file's structure has been modified to fit the new submission_metadata dictionary.
- "data" is now a dictionary that contains the submission metadata dictionary and scraped comments list. Comments are now stored in the "comments" field within "data".
- Exporting Redditor or submission comments to CSV is now forbidden.
- URS will ignore the --csv flag if it is present while trying to use either scraper.
- The created_utc field for each Subreddit rule is now converted to readable time.
- requirements.txt has been updated.
- As of v1.20.0, numpy has dropped support for Python 3.6, which means Python 3.7+ is required for URS.
- .travis.yml has been modified to exclude Python 3.6. Added Python 3.9 to test configuration.
- Note: Older versions of Python can still be used by downgrading to numpy<=1.19.5.
- Reddit object validation block has been refactored.
- A new reusable module has been defined at the bottom of Validation.py.
- Urs.py no longer pulls API credentials from Credentials.py as it is now deprecated.
- Credentials are now read from the .env file.
- Minor refactoring within Validation.py to ensure an extra Halo line is not rendered on failed credential validation.
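The new submission comments JSON layout described above can be pictured with a small sketch. Only "data" and "comments" are named in this changelog; using "submission_metadata" as the key for the metadata dictionary, the helper name, and the sample values are assumptions made for illustration.

```python
import json
from typing import Any, Dict, List


def build_comments_export(
    submission_metadata: Dict[str, Any], comments: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Hypothetical helper that nests metadata and comments under "data"."""
    return {
        "data": {
            # Assumed key name for the submission metadata dictionary.
            "submission_metadata": submission_metadata,
            # Scraped comments live in the "comments" field within "data".
            "comments": comments,
        }
    }


example = build_comments_export(
    {"title": "Example title", "num_comments": 1},
    [{"id": "c1", "body": "Example comment", "replies": []}],
)
print(json.dumps(example, indent=4))
```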
- README
- Updated the Comments section to reflect new changes to comments scraper UI.
- Repository documents
- Updated How to Get PRAW Credentials.md to reflect new changes.
- Tests
- Updated CLI usage and examples tests.
- Updated c_fname() test because submission comments scrapes now follow a different naming convention.
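For the created_utc change listed under Source code in this section, here is a minimal sketch of turning a Unix timestamp into readable time. The function name and the output format are assumptions; the format URS actually writes may differ.

```python
from datetime import datetime, timezone


def to_readable_time(created_utc: float) -> str:
    """Convert a Unix timestamp (seconds) to a UTC date-time string."""
    moment = datetime.fromtimestamp(created_utc, tz=timezone.utc)
    # Assumed format: month-day-year, then a 24-hour clock time.
    return moment.strftime("%m-%d-%Y %H:%M:%S")


readable = to_readable_time(0)
```

Passing an explicit time zone keeps the result independent of the machine's local settings.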
Deprecated
- User interface
- Specifying 0 comments no longer exports all comments in raw format only. It now defaults to structured format.
- Source code
- Deprecated many global variables defined in Global.py:
eo
options
s_t
analytical_tools
- Credentials.py has been replaced with the .env file.
- The LogError.log_login decorator has been deprecated due to the refactor within Validation.py.
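To illustrate the move from Credentials.py to a .env file, here is a minimal standard-library sketch that reads KEY=VALUE pairs from such a file. The function name and the variable names in the usage example are hypothetical, and this is not necessarily how URS loads its credentials; a dedicated library such as python-dotenv is a common choice for this.

```python
from pathlib import Path
from typing import Dict


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values
```

With a .env file containing a line such as CLIENT_ID="abc" (a hypothetical variable name), `read_env_file(".env")["CLIENT_ID"]` would give the string abc, so the secret stays out of the source tree.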