Isolate generic preprocessing functions #3180
Thank you for the cleanup! I left some minor style comments in the review.
@mpenkov how do you move tickets and PRs to your columns in https://github.com/RaRe-Technologies/gensim/projects/9?
Thanks a lot!
Sorry for the late reply. First, add the ticket to the project (see Projects on the right side of this PR screen, next to milestones, etc.) and then select a column to add it to.
@rock420 Thank you for cleaning this up! I've merged your changes. @piskvorky Do we want to include this in the change log for 4.1.0? If yes, please edit the CHANGELOG.md on develop HEAD.
Yes. Let's do it as part of the release or another PR, to avoid editing
* Move preprocessing functions from textcorpus module
* Move preprocessing functions from lowcorpus module
* Add test cases for preprocessing functions
* Fix styling issues
* Refactor remove_stopwords() and strip_short()
* make tests pass
* rm unused import

Co-authored-by: Michael Penkov <m@penkov.dev>
Fixes #3171
PR Description -
Preprocessing functions from gensim.corpora.textcorpus and gensim.corpora.lowcorpus have been moved to the gensim.parsing.preprocessing module. The remove_stopwords function has been made consistent.
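
To illustrate the kind of consistency the description and the "Refactor remove_stopwords() and strip_short()" commit point at, here is a minimal, self-contained sketch in which the string-level helpers delegate to token-level helpers, so both paths filter the same way. Every name below (the `_sketch` functions and `SAMPLE_STOPWORDS`) is hypothetical and written only for this example; it is not gensim's code, and the actual signatures in gensim.parsing.preprocessing may differ.

```python
# Illustrative stand-ins only; these are NOT gensim's implementations.
# SAMPLE_STOPWORDS is a tiny hypothetical list, not gensim's stopword set.
SAMPLE_STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to"})


def remove_stopword_tokens_sketch(tokens, stopwords=SAMPLE_STOPWORDS):
    """Return the tokens that are not in the stopword set."""
    return [token for token in tokens if token not in stopwords]


def remove_short_tokens_sketch(tokens, minsize=3):
    """Return the tokens that have at least `minsize` characters."""
    return [token for token in tokens if len(token) >= minsize]


def remove_stopwords_sketch(text, stopwords=SAMPLE_STOPWORDS):
    """String variant: split on whitespace, reuse the token helper, rejoin."""
    return " ".join(remove_stopword_tokens_sketch(text.split(), stopwords))


def strip_short_sketch(text, minsize=3):
    """String variant: split on whitespace, reuse the token helper, rejoin."""
    return " ".join(remove_short_tokens_sketch(text.split(), minsize))
```

With this layout, a corpus class that already holds token lists and a caller that holds raw strings would go through the same filtering logic, which is one plausible reading of "consistent" here.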