forked from piskvorky/gensim
Add the 'keep_tokens' parameter to 'filter_extremes' (piskvorky#1210)
* Add the 'keep_tokens' parameter to 'filter_extremes' and test it

  Add the optional 'keep_tokens' parameter to the 'filter_extremes' method in dictionary.py. This parameter can contain a list of tokens, which will be kept regardless of the 'no_below' and 'no_above' settings. This is useful when the research goal is to force certain tokens to appear in topics while still filtering out all other extremes. If 'keep_tokens' is not given, the behavior of 'filter_extremes' is unchanged. Unit tests covering both cases are included.

* Create good_ids only once

  Create good_ids only once, as per the optimization suggestion, regardless of whether 'keep_tokens' is provided.
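The filtering rule described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code from this commit: `filter_extremes_sketch` and `document_frequencies` are hypothetical helpers, the function returns the surviving tokens as a set instead of modifying a dictionary object in place, and tokens listed in `keep_tokens` that never occur in the corpus are assumed to be ignored.

```python
from collections import Counter


def document_frequencies(docs):
    """Count, for each token, the number of documents that contain it."""
    dfs = Counter()
    for doc in docs:
        dfs.update(set(doc))  # set(): count each token once per document
    return dfs


def filter_extremes_sketch(docs, no_below=5, no_above=0.5, keep_tokens=None):
    """Return the set of tokens that survive frequency filtering.

    Hypothetical stand-in for the behaviour described in the commit
    message; it is not gensim's implementation.
    """
    dfs = document_frequencies(docs)
    no_above_abs = int(no_above * len(docs))  # fraction -> absolute doc count
    keep = set(keep_tokens or [])  # assumption: unseen tokens are ignored

    # Build the surviving set in a single pass, whether or not
    # keep_tokens was supplied.
    return {
        token
        for token, df in dfs.items()
        if no_below <= df <= no_above_abs or token in keep
    }


docs = [
    ["human", "computer", "interface"],
    ["computer", "survey", "user"],
    ["computer", "graph", "trees"],
    ["computer", "graph", "minors"],
]

# "computer" is in every document and "human" in only one, so both are
# normally filtered out; keep_tokens exempts them from the thresholds.
without_keep = filter_extremes_sketch(docs, no_below=2, no_above=0.5)
with_keep = filter_extremes_sketch(
    docs, no_below=2, no_above=0.5, keep_tokens=["human", "computer"]
)
```

With these toy documents, only "graph" passes both thresholds on its own, while the second call additionally retains "human" and "computer".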
Tomasz Oliwa authored and Pranaydeep Singh committed Mar 21, 2017
1 parent 5235637, commit 8fac6f1
Showing 2 changed files with 34 additions and 5 deletions.