Add docstrings for Wordrank #1378
Just the style failures
@@ -47,8 +47,12 @@ class Wordrank(KeyedVectors):
     @classmethod
     def train(cls, wr_path, corpus_file, out_name, size=100, window=15, symmetric=1, min_count=5, max_vocab_size=0,
               sgd_num=100, lrate=0.001, period=10, iter=90, epsilon=0.75, dump_period=10, reg=0, alpha=100,
-              beta=99, loss='hinge', memory=4.0, cleanup_files=True, sorted_vocab=1, ensemble=0):
+              beta=99, loss='hinge', memory=4.0, cleanup_files=False, sorted_vocab=1, ensemble=0):
What is the reason for changing `cleanup_files` to `False`?
`cleanup_files=False` will not delete the (word/context) embedding files and the vocab file that wordrank generates during training, which are saved inside wordrank's directory. The `train()` method loads the final required embedding file before deleting everything that was generated during training, but this could be confusing to users who expect to find those files after training has finished.
So making the default behavior not delete them seems better, to avoid confusion.
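To illustrate the behavior being discussed, here is a minimal sketch of the "load first, then optionally clean up" pattern. It is not the actual `Wordrank.train()` code: `train_sketch`, the file names `vocab.txt` and `embeddings.txt`, and the fake file contents are all hypothetical stand-ins, and no external wordrank binary is invoked.

```python
import os
import shutil
import tempfile


def train_sketch(work_dir, cleanup_files=False):
    """Hypothetical stand-in for a train()-style wrapper (not gensim's code)."""
    os.makedirs(work_dir, exist_ok=True)

    # Pretend an external tool wrote these files during training.
    # The file names are made up for this example.
    for name in ("vocab.txt", "embeddings.txt"):
        with open(os.path.join(work_dir, name), "w") as fh:
            fh.write("word 0.1 0.2\n")

    # Load the final result into memory before any cleanup happens,
    # so the returned model is usable either way.
    with open(os.path.join(work_dir, "embeddings.txt")) as fh:
        model = fh.read().split()

    # Only remove the generated files when the caller asks for it.
    if cleanup_files:
        shutil.rmtree(work_dir)
    return model


base = tempfile.mkdtemp()
kept = os.path.join(base, "kept")
removed = os.path.join(base, "removed")

train_sketch(kept)                         # default: files stay on disk
train_sketch(removed, cleanup_files=True)  # files are deleted after loading

print(os.path.exists(kept), os.path.exists(removed))  # True False
```

With the default set to `False`, a user who looks in the output directory after training still finds the generated files; passing `cleanup_files=True` keeps the old behavior.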
Please enumerate output files (filename and what the file contains) in docstring.
I've added the output filenames and content info in the `out_name` param description, because it is the directory that contains these files.
Thank you @parulsethi 👍
`output_dir` param (the previous one would give an error at `.load_wordrank_model()`, which loads files from `output_dir`)