-
-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add export_phrases function to Phrase model (TODO: cythonize the primitive loops) #588
Conversation
…w versions of word2vec and doc2vec are being used
The .h file under models may not be included in system C++ compiler, |
Please add tests and CHANGELOG for the new functionality. Thanks. |
The cython-related commits seem unrelated to the headline title? |
…(of assertEquals).
…(of assertEquals).
@gojomo Should this be merged? |
There's still two unrelated things here: (1) Some sort of Windows-specific workaround for an issue with the cython code. I don't know who's been affected by that and have no way to evaluate if the change is a net improvement, since I don't build/use on Windows (and maybe this only affects some compilers/pythons, or it would have been an issue for others). It seems to me that change should get a separate PR. (2) A utility method to yield just the phrases and their scores from a block of sentences (instead of the usual mix of phrase-ified content). I don't know the exact motivation for that need – was it a feature request? But, the method |
@gojomo for 2, yes, it is a feature request, I'd like to dump phrase, score pair to do some rough manual review |
Re. item 1: we definitely don't want the |
@piskvorky good to know we have already-compiled C modules, How to make it work? Does below words in tutorials implies it will compile once during installation, what if user misses that point? Does he have to uninstall, make c compiler ready and install again? In my case, an on-demand runtime-compile does not seems to make things worse anyway :) P.S. I have not tried gensim on linux yet, which may mitigate all the trouble I met on windows hopefully : )
|
Yes, the C code is compiled once, during gensim install. The install outputs a warning in case the compilation fails, notifying the user that some functionality will be slow as a result. |
@piskvorky got it, how about the feature request of export_phrases? If feasible, I can revert cython related change |
Sure, that sounds good. Incidentally, the phrases module is a great adept at cythonization = rewriting the primitive loops in cython, compiling an (optional) C module so that it all runs faster. |
@hanabi1224 Thanks for the PR! |
No description provided.