Increase versions, speed & update model downloading #10
…-nlp-named-entity-recognition into fix/model-downloading
Tested on:
- Python 3.7: the Flair model downloads successfully
- Python 3.7: the webapp works as expected
- Ran the NER recipe on the new / old plugin; the spaCy model ran faster (1min37 vs 1min53)
@Muennighoff I did not have an issue rebuilding the code env for Python 3.6 on a Mac. It worked with your requirements.txt file and used tokenizers==0.12.1.
I think this PR is a good first step; however, I need to mention a couple of things:
So I'm OK with merging this PR since imo it's a net improvement over the existing situation (modulo the little comments I made), but I think we should be able to go significantly further in the future.
Cf #11 because I was curious :-D
* spaCy optimizations
  - Disable unused pipeline components: divides recipe time roughly by two
  - Allow multi-CPU processing: on an 8-core machine, divides recipe time roughly by 3
  Overall, roughly a 6x improvement for the spaCy recipe!
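For readers who want to see what those two optimizations look like in practice, here is a minimal sketch. It is not the code from this PR: the function names `components_to_disable` and `extract_entities`, the model name `en_core_web_sm`, and the default batch size are all assumptions for illustration, and it assumes spaCy v3 (for `nlp.select_pipes`). Which components can safely be disabled depends on the model; here only `ner` is kept.

```python
from typing import Iterable, List, Sequence, Tuple


def components_to_disable(
    pipe_names: Sequence[str], keep: Iterable[str] = ("ner",)
) -> List[str]:
    """Return the pipeline components that are not needed (hypothetical helper)."""
    keep_set = set(keep)
    return [name for name in pipe_names if name not in keep_set]


def extract_entities(
    texts: Iterable[str],
    model_name: str = "en_core_web_sm",  # assumed model, not taken from the PR
    n_process: int = 1,
    batch_size: int = 256,
) -> List[List[Tuple[str, str]]]:
    """Run NER only, optionally spread over several processes."""
    # Imported lazily so the helper above works without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    unused = components_to_disable(nlp.pipe_names)
    # Temporarily switch off everything except the components we keep.
    with nlp.select_pipes(disable=unused):
        # n_process > 1 distributes batches of texts across worker processes.
        docs = nlp.pipe(texts, n_process=n_process, batch_size=batch_size)
        # Consume the generator inside the block so the disabled state applies.
        return [[(ent.text, ent.label_) for ent in doc.ents] for doc in docs]
```

A call such as `extract_entities(my_texts, n_process=8)` would use eight worker processes. On platforms where multiprocessing uses the spawn start method, the call should sit under an `if __name__ == "__main__":` guard.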
Solved Issues:
ImportError: cannot import name 'Markup' from 'jinja2'
Remaining issues:
… `tokenizers==0.10.3` to the requirements
Other notes:
… `recipe.py` step, but then we'll need to catch incorrect names
Do you see a nice way to shorten any of the above for adding a new model?
Should we add the instructions and the version pin for Python 3.6 to the plugin website?
Manual Tests:
👻👻👻