Using spaCy's Named Entity Recognition at large scale #13461
Unanswered
kostaDimitrijevic asked this question in Help: Coding & Implementations
Replies: 0 comments
Hello, my project needs to use spaCy's NER model to process millions of text documents. I'm using Databricks and PySpark, and I have the computational power needed, but nevertheless, with the current code I'm unable to run this at large scale. I am using all of Spark's benefits, but it always ends with Out Of Memory errors, even when I'm using nodes with a huge amount of memory. That leads me to believe that my code isn't well written for large-scale processing.
I can see that your documentation mentions these potential solutions:
Are these solutions something I should try to implement? Can you give me some advice and tips for my use case? What is the best approach? Can these be used with PySpark?