Help with Text Classification Documentation Recipe #5224

ClaudMor · 2020-03-28T17:36:02Z

Hello,

From the documentation, there are two points which are not clear to me:

1. In the context of text classification, could the textcat pipe benefit from being preceded by other spaCy pipes (e.g. sentencizer, ner, tok2vec)? If so, how?
2. Is there a recommended way to combine predictions obtained separately from spaCy's text-classification features (such as textcat and spaCy's word vectors) to improve performance?

I'd also like to point to a Stack Overflow question I wrote, which I hope will help other non-programmers like me who are approaching spaCy to go beyond the basics.

I hope this is the right place to ask. Thanks in advance for any help.


svlandeg · 2020-03-30T07:39:00Z

Hi @claudio20497 : These kinds of questions are probably better kept on Stack Overflow, where there is a larger community that can help. We don't always have the bandwidth to review specific use cases, and we'd like to keep this issue tracker focused specifically on bug reports and feature requests.

That said, to answer your first question: the textcat pipe currently does not consider features from previous pipes like NER or tagging. For spaCy v3.0, however, we're revamping the library so that you'll be able to change the ML models just by providing a different configuration file. This means you'll also be able to easily swap in a different tok2vec component. We're currently working on this on the develop branch. You might be interested in looking at this PR: #5143, which gives you an idea of where we're heading.

In the meantime, you may also be interested in this: https://github.com/explosion/spacy-transformers/blob/master/examples/train_textcat.py

I'll close this issue for now as there is no real action point on our side. I'll try to have a look at your SO post later this week.
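
To illustrate the point above that textcat does not read features set by earlier pipes, here is a minimal plain-Python sketch. It is a toy model only, not spaCy's code or API: the dict-based "doc" and the names `tokenize`, `fake_ner`, `toy_textcat` and `run_pipeline` are all hypothetical, made up for this illustration. It shows that when a classifier reads only the tokens, placing another annotating component in front of it leaves its scores unchanged.

```python
# Toy illustration only: plain Python, NOT spaCy's implementation or API.
# Every name below is hypothetical and exists just for this sketch.

def tokenize(doc):
    # Store lowercase whitespace tokens on the doc.
    doc["tokens"] = doc["text"].lower().split()
    return doc

def fake_ner(doc):
    # Stand-in for an upstream pipe: marks capitalized words as "entities".
    doc["ents"] = [w for w in doc["text"].split() if w[0].isupper()]
    return doc

def toy_textcat(doc):
    # Reads only the tokens; it never looks at doc["ents"].
    positive = {"great", "good", "love"}
    hits = sum(tok.strip(".,!?") in positive for tok in doc["tokens"])
    doc["cats"] = {"POSITIVE": hits / max(len(doc["tokens"]), 1)}
    return doc

def run_pipeline(text, components):
    # Apply each component in order, as a pipeline would.
    doc = {"text": text}
    for component in components:
        doc = component(doc)
    return doc

text = "Berlin is a great city and I love it"
with_ner = run_pipeline(text, [tokenize, fake_ner, toy_textcat])
without_ner = run_pipeline(text, [tokenize, toy_textcat])

# The classifier's scores are identical with or without the extra component.
same_scores = with_ner["cats"] == without_ner["cats"]
print(same_scores)
```

In this toy setup, the extra component only starts to matter once the classifier is rewritten to read its annotations, which is the kind of change the reply says is not present in textcat at the time of writing.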

lock · 2020-05-05T21:47:31Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

svlandeg added the "feat / textcat" (Feature: Text Classifier) and "usage" (General spaCy usage) labels Mar 30, 2020

svlandeg closed this as completed Mar 30, 2020

lock bot locked as resolved and limited conversation to collaborators May 5, 2020
