Skip to content

NER on whole paragraphs, instead of just keywords #12359

Discussion options

You must be logged in to vote

The NER model isn't really intended for longer texts like this, and this sounds like a good use case for spancat, which you could use to directly label all paragraphs in a text.

You'd need a solid working definition of "paragraph" so that you can annotate the same units as paragraphs in your training data and also suggest those exact same units for new texts with a custom suggester function for your spancat component.

I'm surprised that I didn't easily find this kind of suggester somewhere in our projects, but here's a third-party example that suggests sentences, which should be similar in practice to suggesting paragraphs:

https://github.com/thiippal/MoodCat/blob/867438444fd3c0d1cae3e680…

Replies: 1 comment 4 replies

Comment options

You must be logged in to vote
4 replies
@goonhoon
Comment options

@adrianeboyd
Comment options

@goonhoon
Comment options

@adrianeboyd
Comment options

Answer selected by goonhoon
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat / ner Feature: Named Entity Recognizer feat / spancat Feature: Span Categorizer
2 participants