Release John Snow Labs Spark-NLP 1.8.4: Chunk annotators match content by sentence, sentences include id · JohnSnowLabs/spark-nlp

This release backports a few improvements from 2.0.x to the 1.8.x branch, mainly to keep the stable branch line stable and to resolve a few serious pending issues. This makes 1.8.4 an ideal version for stable deployments.

Enhancements

CHUNK type annotators now match content within sentence bounds, which improves accuracy
CHUNK type annotators now include sentence index information in metadata, which may be used to improve matching accuracy
Doc2Chunk annotator has new params for failOnMissing, lowerCase matching, and treating startCol as token indexed
SentenceDetector and DeepSentenceDetector now have maxLength disabled by default; when it is set, splitting now happens appropriately on whitespace
SentenceDetector now includes the sentence id in metadata
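
To illustrate why matching a chunk within sentence bounds can be more accurate than matching against the whole document, here is a minimal, library-free Python sketch. It does not use or reproduce the Spark-NLP implementation: the `Sentence` type, the `split_sentences` and `find_chunk` helpers, and the `"sentence"` metadata key are all hypothetical names chosen for this example, and the splitter is deliberately naive.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class Sentence(NamedTuple):
    index: int  # sentence id (hypothetical stand-in for the id kept in metadata)
    begin: int  # offset of the first character within the document
    end: int    # offset of the last character within the document (inclusive)
    text: str


def split_sentences(document: str) -> List[Sentence]:
    """Naive splitter on ., ! and ? used only for illustration."""
    sentences: List[Sentence] = []
    for match in re.finditer(r"[^.!?]+[.!?]?", document):
        raw = match.group()
        text = raw.strip()
        if not text:
            continue
        # Skip leading whitespace so offsets point at the first real character.
        begin = match.start() + (len(raw) - len(raw.lstrip()))
        sentences.append(Sentence(len(sentences), begin, begin + len(text) - 1, text))
    return sentences


def find_chunk(sentences: List[Sentence], chunk: str, sentence_index: int) -> Optional[Dict]:
    """Search for `chunk` only inside the sentence with the given index."""
    sentence = sentences[sentence_index]
    offset = sentence.text.find(chunk)
    if offset < 0:
        return None
    begin = sentence.begin + offset
    return {
        "chunk": chunk,
        "begin": begin,
        "end": begin + len(chunk) - 1,
        # Assumed metadata layout: the sentence index stored as a string.
        "metadata": {"sentence": str(sentence_index)},
    }
```

With the document `"The tumor was small. The tumor had grown."`, a whole-document search for `tumor` always lands on the first occurrence, whereas `find_chunk(sentences, "tumor", 1)` resolves to the occurrence in the second sentence, and the sentence index travels with the result in its metadata.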
