Differences between Spacy3 and Spacy2 in NER #12246

ambuje · 2023-02-07T17:36:39Z


Hi,

I saw a large improvement with the spaCy v3 models compared to the spaCy v2 models.

Starting from the same default model (e.g. ja_core_news_lg), I get an accuracy of around 98 in spaCy v3 (training with a config file), whereas I was getting less than 90 in spaCy v2 (training with my own Python code).

I have trained models for many languages and see the same pattern: spaCy v3 gives an F1 of about 95.

Given these extraordinary results, I just want to confirm whether I am missing something, or whether anything major changed in the architecture between spaCy v2 and v3.3. Alternatively, could you help me understand why I am getting so much better accuracy with spaCy v3?

Thank you,
Ambuje

Answered by rmitsch


rmitsch (Maintainer) · 2023-02-08T13:37:23Z

Hi @ambuje, glad to hear that v3 improved your results!

It's difficult to pinpoint the exact causes of these improvements, as they also depend on your data and on how you implemented the training loop in v2. Some relevant changes from v2 to v3 include retrained models, automatic early stopping in the training loop used by v3's train CLI, tokenization changes, and different defaults.
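To illustrate the early-stopping point, here is a minimal sketch in plain Python. It is not spaCy's actual implementation. In v3 the behaviour is governed by the `patience` setting in the `[training]` block of the config, which counts training steps. This sketch counts evaluations instead, to keep it short. The function names and the dev scores are made up for illustration only.

```python
def fixed_length_training(dev_scores):
    """Hypothetical hand-written loop: run every iteration, keep the final model."""
    return len(dev_scores) - 1, dev_scores[-1]


def training_with_patience(dev_scores, patience):
    """Simplified early stopping: keep the best checkpoint and stop once
    `patience` evaluations have passed without a new best score."""
    best_step, best_score = -1, float("-inf")
    for step, score in enumerate(dev_scores):
        if score > best_score:
            best_step, best_score = step, score
        elif step - best_step >= patience:
            break  # no improvement for `patience` evaluations
    return best_step, best_score


# Made-up dev-set F1 scores: the model peaks early, then starts to overfit.
scores = [0.80, 0.86, 0.90, 0.88, 0.87, 0.85, 0.84]

last_step, last_score = fixed_length_training(scores)
best_step, best_score = training_with_patience(scores, patience=3)

print(f"fixed-length loop keeps step {last_step} with F1 {last_score:.2f}")
print(f"early stopping keeps step {best_step} with F1 {best_score:.2f}")
```

If a v2 training script ran a fixed number of iterations and saved only the last model, a difference like this could account for part of the gap. Checking both pipelines on the same held-out data with the same metric would be the way to confirm it.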
