-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong context preperation in SequenceTagger.predict() #3024
Comments
Hello @alzoubi36 thanks for noticing this - it is indeed a big problem that needs to be fixed. The sorting is done to speed-up inference, but when document context needs to be inferred on the fly, the new ordering messes up the contexts. The simplest solution would be to remove the sorting, but this would lead to slow-downs for models that don't require context. We have to think how to best address this. |
I think a hotfix could be to first call the embedding and then run the predictions, e.g.
or to just set the context beforehands:
but I agree that this is something to be fixed in the library. @alanakbik Maybe we can move that logic to the .predict method, such that the embeddings always expect the context to be set already? |
That would be a solution, however it would mean adding this logic to every predict method in Flair (the |
This is fixed by the PR above |
Hi everyone,
I finetuned a TransformerWordEmbeddings based on 'microsoft/mdeberta-v3-base' for a NER task and set use_context=True. I noticed some discrepancy between the test that happens automatically after the training (implemented by flair) and the tests that I do with SequenceTagger manually afterwards. After debugging I noticed that the sentences in a batch are being reordered in SequenceTagger.predict():
flair/flair/models/sequence_tagger_model.py
Line 465 in 5a13598
So the context that is being expanded from this batch is wrong. While the context that is being prepared in the automatic test after training is correct (no sentence reordering happens and SequenceTagger._prepare_tensors(batch) is directly called).
Can you let me know if my observation is correct or if I missed something?
The text was updated successfully, but these errors were encountered: