-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow single token to be annotated by multiple entities in test stories #10773
Conversation
and removed debug output
…raoulvm/rasa into 2.8.14-test_multi_entities_only
to pass `poetry run flake8 rasa tests --extend-ignore D`
"D415 First line should end with a period, question mark, or exclamation point" With best regards to @erohmensing ...
to work with overlapping entities.
redoing canceled test at https://github.com/RasaHQ/rasa/runs/4332261715?check_suite_focus=true
though I do not have changed that code block
…raoulvm/rasa into 2.8.14-test_multi_entities_only
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A first few changes...
The model regression tests have started. It might take a while, please be patient. Used configuration can be found in the comment. |
Great work, thanks for your effort! |
Hi @JEM-Mosig I had a look at the regression tests... The data for those tests seem to include invalid data, such as UserWarning: Misaligned entity annotation in message 'set the time zone to g. m. t.' with intent 'datetime_convert'. Make sure the start and end values of entities ([(21, 29, 'g. m. t.')]) in the training data match the token boundaries ([(0, 3, 'set'), (4, 7, 'the'), (8, 12, 'time'), (13, 17, 'zone'), (18, 20, 'to'), (21, 22, 'g'), (24, 25, 'm'), (27, 28, 't')]). Common causes:
I searches the repo for those test strings but didn't see them in |
So the warnings come from here, but these are related to other issues that also appear in the released versions. The tests fail due to out of memory problems, however. 🤦♂️ |
docs/docs/testing-your-assistant.mdx
Outdated
@@ -323,6 +344,20 @@ your trainable entity extractors are trained to recognize. | |||
Only trainable entity extractors, such as the `DIETClassifier` and `CRFEntityExtractor` are | |||
evaluated by `rasa test`. Pretrained extractors like the `DucklingHTTPExtractor` are not evaluated. | |||
|
|||
If you have multiple entity extractors in your pipeline, or use some custom extractors, | |||
it is possible that multiple entities are associated with the same token. In this case, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"multiple entities can be associated"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs changes LGTM, left on comment on a bit of phrasing that seemed awkward but isn't obligatory to change. Otherwise very clear.
Ok, looks like there is one last test failing. I'll try to fix that today. Afterwards, I'd like another review from @tayfun. In particular, I had to add two docstrings for classes that are not in the research codebase, eventhough I didn't touch them (the linter complains otherwise). |
I am not sure you'll be able to fix this worker crash by changing the code...
And it only crashed with Python 3.7 environment. |
GitHub is broken and we can't open the Files tab for reviews (it yields an Error 500). I reached out to GitHub. But if it's not fixed by tomorrow morning I'll create a separate PR to work around this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You broke Github but 💯
I'd never expected this to be such a braking change. |
It works again! (I notified Tayfun and Thomas on Slack) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only have a small (and optional) suggestion, otherwise LGTM.
This continues the work from here: #10394
Proposed changes:
Status (please check what you already did):
black
(please check Readme for instructions)