Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add warnings for wrong values of model_confidence parameter #8014

Merged
merged 34 commits into from
Feb 26, 2021

Conversation

dakshvar22
Copy link
Contributor

@dakshvar22 dakshvar22 commented Feb 22, 2021

Proposed changes:

  • Disallow usage of model_confidence=cosine and model_confidence=inner in all ML components.
  • Introduce model_confidence=linear_norm as a replacement which uses a linear normalization scheme on top of dot product similarities.

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@dakshvar22 dakshvar22 requested review from a team and joejuzl and removed request for a team February 22, 2021 13:25
@dakshvar22 dakshvar22 removed the request for review from joejuzl February 22, 2021 13:44
@dakshvar22 dakshvar22 changed the base branch from 2.2.x to 2.3.x February 22, 2021 13:44
@dakshvar22
Copy link
Contributor Author

dakshvar22 commented Feb 22, 2021

@akelad Could you review the documentation part? This PR is supposed to be merged in 2.3.x and hence it keeps cosine as a possible value in the code but raises warnings if used. The docs should basically discourage usage of cosine as a value.
In a follow up PR, cosine will be removed as a possible value altogether. That will be merged into master to be released in 2.4.

@dakshvar22 dakshvar22 requested a review from akelad February 22, 2021 14:02
Copy link
Contributor

@akelad akelad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few comments. I think it would also be good if you edit the old changelog to add a bit of a warning?

Copy link
Contributor

@Ghostvv Ghostvv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you renormalize linearly normalized inner confidences?

@dakshvar22
Copy link
Contributor Author

do you renormalize linearly normalized inner confidences?

No, the normalization outside is only done for softmax model confidences to preserve the old behaviour. Primarily because the normalization for softmax is done slightly differently where it takes ranking length into account and then normalizes, whereas for inner we want to normalize first and then apply ranking length if needed.

@dakshvar22 dakshvar22 requested a review from a team as a code owner February 26, 2021 08:09
@dakshvar22 dakshvar22 requested review from tczekajlo and removed request for a team February 26, 2021 08:09
Copy link
Contributor

@akelad akelad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few changes to docs pls

This parameter allows the user to configure how confidences are computed during inference. It can take two values:
1. `softmax`: Confidences are in the range `[0, 1]` (old behaviour and current default). Computed similarities are normalized with the `softmax` activation function.
2. `linear_norm`: Confidences are in the range `[0, 1]`. Computed dot product similarities are normalized with a linear function.
It is recommended to try using `linear_norm` as the value for `model_confidence` as it should be easier to tune thresholds for fallback then.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments here as above

- name: DIETClassifier
model_confidence: linear_norm
constrain_similarities: True
...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same edits to this whole thing as the comments I made on the changelog

```
- name: DIETClassifier
model_confidence: cosine
model_confidence: linear_norm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar edits in this paragraph - but I don't think we should take out all reference to cosine here. We should just add a warning that the below is no longer true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I added back the possibility of adding cosine but changed the recommendation to linear_norm. Added an EDIT at the end about the deprecation.

This parameter allows the user to configure how confidences are computed during inference. It can take two values:
1. `softmax`: Confidences are in the range `[0, 1]` (old behaviour and current default). Computed similarities are normalized with the `softmax` activation function.
2. `linear_norm`: Confidences are in the range `[0, 1]`. Computed dot product similarities are normalized with a linear function.
It is recommended to try using `linear_norm` as the value for `model_confidence` as it should be easier to tune thresholds for fallback then.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same edits here

dakshvar22 and others added 2 commits February 26, 2021 16:00
Docs review

Co-authored-by: Akela Drissner-Schmid <32450038+akelad@users.noreply.github.com>
@RasaHQ RasaHQ deleted a comment from github-actions bot Feb 26, 2021
@RasaHQ RasaHQ deleted a comment from github-actions bot Feb 26, 2021
@RasaHQ RasaHQ deleted a comment from github-actions bot Feb 26, 2021
@dakshvar22 dakshvar22 requested a review from akelad February 26, 2021 10:54
Copy link
Contributor

@akelad akelad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, but can you please still address the typoCI warnings? There's just a few instances of british vs american spelling

@dakshvar22 dakshvar22 merged commit 0df088c into 2.3.x Feb 26, 2021
@dakshvar22 dakshvar22 deleted the restrict_model_confidences branch February 26, 2021 13:40
@dakshvar22 dakshvar22 restored the restrict_model_confidences branch February 26, 2021 13:40
## Rasa 2.3.3 to Rasa 2.3.4

:::caution
This is a release **breaking backwards compatibility of machine learning models**.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakshvar22 Couldn't we have made that one non-breaking? Seem less than ideal to do a model breaking change in a micro release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wochinge It wasn't really possible to do in a non-breaking way because that would have meant leaving the bug around for all 2.3.x versions. Here's the slack thread about it.

@ryanblakeley
Copy link
Contributor

@dakshvar22 hey, i saw this warning and wanted to resolve it but it took some digging to understand that the change needed to be made in 3 places: TEDPolicy, DIETClassifier, and ResponseSelector. do you think the warning should clarify which pipeline/policy is triggering the warning? i wasn't able to figure it out until i came across this https://forum.rasa.com/t/model-confidence-linear-norm/42481/3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants