Semantic entropy is using probabilities greater than 1 #223

Open
gyampols opened this issue Aug 20, 2024 · 2 comments · May be fixed by #227
Labels
bug Something isn't working

Comments

@gyampols

For semantic entropy they were using the class-wise probability as defined in the paper, $p(c \mid x) = \sum_{s \in c} p(s \mid x)$, i.e. the probability of a semantic class is the sum of the probabilities of the sequences that belong to it.
Here is an example of that calculation from the same paper.
[image: worked example from the paper]
However, the way this library calculates it, it adds up all of the sampled texts without taking into account that sampled texts often repeat, which gives you probabilities greater than 1.
For example, let's say the model produces 5 outputs, ['Paris', 'Paris', 'Paris', 'Its Paris', 'London'], with likelihoods [0.6, 0.6, 0.6, 0.3, 0.1]. The way this library calculates it, the probability of the first class is 0.6 + 0.6 + 0.6 + 0.3 = 2.1 and the second class is 0.1. But how can the first class have a probability greater than 1? It shouldn't, because only non-repeating outputs should be added together. Since the first three outputs are identical, the class probabilities should be 0.9 and 0.1.
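To make the arithmetic concrete, here is a minimal sketch of that example (the class names and the dedup-by-string step are my own illustration, not code from the library):

```python
import numpy as np

# Sampled outputs and their sequence probabilities from the example above.
outputs = ['Paris', 'Paris', 'Paris', 'Its Paris', 'London']
probs = np.array([0.6, 0.6, 0.6, 0.3, 0.1])

# Semantic classes: the first four samples mean the same thing, the last one differs.
classes = {'Paris-class': [0, 1, 2, 3], 'London-class': [4]}

# Current behaviour: sum over every sample in a class, repeats included.
current = {name: probs[idx].sum() for name, idx in classes.items()}
print(current)   # roughly {'Paris-class': 2.1, 'London-class': 0.1} -- "probability" > 1

# Expected behaviour: count each distinct output string only once per class.
expected = {}
for name, idx in classes.items():
    per_text = {outputs[i]: probs[i] for i in idx}  # one entry per distinct text
    expected[name] = sum(per_text.values())
print(expected)  # roughly {'Paris-class': 0.9, 'London-class': 0.1}
```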

In the code, you can see it in `semantic_entropy.py` inside the estimators folder:
```python
for i in range(len(hyps_list)):
    class_likelihoods = [
        np.array(loglikelihoods_list[i])[np.array(class_idx)]
        for class_idx in self._class_to_sample[i]
    ]
    class_lp = [
        np.logaddexp.reduce(likelihoods)
        for likelihoods in class_likelihoods
    ]
    if log_weights[i] is None:
        log_weights[i] = [0 for _ in hyps_list[i]]
    semantic_logits[i] = -np.mean(
        [
            class_lp[self._sample_to_class[i][j]] * np.exp(log_weights[i][j])
            for j in range(len(hyps_list[i]))
        ]
    )
```
The `class_lp` computation sums the likelihoods of all outputs in each class instead of only the unique outputs in each class.
This means that the more outputs you generate, the larger the estimated uncertainty gets.
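
One way to correct this would be to collapse repeated hypotheses inside each class before the log-sum-exp. The sketch below is only an illustration of that idea, not the actual patch in #227; the function name `class_log_probs` and the deduplication by exact string match are my assumptions:

```python
import numpy as np

def class_log_probs(hyps, loglikelihoods, class_to_sample):
    """Log-probability of each semantic class, counting each distinct
    hypothesis in a class only once before the log-sum-exp.

    hyps            : list[str]       -- sampled texts for one input
    loglikelihoods  : list[float]     -- log p(s | x) for each sample
    class_to_sample : list[list[int]] -- sample indices grouped by semantic class
    """
    class_lp = []
    for class_idx in class_to_sample:
        # Keep one log-likelihood per distinct hypothesis string in the class.
        unique = {hyps[j]: loglikelihoods[j] for j in class_idx}
        class_lp.append(np.logaddexp.reduce(list(unique.values())))
    return class_lp

# The example from above: the three identical 'Paris' samples collapse to one.
hyps = ['Paris', 'Paris', 'Paris', 'Its Paris', 'London']
logl = np.log([0.6, 0.6, 0.6, 0.3, 0.1]).tolist()
print(np.exp(class_log_probs(hyps, logl, [[0, 1, 2, 3], [4]])))  # roughly [0.9, 0.1]
```

With deduplication, generating more samples no longer inflates the class probabilities, so the estimate stays bounded.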

@rvashurin
Collaborator

Hi, @gyampols! You are right, we will release a fix for this shortly.

Thanks for your report!

@rvashurin rvashurin self-assigned this Aug 22, 2024
@rvashurin rvashurin added the bug Something isn't working label Aug 22, 2024
@IINemo
Owner

IINemo commented Aug 23, 2024

@gyampols, thank you for your report. Indeed, we use the official implementation from the original paper, and that approach has the "bug" you pointed out. The question is whether this "bug" leads to better results or not.

We will implement the "corrected" version in addition to the original one and conduct the experiments. If you have already tested the "corrected" version, please share the results.

@rvashurin rvashurin linked a pull request Aug 26, 2024 that will close this issue