Mra models have nan in output last_hidden_states #29373

Closed
4 tasks
zucchini-nlp opened this issue Feb 29, 2024 · 1 comment

Comments

@zucchini-nlp
Member

System Info

While adding tests for batching equivalence (PR), I found that the Mra model produces NaN values in its outputs. I traced this to the exact place where it happens: after the SparseDenseMatmul in this line, the tensors start containing NaN values.

When running the tests for Mra, some configurations have to be tweaked, as noted in this comment.
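One way to narrow down where NaN values first appear is to register forward hooks on every submodule and record the first one whose output contains NaN. The snippet below is only a minimal sketch of that idea on a toy model: `find_first_nan_module` and `DivideByInput` are hypothetical names, not part of the Mra code. Hooks only see `nn.Module` outputs, so an op called as a plain function inside a module's `forward` would be attributed to the enclosing module.

```python
import torch
from torch import nn


def find_first_nan_module(model: nn.Module, *inputs):
    """Return the name of the first submodule whose output contains NaN, or None."""
    first_nan = []
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            # Only plain tensor outputs are checked in this sketch;
            # tuple or dict outputs are ignored.
            if (
                not first_nan
                and isinstance(output, torch.Tensor)
                and torch.isnan(output).any()
            ):
                first_nan.append(name)

        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module, whose name is the empty string
            handles.append(module.register_forward_hook(make_hook(name)))

    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()

    return first_nan[0] if first_nan else None


class DivideByInput(nn.Module):
    """Hypothetical stand-in for an op that can emit NaN (0 / 0)."""

    def forward(self, x):
        return x / x


toy = nn.Sequential(nn.Identity(), DivideByInput(), nn.Identity())
culprit = find_first_nan_module(toy, torch.zeros(2, 3))
print(culprit)  # name of the submodule that first produced NaN
```

Because hooks on nested modules fire as each inner `forward` returns, the recorded name is the innermost module that first produced NaN.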

Who can help?

@amyeroberts, tagging you here for tracking

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Model tests

Expected behavior

Find out why the outputs contain NaN values.

@amyeroberts
Collaborator

Thanks for opening this issue, @zucchini-nlp, and for digging into it in depth, in particular for linking to the relevant comment from the tests.

Based on that comment, I don't think there's really much we can do here: having NaNs was a compromise for MRA in order to have a faster-running test suite. With that in mind, I think we can close the issue and just skip the batching test for MRA, linking to this issue in the reason.
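As a rough illustration of what such a skip could look like, here is a minimal sketch using the standard `unittest.skip` decorator. The class name `MraModelTestSketch` and the method name `test_batching_equivalence` are assumptions made for this example and may not match the actual test suite.

```python
import unittest


class MraModelTestSketch(unittest.TestCase):
    # Hypothetical test class; the real test class and method names may differ.
    @unittest.skip(
        reason="MRA outputs contain NaN in the test configuration, see issue #29373"
    )
    def test_batching_equivalence(self):
        self.fail("This body never runs because the test is skipped.")


# Run the sketch in-process to confirm the test is reported as skipped.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MraModelTestSketch)
result = unittest.TestResult()
suite.run(result)
print(len(result.skipped), result.skipped[0][1])
```

Putting the issue number in the skip reason keeps the explanation visible in the test report, so the skip does not look accidental later.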
