Mra models have `nan` in output last_hidden_states #29373

zucchini-nlp · 2024-02-29T16:58:55Z

System Info

While adding tests for batching equivalence (PR), I found that the Mra model has `nan` values in its outputs. I tracked down the exact place where this happens: after the SparseDenseMatmul in this line, the tensors start containing `nan` values.
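For anyone wanting to repeat this kind of search, the idea is to check the intermediate result after every step and stop at the first one that contains `nan`. Below is a minimal, library-free sketch of that idea. The stage functions (`scale`, `unstable_step`, `shift`) and the helper `first_nan_stage` are hypothetical stand-ins, not the Mra code; with a real PyTorch model the same check would be applied to tensors, e.g. via `torch.isnan`.

```python
import math


def contains_nan(values):
    """Return True if any float in a flat list is nan."""
    return any(math.isnan(v) for v in values)


def first_nan_stage(stages, inputs):
    """Run (name, fn) stages in order and return the name of the first
    stage whose output contains nan, or None if no stage produces one."""
    current = inputs
    for name, fn in stages:
        current = fn(current)
        if contains_nan(current):
            return name
    return None


# Hypothetical stand-ins for consecutive steps of a forward pass.
def scale(xs):
    return [x * 2.0 for x in xs]


def unstable_step(xs):
    # inf - inf is nan; this imitates an overflow inside a kernel.
    return [(x * math.inf) - (x * math.inf) for x in xs]


def shift(xs):
    return [x + 1.0 for x in xs]


stages = [("scale", scale), ("unstable_step", unstable_step), ("shift", shift)]
culprit = first_nan_stage(stages, [1.0, 2.0])
print(culprit)
```

Stopping at the first offending stage matters because `nan` propagates: every later stage would also report `nan`, which hides where it was introduced.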

When running the tests for Mra, some configurations have to be tweaked, as noted in this comment.

Who can help?

@amyeroberts, tagging you here for tracking

Information

The official example scripts
My own modified scripts

Tasks

An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)

Reproduction

Model tests

Expected behavior

Find out why the outputs contain `nan`.

amyeroberts · 2024-03-04T11:10:09Z

Thanks for opening this @zucchini-nlp and digging into it in depth, in particular for linking to the relevant comment from the tests.

Based on that, I don't think there's really much we can do here: having nans was a compromise for MRA in order to have a faster-running test suite. With that in mind, I think we can close the issue and just skip the batching test for MRA, linking to this issue in the skip reason.
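As a rough illustration of the skip, here is a minimal sketch using the standard library's `unittest.skip` decorator. The class name `MraBatchingTest`, the test method name, and the reason text are assumptions for illustration, not the actual code in the repository.

```python
import unittest


class MraBatchingTest(unittest.TestCase):  # hypothetical test class
    @unittest.skip(
        reason="Mra outputs can contain nan with the fast test config, see issue #29373"
    )
    def test_batching_equivalence(self):  # assumed name of the batching test
        self.fail("never runs because the test is skipped")


# Run the test case programmatically and inspect what was skipped.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MraBatchingTest)
result = unittest.TestResult()
suite.run(result)

for test, reason in result.skipped:
    print(f"skipped {test.id()}: {reason}")
```

Putting the issue number in the reason keeps the link between the skipped test and this discussion visible in the test report.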

zucchini-nlp closed this as completed Mar 4, 2024
