
[BUG] finetune w/o lora, inference gets error: Negative code found: {codes} #370

Open
sairin1202 opened this issue Jul 10, 2024 · 5 comments
Labels
bug (Something isn't working), stale

Comments

@sairin1202

With a checkpoint trained without lora, loading it for inference gives Negative code found: {codes}; 0s appear in the generated codes, even though the training loss looks normal.

If I instead use a model trained with lora and then merged, everything works fine.

sairin1202 added the bug label on Jul 10, 2024
@yy524

yy524 commented Jul 11, 2024

I'm running into the same problem. Have you found the cause?

@sairin1202
Author

Not yet...

@Jielin-Qiu

Same issue here. Pretraining from scratch or finetuning (not lora) on a new dataset --> Negative code found: {codes}

So far:

  • official weight --> ok
  • official weight + lora --> ok
  • our pretrain weight --> error
  • our pretrain weight + lora --> error
  • official weight + finetune on our data (not lora) --> error

Some examples:

  • Pretrain from scratch - Step 50k: Audio
  • Finetune (not lora) - Step 10k: Audio

I am able to detect that there are indeed negative values:
Negative values found: tensor([-1, -1, -1, -1, -1, -1, -1, -1], device='cuda:0', dtype=torch.int32)

Not sure if it is an error in the training pipeline; my guess is that -1 shouldn't be decoded as output?
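For reference, a minimal sketch of that check (the tensor contents and the treatment of -1 as a padding/EOS sentinel are assumptions, not the project's actual pipeline):

```python
import torch

# Illustrative stand-in for the generated semantic codes ([codebooks, time]);
# the real tensor comes from the model's sampling step.
codes = torch.tensor([[12, 87, -1, -1],
                      [34, 56, -1, -1]], dtype=torch.int32)

negative = codes < 0
if negative.any():
    print("Negative values found:", codes[negative])
    # Guess/workaround only: treat -1 as padding/EOS and drop those frames
    # instead of handing them to the codec decoder.
    keep = (codes >= 0).all(dim=0)
    codes = codes[:, keep]

print("Codes passed to the decoder:", codes)
```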

@liulangdeyeshou

I ran into the same problem a few days ago. It went away after I changed n_local_heads in config.json. My guess is that after I changed the network structure, some parameters no longer matched, which corrupted the inference results. I noticed this while using lora to re-finetune pretrain weights I had trained myself. You could try checking the parameters in your config.json.
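A rough sketch of that kind of sanity check (the file paths, config keys, and the `attn` name filter are assumptions about the layout, not the project's actual API):

```python
import json
import torch

# Assumed file names and key names for illustration; adjust to your own layout.
with open("checkpoints/config.json") as f:
    config = json.load(f)

ckpt = torch.load("checkpoints/model.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt

# Head-related fields (e.g. n_local_heads) are what went stale in this case.
print({k: v for k, v in config.items() if "head" in k or "dim" in k})

# Attention projection shapes depend on the head counts, so a mismatched
# config usually shows up here as an unexpected tensor shape.
for name, tensor in state_dict.items():
    if "attn" in name:
        print(name, tuple(tensor.shape))
```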


This issue is stale because it has been open for 30 days with no activity.

github-actions bot added the stale label on Sep 16, 2024