Very low GPU utilization when fine-tuning Qwen1.5-MoE-A2.7B-Chat #275
Comments
During deployment I hit "CUDA extension not installed", and inference is extremely slow. Has anyone found a fix?
Possibly a mismatch between your environment and the CUDA version, or possibly insufficient GPU memory.
Could you share a JSONL dataset file used for finetuning?
I ran into a similar problem. LoRA SFT using the llama-factory training framework; environment info is as follows:
This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
On this one, I found a situation that produces this behavior. Setting a unified random seed fixed it.
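A minimal sketch of what "setting a unified random seed" could look like in a Transformers-based training script (the commenter does not say which entry point they used; `transformers.set_seed` seeds Python, NumPy, and PyTorch RNGs, and the seed value below is illustrative):

```python
# Sketch: seed all RNGs before building the model, dataset, and dataloaders,
# using the same value on every process/rank so shuffling and init stay consistent.
from transformers import set_seed

set_seed(42)  # seeds random, numpy, and torch (CPU + CUDA)
```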
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
LoRA instruction fine-tuning with DeepSpeed set to ZeRO-2; GPU utilization sits at roughly 30%~40%, even though I have already set output_router_logits=True in AutoConfig. Non-MoE models behave normally. Runtime environment:

Besides the low utilization, there was another problem earlier: training Qwen1.5-MoE-A2.7B-Chat got stuck at around 80+ steps, with GPU utilization suddenly jumping to 99% and staying there. The runtime environment was identical except that output_router_logits=True had not been set. After setting output_router_logits=True, training ran normally.
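For reference, a minimal sketch (assuming a standard Transformers setup; the dtype and device settings are illustrative assumptions, not from the original post) of passing output_router_logits=True through AutoConfig so the MoE router auxiliary loss is computed during fine-tuning:

```python
# Sketch: enable router-logits output for Qwen1.5-MoE before loading the model.
# With output_router_logits=True, the load-balancing auxiliary loss is added
# to the training loss, which keeps expert routing from collapsing.
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "Qwen/Qwen1.5-MoE-A2.7B-Chat"

config = AutoConfig.from_pretrained(model_name)
config.output_router_logits = True

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype=torch.bfloat16,  # illustrative; match your training precision
)
```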