How to develop #4

Open
wangshuai09 opened this issue Jun 27, 2024 · 7 comments


@wangshuai09
Collaborator

wangshuai09 commented Jun 27, 2024

# Fork the official repository https://github.com/ggerganov/llama.cpp.git and clone your fork locally
git clone git@github.com:{your_own}/llama.cpp.git

# Enter the project and create a personal development branch from master
cd llama.cpp
git checkout -b local_npu_support

# Build
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=debug -DLLAMA_CANN=on && make -j32

# Single-operator precision test
./bin/test-backend-ops test -b CANN0 -o {OP_NAME}
# e.g.
./bin/test-backend-ops test -b CANN0 -o CONT

# Single-operator performance test (the performance test does not check precision)
./bin/test-backend-ops perf -b CANN0 -o {OP_NAME}

# Model inference
./bin/llama-cli -m /home/wangshuai/models/hermes_gguf/Hermes-2-Pro-Llama-3-8B-F16.gguf -p "Building a website can be done in 10 simple steps:" -ngl 32 -sm none -mg 0 -t 0
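
When several operators need checking, the single-operator commands above can be generated rather than typed one by one. The following is a minimal sketch, not part of the project: the helper name `build_op_commands` is hypothetical, and `ADD` is only used as a second example operator name. It builds the argument lists and prints them; actually running them (for example with `subprocess.run`) is left to the reader.

```python
import shlex


def build_op_commands(ops, backend="CANN0", mode="test",
                      binary="./bin/test-backend-ops"):
    """Return one argument list per operator, mirroring the commands above.

    `mode` is "test" (precision) or "perf" (performance), as in the thread.
    The function name and defaults are illustrative assumptions.
    """
    if mode not in ("test", "perf"):
        raise ValueError("mode must be 'test' or 'perf'")
    return [[binary, mode, "-b", backend, "-o", op] for op in ops]


if __name__ == "__main__":
    # Print shell-ready command lines for two example operators.
    for cmd in build_op_commands(["CONT", "ADD"]):
        print(shlex.join(cmd))
```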

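Contributing code to the official repository

Developers are welcome to contribute code for model support and device support. For the list of currently supported models and devices, see ggerganov#8867.
Please prefix the PR title with [CANN] and write commit messages in the format cann: commit message. Reviewers: @hipudding, @wangshuai09
For details, see ggerganov#8822.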
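
The naming convention above can be checked locally before opening a PR. This is a minimal sketch under assumptions: the function names are hypothetical, and the exact spacing after the `[CANN]` and `cann:` prefixes is a guess based on the wording of the comment, not a rule taken from the upstream repository.

```python
def check_pr_title(title: str) -> bool:
    """True if the title starts with "[CANN]" and has text after the prefix."""
    prefix = "[CANN]"
    return title.startswith(prefix) and title[len(prefix):].strip() != ""


def check_commit_message(message: str) -> bool:
    """True if the first line looks like "cann: <description>".

    Assumption: a single space follows the colon.
    """
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    prefix = "cann: "
    return first_line.startswith(prefix) and first_line[len(prefix):].strip() != ""


if __name__ == "__main__":
    print(check_pr_title("[CANN] Add operator support"))
    print(check_commit_message("cann: add operator support"))
```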

@Starrylun

Starrylun commented Jul 22, 2024

Ascend 310
Problem:
I cannot use the CANN NPU to run a GGUF Qwen 0.5B model.
Hello, I'd like to ask how to deploy qwen0.5b.gguf on an Ascend 310 device using the NPU.

My device info is:

+--------------------------------------------------------------------------------------------------------+
| npu-smi 23.0.0                                   Version: 23.0.0                                       |
+-------------------------------+-----------------+------------------------------------------------------+
| NPU     Name                  | Health          | Power(W)     Temp(C)           Hugepages-Usage(page) |
| Chip    Device                | Bus-Id          | AICore(%)    Memory-Usage(MB)                        |
+===============================+=================+======================================================+
| 2       310                   | OK              | 12.8         53                0     / 969           |
| 0       0                     | 0000:1A:00.0    | 0            594  / 7759                             |
+-------------------------------+-----------------+------------------------------------------------------+


root@vanrui:/home/data/A000Files/A002AI-LMdeploy/ascend-llama.cpp/llama.cpp# cat /usr/local/Ascend/driver/version.info
Version=23.0.0
ascendhal_version=7.35.19
aicpu_version=1.0
tdt_version=1.0
log_version=1.0
prof_version=2.0
dvppkernels_version=1.1
tsfw_version=1.0
Innerversion=V100R001C15SPC002B224
compatible_version=[V100R001C29],[V100R001C30],[V100R001C13],[V100R001C15]
compatible_version_fw=[7.0.0,7.1.99]
package_version=23.0.0

I ran test-backend-ops, but got this:

(qwenCpp) root@vanrui:/home/data/A000Files/A002AI-LMdeploy/ascend-llama.cpp/llama.cpp/build# ./bin/test-backend-ops perf -b CANN0 -o {OP_NAME}
ggml_backend_register: registered backend CPU
Testing 1 backends

Backend 1/1 (CPU)
  Skipping
1/1 backends passed
OK

After the test finishes, only the CPU device is listed and there is no NPU device, so the NPU cannot be used for acceleration.
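
The symptom described here (only the CPU backend is registered) can be detected automatically from the tool's output. The sketch below is illustrative only: it assumes registration lines always look like the `ggml_backend_register: registered backend CPU` line quoted above, and that a CANN device would appear with a name starting with `CANN` (such as the `CANN0` used on the command line). Neither assumption is confirmed by the log in this thread.

```python
import re

# Pattern taken from the one registration line visible in the quoted log.
REGISTERED = re.compile(r"^ggml_backend_register: registered backend (\S+)$")

# The output quoted in the comment above, reproduced as sample input.
SAMPLE_LOG = """ggml_backend_register: registered backend CPU
Testing 1 backends

Backend 1/1 (CPU)
  Skipping
1/1 backends passed
OK
"""


def registered_backends(log_text):
    """Return the backend names found in registration lines, in order."""
    names = []
    for line in log_text.splitlines():
        match = REGISTERED.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def has_cann_backend(log_text):
    """True if any registered backend name starts with "CANN" (assumed naming)."""
    return any(name.startswith("CANN") for name in registered_backends(log_text))


if __name__ == "__main__":
    print(registered_backends(SAMPLE_LOG))
    print(has_cann_backend(SAMPLE_LOG))
```

For the log in this comment the check reports no CANN backend, which matches the observation that only the CPU was tested.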

@hipudding
Owner

The 310 and 910 interfaces still differ in some ways. Support for the 310 may be considered later; for now it cannot be used.

@Starrylun

Thanks for the reply 👍

@zhaohengxing

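Hello, I am deploying qwen2-1.5b-fp16.gguf on an ascend310P3 device using the NPU, and the output is always garbled.
Current status: the program compiles and the NPU operators are enabled, but test-backend-ops reports precision errors for some operators.
Since the cause is most likely a precision problem in individual operators, I plan to try debugging and fixing them.
Could you give some concrete guidance on where to start and which key points to watch for?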

@wangshuai09
Collaborator Author

wangshuai09 commented Aug 14, 2024

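@zhaohengxing, incorrect precision may mean that some operators do not yet support the cases qwen2 uses, or that the wrong operator is called in some special cases. It is also possible that the interfaces differ between chip series and need to be changed.

  1. Confirm which operators qwen2 calls;
  2. Run the test cases for those operators;
  3. Fix the operator precision problems; you can construct simple cases yourself to test the fixes;
  4. Run the model. If precision problems remain, place the model's last layer on the NPU and compare with the CPU results to see after which operator the precision problem appears.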
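
The comparison in step 4 can be sketched as follows. This is an illustration under stated assumptions, not the project's tooling: it assumes the per-operator outputs from the CPU and NPU runs have already been dumped into plain Python lists (how to dump them is not covered here), it uses a normalized mean squared error as the comparison metric, and the threshold of 1e-3 and the names `nmse` and `first_divergent_op` are arbitrary choices for this example.

```python
def nmse(reference, candidate):
    """Normalized mean squared error: sum((r - c)^2) / sum(r^2)."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same length")
    err = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    norm = sum(r * r for r in reference)
    if norm == 0.0:
        # All-zero reference: any non-zero error counts as divergence.
        return 0.0 if err == 0.0 else float("inf")
    return err / norm


def first_divergent_op(trace, threshold=1e-3):
    """Return the name of the first operator whose outputs diverge, else None.

    `trace` is a list of (op_name, cpu_output, npu_output) tuples in
    execution order; this layout is a hypothetical one for the sketch.
    """
    for name, cpu_out, npu_out in trace:
        if nmse(cpu_out, npu_out) > threshold:
            return name
    return None


if __name__ == "__main__":
    # Made-up values purely to exercise the functions.
    example_trace = [
        ("RMS_NORM", [1.0, 2.0], [1.0, 2.0]),
        ("MUL_MAT", [1.0, 2.0], [1.0, 3.0]),
    ]
    print(first_divergent_op(example_trace))
```

Walking the trace in execution order matters: once one operator produces a wrong result, every later operator will also look wrong, so only the first mismatch points at the likely culprit.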

@hipudding
Owner

hipudding commented Aug 14, 2024

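@zhaohengxing I suggest submitting the 310P adaptation code as a PR so we can look at it together. For the test-case precision issues, the failing operators need to be debugged individually; once they are fixed, inference should work normally.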

@zhaohengxing

Thanks a lot! I'll try to track down the problem first.
