support SYCL backend windows build #5208
Conversation
compilelog.txt
Using device 0 (Intel(R) Arc(TM) A770 Graphics) as main device
ggml_vulkan: Using Intel(R) Arc(TM) A770 Graphics | fp16: 1 | warp size: 32
SYCL FP32. I want to note that during the benchmark, while running the SYCL pass, the computer remained very responsive, unlike Vulkan, which caused increased fan noise and mouse cursor freezes. It seems the load was higher in the latter case.
Update: unfortunately, when using SYCL, all models generate gibberish.
That's happening on Linux too. Not sure when it broke, but it did. That said, performance at least looks similar to Linux, which is good.
Hi @Jacoby1218, I don't know the standard test method. I ran some cases manually:
gta@DUT109DG2MRB:~/llama.cpp/build$ GGML_SYCL_DEVICE=0 ./bin/main -m ~/llama-2-7b.Q4_K_S.gguf -p "Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun" -n 128 -e -ngl 33 --no-mmap
...
Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun. She was never quite happy with her life the way it was, because she knew that there were bigger things out there waiting for her.
The problem was that when you’re ten years old, you don’t know how to find these adventures on your own, or how to ask for them. She often told stories of what she wanted and needed, but no one listened. They just said “Oh, you’ll be fine”, or “You have a good life here.”
That was what they all thought: that she had a good life here, with her friends in the neighborhood, with the games and to
llama_print_timings: load time = 10103.39 ms
llama_print_timings: sample time = 21.70 ms / 128 runs ( 0.17 ms per token, 5899.71 tokens per second)
llama_print_timings: prompt eval time = 757.28 ms / 33 tokens ( 22.95 ms per token, 43.58 tokens per second)
llama_print_timings: eval time = 5357.27 ms / 127 runs ( 42.18 ms per token, 23.71 tokens per second)
llama_print_timings: total time = 6189.14 ms / 160 tokens
Log end
Can you share how to reproduce?
Thanks for sharing! What was the input in your test case that produced gibberish?
Please clean the compile environment before trying again.
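A minimal sketch of what cleaning the compile environment can mean in practice, assuming a Linux build tree and the SYCL build flags from this era of the project (`LLAMA_SYCL`, `icx`/`icpx`); flag and compiler names may differ in later versions:

```shell
# Remove the stale CMake cache and build tree so compiler/ccache
# settings from a previous configuration do not leak into the SYCL build.
rm -rf build

# Load the oneAPI environment, then reconfigure from scratch with the
# Intel compilers. LLAMA_SYCL and icx/icpx are assumptions based on the
# SYCL build instructions current at the time of this PR.
source /opt/intel/oneapi/setvars.sh
cmake -B build -DLLAMA_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release -j
```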
@sorasoras @characharm @Jacoby1218 I think the build works correctly; please let us know if you face compilation or build issues. Could you share the test cases you were running?
In case anyone is experiencing the cmake error
@sorasoras Ah, that's exactly what you encountered in compilelog.txt. Try again with CCACHE disabled.
Using the server yields the same result. I've tried various models with different parameter counts and quantization methods.
In the Intel oneAPI command prompt for Intel 64 for Visual Studio 2022:
/bin/main -m I:\deepseek-coder-33B-base-GGUF\deepseek-coder-7b-instruct-v1.5-Q8_0.gguf -p "Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun" -n 128 -e -ngl 33 --no-mmap
Log start
main: build = 2036 (47cba0d)
system_info: n_threads = 6 / 12 | AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Once upon a time, there existed a little girl, who liked to have adventures. She wanted to go to places and meet new people, and have fun""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
I tested with the same input and got correct output.
Was able to trigger this reliably: #5250
That works for me lol. |
confirm #5250 |
* support SYCL backend windows build
* add windows build in CI
* add for win build CI
* correct install oneMKL
* fix install issue
* fix ci
* fix install cmd
* fix win build
* restore other CI part
* restore as base
* rm no new line
* fix no new line issue, add -j
* fix grammer issue
* allow to trigger manually, fix format issue
* fix format
* add newline
* fix format issuse

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
Support SYCL backend windows build.
Update the guide for windows build & usage.
Add CI for Windows SYCL build.
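The Windows build added by this PR can be sketched as follows. This is a hedged outline assuming the Intel oneAPI Base Toolkit (DPC++ compiler plus oneMKL) is installed and the commands are run from the "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022" mentioned above; the `LLAMA_SYCL` flag and compiler names reflect this era of the project and may differ later:

```shell
:: Configure with the Intel DPC++ compiler (icx) from the oneAPI prompt.
:: The Ninja generator avoids MSBuild issues with non-MSVC compilers.
cmake -B build -G "Ninja" -DLLAMA_SYCL=ON ^
      -DCMAKE_C_COMPILER=cl -DCMAKE_CXX_COMPILER=icx ^
      -DCMAKE_BUILD_TYPE=Release
cmake --build build -j

:: Select the SYCL device (the Arc A770 showed up as device 0 above)
:: and run a short generation as a smoke test. Model path is illustrative.
set GGML_SYCL_DEVICE=0
build\bin\main.exe -m llama-2-7b.Q4_K_S.gguf -p "Hello" -n 32 -e -ngl 33 --no-mmap
```

Gibberish output here (as reported in this thread) pointed at a kernel bug rather than a build problem, later addressed in #5250.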