llama-cpp: pull upstream flake changes #289513
Conversation
Force-pushed from 18cefab to b71cce1
# Should likely use `rocmPackages.clr.gpuTargets`.
"-DAMDGPU_TARGETS=gfx803;gfx900;gfx906:xnack-;gfx908:xnack-;gfx90a:xnack+;gfx90a:xnack-;gfx940;gfx941;gfx942;gfx1010;gfx1012;gfx1030;gfx1100;gfx1101;gfx1102"
The comment is from the upstream flake; I guess it's OK to just "pull those changes", but consider following up on the suggestion from the comments in a separate commit.
The `xnack-` present on some of these would make it difficult to abstract away.

Happy to follow up with a separate PR when this has been resolved exactly and tested.
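A minimal sketch of what the suggested abstraction could look like, assuming `rocmPackages.clr.gpuTargets` is a plain list of gfx names (untested; the `xnack` variants above are exactly what it cannot express):

```nix
# Hypothetical: derive the target list instead of hardcoding it. The
# xnack+/xnack- suffixed variants would still have to be appended by hand,
# which is the difficulty mentioned in this thread.
"-DAMDGPU_TARGETS=${lib.concatStringsSep ";" rocmPackages.clr.gpuTargets}"
```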
Force-pushed from b71cce1 to b48e8c3
Force-pushed from b48e8c3 to 0fb62fd
See also #287554 -- I think there might be some overlap.

Thank you for letting me know; I'll let them know.
Thanks for asking me about this PR @happysalada, and about moving forward with this instead of mine. If you are making such significant changes to a package, please give a short written explanation somewhere as part of the PR which explains the reasons for and the benefits of that change. Saying discussions have been had with person X and Y helps a bit, but by itself it maybe makes the process feel even more opaque to readers of the PR, which I don't think should be acceptable. When I look at this PR I feel kind of left/locked out of the process. As is, by getting rid of that custom installation step,

The other side of that issue of renaming binaries is that with what's currently on master it's actually really annoying to switch between something like ollama with llama.cpp from the upstream flake and nixpkgs, because of the different binary names. If you think about renaming all of the outputs, it would be great if you could look into addressing that issue as well.
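For illustration, an untested user-side sketch of papering over the name difference (the package and binary names are assumptions based on this PR, not something it ships):

```nix
# Wrap the nixpkgs package and re-expose the upstream binary name, so tools
# that expect `main` keep working. symlinkJoin gives a real bin/ directory
# of symlinks that one extra link can be added to.
pkgs.symlinkJoin {
  name = "llama-cpp-upstream-names";
  paths = [ pkgs.llama-cpp ];
  postBuild = "ln -s ${pkgs.llama-cpp}/bin/llama $out/bin/main";
}
```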
# upstream plans on adding targets at the cmake level; remove these
# additional steps after that
postInstall = ''
  mv $out/bin/main $out/bin/llama
this breaks `meta.mainProgram`
> really annoying to switch between something like ollama with llama.cpp from the upstream flake and nixpkgs, because of the different binary names

To be fair, it was upstream's choice to use non-composable/generic names (these won't do for nixpkgs, imo). 👍🏻 that the name should be consistent with whatever `meta.mainProgram` is kept.
How about this?

-mv $out/bin/main $out/bin/llama
+mv $out/bin/main $out/bin/${finalAttrs.finalPackage.meta.mainProgram}
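For context, a minimal sketch (values assumed from this PR, not the actual diff) of why the rename and `meta.mainProgram` have to stay in sync:

```nix
# Consumers resolve the main binary with lib.getExe, which reads
# meta.mainProgram; renaming bin/main without updating this attribute
# leaves lib.getExe pointing at a file that no longer exists.
meta.mainProgram = "llama";
# lib.getExe llama-cpp  ->  "${llama-cpp}/bin/llama"
```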
> you're removing the /lib and /include folders

I didn't notice that, where? Libraries and headers are installed by CMake.
I built it locally and went by the output of `tree` to check. Embarrassingly I read that wrong. I will update my comment.

Actually the previous version commit only has `/bin`, and additionally `/lib` for `libggml_shared.so` and `libllama.so` with `.override { static = false; }`. This PR gets rid of the flag, has some extra stuff in `lib`, and adds an `/include`:
result
├── bin
│ ├── baby-llama
│ ├── batched
│ ├── batched-bench
│ ├── beam-search
│ ├── benchmark
│ ├── convert-llama2c-to-ggml
│ ├── convert-lora-to-ggml.py
│ ├── convert.py
│ ├── embedding
│ ├── export-lora
│ ├── finetune
│ ├── gguf
│ ├── imatrix
│ ├── infill
│ ├── llama
│ ├── llama-bench
│ ├── llama-server
│ ├── llava-cli
│ ├── lookahead
│ ├── lookup
│ ├── parallel
│ ├── passkey
│ ├── perplexity
│ ├── quantize
│ ├── quantize-stats
│ ├── save-load-state
│ ├── simple
│ ├── speculative
│ ├── test-autorelease
│ ├── test-backend-ops
│ ├── test-grad0
│ ├── test-grammar-parser
│ ├── test-llama-grammar
│ ├── test-model-load-cancel
│ ├── test-quantize-fns
│ ├── test-quantize-perf
│ ├── test-rope
│ ├── test-sampling
│ ├── test-tokenizer-0-falcon
│ ├── test-tokenizer-0-llama
│ ├── test-tokenizer-1-bpe
│ ├── test-tokenizer-1-llama
│ ├── tokenize
│ └── train-text-from-scratch
├── include
│ ├── ggml-alloc.h
│ ├── ggml-backend.h
│ ├── ggml.h
│ └── llama.h
└── lib
├── cmake
│ └── Llama
│ ├── LlamaConfig.cmake
│ └── LlamaConfigVersion.cmake
├── libggml_shared.so
├── libllama.so
└── libllava_shared.so
Tbh I've only checked the upstream flake so far. I'll have a look at both of the PRs during the week.
> The other side of that issue of renaming binaries is that with what's currently on master it's actually really annoying to switch between something like ollama with llama.cpp from the upstream flake and nixpkgs, because of the different binary names

That's an interesting thought 👍🏻. Since we in principle can contribute both to the nixpkgs and to the upstream nix expressions, we could expose all of the relevant metadata in the `passthru` (or even in the outputs if needed) to accommodate a smooth transition.
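A hedged sketch of what that could look like (the `renamedBinaries` attribute name is made up for illustration):

```nix
# Record the upstream -> nixpkgs binary renames so downstream expressions
# (ollama wrappers, user scripts) can translate between the two layouts.
passthru.renamedBinaries = {
  main = "llama";
  server = "llama-server";
};
```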
> put the package on their path, since it contains binaries with names like lookup and simple

To ensure this doesn't bite us in the future, we could propose a PR upstream adding a cmake option that prefixes all of the installed binaries (the default prefix being the empty string).
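If upstream grew such an option (the name `LLAMA_BINARY_PREFIX` is hypothetical), the nixpkgs side would then reduce to a single flag:

```nix
cmakeFlags = [ (lib.cmakeFeature "LLAMA_BINARY_PREFIX" "llama-") ];
# renders as -DLLAMA_BINARY_PREFIX:STRING=llama-
```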
That's a good idea. 👍
ggerganov/llama.cpp#5311 has been merged and Vulkan works on macOS now.
Please integrate the changes from ggerganov/llama.cpp#5311, then we can test on x86_64-linux, x86_64-darwin, and aarch64-darwin, ensure it works, then check it in.
https://github.com/NixOS/nixpkgs/pull/289513/files#r1493855899 also needs to be addressed in the nixpkgs context.
I'm maybe not following, but why should integrating the Vulkan PR be a blocker for merging this? I wouldn't mind if this was a separate PR, depending on whether it's easier to pull this together with or without the Vulkan enhancement.

@mschwaig's concerns do need to be resolved, indeed. Btw, @happysalada do you still have time for this?
My goal is that macOS Nix users have the ability to get the accelerated Vulkan build. As @mschwaig notes, with the current revision selected, Vulkan needs a restriction that gates it to Linux only. Since this PR doesn't set

Super simple stuff, I think and hope.
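A sketch of such a gate, assuming the `vulkanSupport` flag from this package and `lib` in scope (untested):

```nix
# Mark Vulkan-enabled builds as unavailable on Darwin until the upstream
# fix (ggerganov/llama.cpp#5311) is pulled in.
meta.badPlatforms = lib.optionals vulkanSupport lib.platforms.darwin;
# or, more bluntly:
meta.broken = vulkanSupport && effectiveStdenv.isDarwin;
```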
I'm happy to make the update at the same time; I should have time for this Saturday morning. Recently the day job is crazy during the week.
Force-pushed from 0fb62fd to 86e127d
Alright, I've updated to the latest 2252, rephrased the PR text, and added Vulkan to the non-broken logic. If anyone has any other comments on this PR, please let me know. Separately, I have a question for @SomeoneSerge.
Requesting changes for the `meta.mainProgram` thing and the `meta.broken` and `meta.badPlatforms` keys; a bunch of other super small things are also noted.
cuda_cccl.dev # <nv/target>

# A temporary hack for reducing the closure size, remove once cudaPackages
# have stopped using lndir: https://github.com/NixOS/nixpkgs/issues/271792
It's on the Roadmap 🔮!
in
effectiveStdenv.mkDerivation (finalAttrs: {
  pname = "llama-cpp";
-  version = "2249";
+  version = "2252";
Wow, we're already at 2275 just over the weekend 👀. No need to adjust, but dang, this is why flakes enable Nix support in fast-moving repositories!
postPatch = ''
  substituteInPlace ./ggml-metal.m \
    --replace '[bundle pathForResource:@"ggml-metal" ofType:@"metal"];' "@\"$out/bin/ggml-metal.metal\";"
'';
Unrelated to your change: this is such an odd line and I wonder why the Nix build needs it. I'm no Metal-head, though.
buildInputs = optionals effectiveStdenv.isDarwin darwinBuildInputs
  ++ optionals cudaSupport cudaBuildInputs
  ++ optionals mpiSupport mpi
  ++ optionals openclSupport [ clblast ]
  ++ optionals rocmSupport rocmBuildInputs
  ++ optionals vulkanSupport vulkanBuildInputs;
Gorgeous.
@@ -146,7 +156,7 @@ effectiveStdenv.mkDerivation (finalAttrs: {
  license = licenses.mit;
The description also wants to be updated from the upstream: "Inference of LLaMA model in pure C/C++". I like the `pnameSuffix` and `descriptionSuffix` things that upstream does as well, which I note you chose not to port over.
I was a little lazy and wasn't 100% sure it was going to be accepted, so I wanted to make the smallest change possible. How about we do that in a follow-up PR?
Yeah, I have no strong feelings about it being in this PR.
Force-pushed from 86e127d to 0317300
Alright, I went through the comments, fixed what was easy/simple to fix, and brought the update to 2294.
Result of `nixpkgs-review pr 289513` run on x86_64-linux:

1 package built:
- llama-cpp
@@ -146,7 +156,7 @@ effectiveStdenv.mkDerivation (finalAttrs: {
  license = licenses.mit;
> Yeah, I have no strong feelings about it being in this PR.
Thanks @GrahamcOfBorg, I'm still saying yes.
@SomeoneSerge @philiptaron if either of you has the time/availability to add a `.pc` file to llama-cpp, I'll be happy to try to propose using pkg-config to ollama. Currently ollama is broken on nixpkgs because it wants the lib files at specific places; changing to pkg-config would fix that.
Agree, but I am a complete novice when it comes to pkg-config.
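For reference, an untested sketch of generating such a `.pc` file in `postInstall`; the Libs/Cflags values are assumptions about what ollama would need, not something agreed on in this thread:

```nix
postInstall = ''
  mkdir -p $out/lib/pkgconfig
  # unquoted heredoc: $out expands at build time; the version is
  # Nix-interpolated from finalAttrs
  cat > $out/lib/pkgconfig/llama.pc <<EOF
  prefix=$out
  libdir=$out/lib
  includedir=$out/include
  Name: llama
  Description: Inference of LLaMA model in pure C/C++
  Version: ${finalAttrs.version}
  Libs: -L$out/lib -lllama -lggml_shared
  Cflags: -I$out/include
  EOF
'';
```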
Description of changes

I've basically copied the upstream flake. We had a discussion with @SomeoneSerge about removing the custom install step: the normal CMake install step uses the upstream install logic, whereas with a custom install step we just lose that logic and do a worse job of it.

@elohmeier

I've kept the naming of `cudaSupport` instead of upstream's `useCuda`, since it's the norm in nixpkgs, but I don't feel that strongly about it.
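Concretely, the convention kept here looks roughly like this (a sketch, not the exact diff):

```nix
{
  config,
  cudaSupport ? config.cudaSupport, # nixpkgs-wide default, vs. upstream's useCuda
  ...
}:
```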
Things done

- Built in the sandbox? (`sandbox = true` or `sandbox = relaxed` in `nix.conf`; see the Nix manual)
- Tested compilation of all packages that depend on this change using `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage.
- Tested basic functionality of all binary files (usually in `./result/bin/`).

Add a 👍 reaction to pull requests you find important.