-
Notifications
You must be signed in to change notification settings - Fork 10.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
llm : add llm_build_context #3881
Conversation
I don't know how much my opinion is really worth since my C++ ability is pretty limited (I never even touched C++ before I started contributing here). From what I've looked at/understand of it though, it looks like a good change. The only thing that seems weird to me is having the specific model build functions part of the context object. To me, a context seems like a pretty general concept while building a certain type of a model is a really specific thing. If I was making the change, what I'd probably do is just have a general build method that doesn't do anything in the base context object (or use C++ stuff to make it so it has to be supplied later) and then have the specific model types inherit from the base context type and implement whatever is necessary to build that specific type of model. Then you'd just make a |
I can see the motivation for doing this, it removes a lot of clutter from the model build functions and makes them very clear, but I also think that this also moves us further away from what I was suggesting. As you and @monatis had pointed, the current implementation could serve as the base of a functional API, from which an object oriented API could be built upon, but moving the functions into a class is a step away from that. Maybe we could keep only model build functions in |
Yeah, I had doubts about moving them inside the context as well and chose to move them just to reduce the code duplication, but it's probably not a good enough reason. Will move them back as static functions and keep just the model building functions in the context. |
Moved the helper functions outside the Will merge this later today if there are no other suggestions |
d056d62
to
78186f4
Compare
* llm : add llm_build_context * llm : deduce norm eps based on type + explict max_alibi_bias, clamp_kqv * llm : restore the non-graph llm_build_ functional API ggml-ci * llm : cleanup + comments
ref #3382
Trying to implement some ideas from:
struct llm_build_context
now wrapsllama_context
andllama_batch
and binds members to commonly used constants to avoid defining them for each graph. The non-graphllm_build_
functions (#3848) are kept as static functional-style API and is used by the graph building functions.@slaren @KerfuffleV2
Do you think this is a good change?