
llm : add llm_build_context #3881

Merged · 4 commits · Nov 1, 2023
Conversation

@ggerganov (Owner) commented Nov 1, 2023

ref #3382

Trying to implement some ideas from the referenced discussion:

struct llm_build_context now wraps llama_context and llama_batch and binds members to commonly used constants to avoid defining them for each graph. The non-graph llm_build_ functions (#3848) are kept as a static, functional-style API and are used by the graph-building functions.

@slaren @KerfuffleV2
Do you think this is a good change?

@KerfuffleV2 (Collaborator)

I don't know how much my opinion is really worth since my C++ ability is pretty limited (I had never touched C++ before I started contributing here). From what I've looked at and understand of it, though, it looks like a good change.

The only thing that seems weird to me is having the model-specific build functions be part of the context object. To me, a context is a pretty general concept, while building a particular type of model is a very specific thing.

If I were making the change, what I'd probably do is have a general build method that doesn't do anything in the base context object (or use C++ features to require it to be supplied later), and then have the specific model types inherit from the base context type and implement whatever is necessary to build that specific type of model. Then you'd just make a build_context_llama object or build_context_baichuan object and call .build() on it.

@ggerganov added the refactoring label Nov 1, 2023
@slaren (Collaborator) commented Nov 1, 2023

I can see the motivation for doing this: it removes a lot of clutter from the model build functions and makes them very clear. But I also think it moves us further away from what I was suggesting. As you and @monatis had pointed out, the current implementation could serve as the base of a functional API on which an object-oriented API could later be built, and moving the functions into a class is a step away from that.

Maybe we could keep only the model build functions in llm_build_context, with all the common member variables to reduce the clutter, and keep the helper functions separate?

@ggerganov (Owner, Author)

> Maybe we could keep only model build functions in llm_build_context, with all the common member variables to reduce the clutter, and keep the helper functions separate?

Yeah, I had doubts about moving them inside the context as well; I moved them mainly to reduce code duplication, but that's probably not a good enough reason. I will move the helpers back out as static functions and keep just the model-building functions in the context.

@ggerganov ggerganov marked this pull request as ready for review November 1, 2023 13:35
@ggerganov (Owner, Author)

Moved the helper functions outside llm_build_context; they no longer require llama_context and llama_cparams, and can optionally take llama_model, llama_hparams, llama_batch, and llama_kv_cache.

Will merge this later today if there are no other suggestions.

@ggerganov ggerganov merged commit 5033796 into master Nov 1, 2023
olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
* llm : add llm_build_context

* llm : deduce norm eps based on type + explicit max_alibi_bias, clamp_kqv

* llm : restore the non-graph llm_build_ functional API

ggml-ci

* llm : cleanup + comments