
Revert "fix: use /infill for llama.cpp code-completions (#513)" #533

Merged · 2 commits · May 8, 2024

Conversation

@PhilKes (Contributor) commented May 7, 2024

This reverts commit 8de72b3.

As discussed in #510, this reverts the switch from /completion to /infill.

…-llama-infill

# Conflicts:
#	src/main/kotlin/ee/carlrobert/codegpt/codecompletions/CodeCompletionRequestFactory.kt
@carlrobertoh carlrobertoh merged commit dcd0a3f into carlrobertoh:master May 8, 2024
2 checks passed
@CISC commented May 8, 2024

Hey, just curious why ggerganov/llama.cpp#7102 (comment) prompted this?

carlrobertoh pushed a commit that referenced this pull request May 13, 2024
@PhilKes (Contributor, Author) commented May 29, 2024

> Hey, just curious why ggerganov/llama.cpp#7102 (comment) prompted this?

We use GGUF models from HF (e.g. https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat-GGUF) which, as far as we know, do not include the special-token metadata from your PR (ggerganov/llama.cpp#7166). We therefore can't use them with /infill, and reverted to using /completion with FIM prompts.
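The approach described above can be sketched roughly like this: the client formats the FIM prompt itself and sends it to the generic /completion endpoint, rather than letting the server assemble it via /infill. This is a hedged illustration, not the project's actual code; the CodeQwen-style token names and the helper functions are assumptions chosen for the example.

```python
# Sketch: building a fill-in-the-middle (FIM) prompt client-side and
# sending it to llama.cpp's /completion endpoint instead of /infill.
# The CodeQwen-style tokens below are illustrative; each model family
# defines its own FIM tokens.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before/after the cursor in CodeQwen-style FIM tokens."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

def build_completion_payload(prefix: str, suffix: str) -> dict:
    # Payload for POST /completion; "prompt", "n_predict" and "stop"
    # follow llama.cpp's server API.
    return {
        "prompt": build_fim_prompt(prefix, suffix),
        "n_predict": 128,
        "stop": ["<fim_prefix>", "<fim_suffix>", "<fim_middle>"],
    }

payload = build_completion_payload("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
print(payload["prompt"])
```

With this scheme the server needs no model-specific token metadata; the trade-off, as discussed below, is that the client must know each model's FIM template.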

@CISC commented May 29, 2024

You can use mine which does. :)

@PhilKes (Contributor, Author) commented May 29, 2024

That's nice, but we would need a working GGUF for every model we support. The list is quite long (see HuggingFaceModel), uses different FIM prompt templates (see InfillPromptTemplate), and is continuously growing as new models appear. We want the same solution for every existing and new model, so /infill is not an option for us at the moment.
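The per-model template registry described above might look something like this in outline. This is a hedged sketch loosely mirroring the roles of HuggingFaceModel and InfillPromptTemplate; the model ids and template strings are illustrative, not the project's actual values (though the CodeLlama `<PRE>/<SUF>/<MID>` format is the one that model family documents).

```python
# Sketch: a client-side registry of per-model FIM prompt templates.
# Template strings are illustrative examples, not CodeGPT's real data.

FIM_TEMPLATES = {
    # model id -> template with {prefix}/{suffix} placeholders
    "codeqwen-1.5": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
    "codellama": "<PRE> {prefix} <SUF>{suffix} <MID>",
    "stablecode": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
}

def render_fim_prompt(model_id: str, prefix: str, suffix: str) -> str:
    """Look up the model's FIM template and fill in the cursor context."""
    try:
        template = FIM_TEMPLATES[model_id]
    except KeyError:
        raise ValueError(f"no FIM template registered for {model_id!r}")
    return template.format(prefix=prefix, suffix=suffix)

print(render_fim_prompt("codellama", "int add(int a, int b) { return ", "}"))
```

The maintenance cost is one dictionary entry per new model family, which is the trade-off being weighed against maintaining patched GGUF copies.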

@CISC commented May 29, 2024

In that case I think you will have to maintain your own copies, which shouldn't be that hard using gguf-new-metadata.py. I doubt anyone else is going to bother manually adding this metadata, and it's unlikely the conversion scripts will: they don't yet handle it properly even for the models they do add the tokens to (instruct/chat-tuned models can lose fill-in-middle capability even though they still have the tokens).
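The suggested workflow could be scripted along these lines. This is a hedged sketch only: the `--special-token` option and its `TYPE VALUE` shape are assumptions based on the linked llama.cpp PR, and the token strings are examples, so the exact flags should be verified against `gguf-new-metadata.py --help` before use.

```python
# Sketch: assembling a gguf-new-metadata.py invocation that stamps FIM
# special-token metadata onto a copy of a GGUF file. Flag names and
# token strings are assumptions; check the script's --help first.

FIM_TOKENS = {
    "prefix": "<fim_prefix>",
    "suffix": "<fim_suffix>",
    "middle": "<fim_middle>",
}

def build_command(src: str, dst: str, tokens: dict) -> list:
    """Build the argv list for one gguf-new-metadata.py run."""
    cmd = ["python", "gguf-new-metadata.py", src, dst]
    for kind, tok in tokens.items():
        cmd += ["--special-token", kind, tok]
    return cmd

cmd = build_command("model.gguf", "model-fim.gguf", FIM_TOKENS)
print(" ".join(cmd))
```

One such run per supported model is the maintenance burden being discussed; the reply below explains why the project preferred client-side templates instead.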

@PhilKes (Contributor, Author) commented May 29, 2024

> In that case I think you will have to maintain your own copies, which shouldn't be that hard though using gguf-new-metadata.py. [...]

Thanks for your suggestion, but it's easier for us to maintain the FIM templates in our project and use the /completion endpoint than to maintain our own GGUF copies.



3 participants