
Revert "fix: use /infill for llama.cpp code-completions (#513)" #533

Merged · 2 commits · May 8, 2024

Conversation

@PhilKes (Contributor) commented May 7, 2024

This reverts commit 8de72b3.

As discussed in #510, this reverts the switch from /completion to /infill.

…-llama-infill

# Conflicts:
#	src/main/kotlin/ee/carlrobert/codegpt/codecompletions/CodeCompletionRequestFactory.kt
@carlrobertoh carlrobertoh merged commit dcd0a3f into carlrobertoh:master May 8, 2024
2 checks passed
@CISC commented May 8, 2024

Hey, just curious why ggerganov/llama.cpp#7102 (comment) prompted this?

carlrobertoh pushed a commit that referenced this pull request May 13, 2024
@PhilKes (Contributor, Author) commented May 29, 2024

> Hey, just curious why ggerganov/llama.cpp#7102 (comment) prompted this?

We use GGUF models from HF (e.g. https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat-GGUF) which, as far as we know, do not include the special-token metadata from your PR (ggerganov/llama.cpp#7166). We therefore can't use them with /infill, and reverted to using /completion with FIM prompts.
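The approach described above can be sketched roughly like this: the client formats the FIM prompt itself and sends it to the generic /completion endpoint, rather than letting the server assemble it via /infill. This is a hedged illustration, not the project's actual code; the CodeQwen-style token names and the helper functions are assumptions chosen for the example.

```python
# Sketch: building a fill-in-the-middle (FIM) prompt client-side and
# sending it to llama.cpp's /completion endpoint instead of /infill.
# The CodeQwen-style tokens below are illustrative; each model family
# defines its own FIM tokens.

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before/after the cursor in CodeQwen-style FIM tokens."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

def build_completion_payload(prefix: str, suffix: str) -> dict:
    # Payload for POST /completion; "prompt", "n_predict" and "stop"
    # follow llama.cpp's server API.
    return {
        "prompt": build_fim_prompt(prefix, suffix),
        "n_predict": 128,
        "stop": ["<fim_prefix>", "<fim_suffix>", "<fim_middle>"],
    }

payload = build_completion_payload("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
print(payload["prompt"])
```

With this scheme the server needs no model-specific token metadata; the trade-off, as discussed below, is that the client must know each model's FIM template.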

@CISC commented May 29, 2024

You can use mine which does. :)

@PhilKes (Contributor, Author) commented May 29, 2024

That's nice, but we would need a working GGUF for every model we support. The list is quite long (see HuggingFaceModel), uses different FIM prompt templates (see InfillPromptTemplate), and is continuously growing as new models appear. We want the same solution for every existing and new model, so /infill is not an option for us at the moment.
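The per-model template registry described above might look something like this in outline. This is a hedged sketch loosely mirroring the roles of HuggingFaceModel and InfillPromptTemplate; the model ids and template strings are illustrative, not the project's actual values (though the CodeLlama `<PRE>/<SUF>/<MID>` format is the one that model family documents).

```python
# Sketch: a client-side registry of per-model FIM prompt templates.
# Template strings are illustrative examples, not CodeGPT's real data.

FIM_TEMPLATES = {
    # model id -> template with {prefix}/{suffix} placeholders
    "codeqwen-1.5": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
    "codellama": "<PRE> {prefix} <SUF>{suffix} <MID>",
    "stablecode": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
}

def render_fim_prompt(model_id: str, prefix: str, suffix: str) -> str:
    """Look up the model's FIM template and fill in the cursor context."""
    try:
        template = FIM_TEMPLATES[model_id]
    except KeyError:
        raise ValueError(f"no FIM template registered for {model_id!r}")
    return template.format(prefix=prefix, suffix=suffix)

print(render_fim_prompt("codellama", "int add(int a, int b) { return ", "}"))
```

The maintenance cost is one dictionary entry per new model family, which is the trade-off being weighed against maintaining patched GGUF copies.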

@CISC commented May 29, 2024

In that case I think you will have to maintain your own copies, which shouldn't be that hard using gguf-new-metadata.py. I doubt anyone else is going to bother manually adding this metadata, and it's unlikely the conversion scripts will: they don't yet handle it properly even for the models they do add the tokens to (instruct/chat-tuned models can lose fill-in-middle capability even though they still have the tokens).
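The suggested workflow could be scripted along these lines. This is a hedged sketch only: the `--special-token` option and its `TYPE VALUE` shape are assumptions based on the linked llama.cpp PR, and the token strings are examples, so the exact flags should be verified against `gguf-new-metadata.py --help` before use.

```python
# Sketch: assembling a gguf-new-metadata.py invocation that stamps FIM
# special-token metadata onto a copy of a GGUF file. Flag names and
# token strings are assumptions; check the script's --help first.

FIM_TOKENS = {
    "prefix": "<fim_prefix>",
    "suffix": "<fim_suffix>",
    "middle": "<fim_middle>",
}

def build_command(src: str, dst: str, tokens: dict) -> list:
    """Build the argv list for one gguf-new-metadata.py run."""
    cmd = ["python", "gguf-new-metadata.py", src, dst]
    for kind, tok in tokens.items():
        cmd += ["--special-token", kind, tok]
    return cmd

cmd = build_command("model.gguf", "model-fim.gguf", FIM_TOKENS)
print(" ".join(cmd))
```

One such run per supported model is the maintenance burden being discussed; the reply below explains why the project preferred client-side templates instead.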

@PhilKes (Contributor, Author) commented May 29, 2024

> In that case I think you will have to maintain your own copies, which shouldn't be that hard though using gguf-new-metadata.py. [...]

Thanks for your suggestion, but it's easier for us to maintain the FIM templates in our project and use the /completion endpoint than to maintain our own GGUF copies.



3 participants