
server : update prompt on slot restore #9800

Merged: 1 commit into gg/infill-0 from gg/server-fix-slot-restore on Oct 11, 2024

Conversation

ggerganov (Member)

ref #9781

The slot.prompt field was not being updated after restoring a slot's state from a file. Since we don't store the original JSON representation of the prompt, we simply set it to a descriptive message so the reported state of the slot is not misleading.
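
A minimal sketch of this approach, as an editor's illustration only: the server_slot struct below is a simplified stand-in for the server's real slot state, and the placeholder text is hypothetical rather than the actual string used by the patch.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using llama_token = int32_t; // mirrors the typedef in llama.h

// Simplified stand-in for the server's per-slot state.
struct server_slot {
    std::string              prompt;       // prompt as reported in the slot data
    std::vector<llama_token> cache_tokens; // tokens currently held in the KV cache
};

// After restoring tokens from a file, the old prompt string is stale:
// replace it with a descriptive placeholder instead of leaving it unchanged.
void on_slot_restore(server_slot & slot, std::vector<llama_token> restored) {
    slot.cache_tokens = std::move(restored);
    slot.prompt = "[restored " + std::to_string(slot.cache_tokens.size()) +
                  " tokens from file]"; // placeholder wording is illustrative
}
```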

Another option might be to detokenize the restored tokens. Not sure if it's worth it.

ngxson (Collaborator) commented Oct 9, 2024

Another option might be to detokenize the restored tokens. Not sure if it's worth it.

I don't have a strong preference either. IMO the prompt should not be present in the returned slot data at all, as it is an intermediate variable. The real information to expose is the array of tokens in the cache.

In any case, the current slot save/restore API is quite low-level; I think we should communicate this in the docs so that users don't expect it to be a production-ready feature. I'm looking forward to reorganizing this feature to match the Claude prompt caching API, which should be more intuitive for end users.
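
For reference, a sketch of the detokenization alternative mentioned above (editor's illustration; llama_token_to_piece is a real llama.cpp API, but its exact signature has changed across versions, so treat the call below as approximate):

```cpp
#include <cstdint>
#include <string>
#include <vector>

#include "llama.h"

// Rebuild a human-readable prompt string from restored tokens by converting
// each token back to its text piece and concatenating the results.
static std::string detokenize_restored(const llama_model * model,
                                       const std::vector<llama_token> & tokens) {
    std::string text;
    char buf[256];
    for (const llama_token tok : tokens) {
        // Signature as of late 2024: (model, token, buf, length, lstrip, special);
        // newer versions take a llama_vocab * instead of a llama_model *.
        const int32_t n = llama_token_to_piece(model, tok, buf, (int32_t) sizeof(buf),
                                               /*lstrip =*/ 0, /*special =*/ true);
        if (n > 0) {
            text.append(buf, n);
        }
    }
    return text;
}
```

The placeholder approach is what ultimately landed; as noted above, it was unclear whether detokenization was worth the extra effort.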

ggerganov merged commit 32da4a2 into gg/infill-0 on Oct 11, 2024
53 checks passed
ggerganov deleted the gg/server-fix-slot-restore branch on October 11, 2024 at 06:16
ggerganov added a commit that referenced this pull request Oct 12, 2024
* llama : improve infill support

ggml-ci

* llama : add more FIM token strings

ggml-ci

* server : update prompt on slot restore (#9800)

* gguf : deprecate old FIM token KVs
drollings pushed a commit to drollings/llama.cpp that referenced this pull request Oct 18, 2024
…9798)

dsx1986 pushed a commit to dsx1986/llama.cpp that referenced this pull request Oct 29, 2024
…9798)

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
…9798)

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
…9798)
