This repository has been archived by the owner on May 10, 2023. It is now read-only.

Add compatibility with new sampling algorithms in llama.cpp #219

Closed
kuvaus wants to merge 2 commits

Conversation

kuvaus commented Apr 30, 2023

Title: Add compatibility with new sampling algorithms in llama.cpp

Description: This pull request addresses issue #200 (comment) by adding compatibility with new sampling algorithms in llama.cpp.

Changes:

Implemented temperature sampling with a repetition penalty (together with top-k and top-p filtering) as a replacement for the previous llama_sample_top_p_top_k sampling call, as shown in the snippet below.

        // Temperature sampling with repetition_penalty
        // Penalize the last repeat_last_n tokens of the context window.
        llama_sample_repetition_penalty(
            d_ptr->ctx, &candidates_data,
            promptCtx.tokens.data() + promptCtx.n_ctx - promptCtx.repeat_last_n, promptCtx.repeat_last_n,
            promptCtx.repeat_penalty);
        // Keep only the top_k most likely tokens, then the smallest set whose
        // cumulative probability exceeds top_p, then apply the temperature.
        llama_sample_top_k(d_ptr->ctx, &candidates_data, promptCtx.top_k);
        llama_sample_top_p(d_ptr->ctx, &candidates_data, promptCtx.top_p);
        llama_sample_temperature(d_ptr->ctx, &candidates_data, promptCtx.temp);
        // Draw the next token from the adjusted distribution.
        llama_token id = llama_sample_token(d_ptr->ctx, &candidates_data);
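
For context (not part of the diff): the candidates_data array that these calls operate on is built from the model's raw logits. A minimal sketch of that setup, following the llama.cpp examples from the same period and assuming d_ptr->ctx is the same llama_context used above:

        // Sketch (assumption, not from this PR): populate the candidate list
        // from the raw logits before running the sampling calls above.
        // Requires <vector> and llama.h.
        const float *logits = llama_get_logits(d_ptr->ctx);
        const int n_vocab = llama_n_vocab(d_ptr->ctx);

        std::vector<llama_token_data> candidates;
        candidates.reserve(n_vocab);
        for (llama_token token_id = 0; token_id < n_vocab; token_id++) {
            // Each entry holds the token id, its raw logit, and a probability slot.
            candidates.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
        }

        llama_token_data_array candidates_data = { candidates.data(), candidates.size(), false };

The trailing false marks the array as not yet sorted, so the sampling functions sort it themselves when they need to.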

manyoso (Collaborator) commented Apr 30, 2023

I will look at this, but I will need to update the submodule at the same time; otherwise this will break. But this helps a ton! Thanks @kuvaus!
