Resolve #1047: LLM caching: Bug when sharing prefix in target #1048

Merged (1 commit) on Feb 19, 2024

Commits on Feb 19, 2024

  1. LLM caching: Bug when sharing prefix in target

    When caching labels, we assume that each encoded label ends
    with an EOS token. This is not the case for every tokenizer;
    the GPT2 tokenizer, for example, does not append one.
    
    Without the EOS token, labels with shared prefixes, e.g.
    '11' and '11111' (= '11' + '111'), both have cache entries
    for the shared prefix '11' but different total label lengths
    (in this case 1 vs. 2 tokens). As a result, when generating
    logits for label '11', there is a 'next' cache entry (for
    '111') but no label tokens left. The code only checks for the
    EOS token (which is not present), so we run into an index
    error.
    
    The fix, in this case, is to also check whether the label we
    want logits for has already been fully consumed (see the
    sketch after this commit entry).
    ottonemo committed Feb 19, 2024
    Commit 8c44e5f
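Below is a minimal, self-contained sketch of the scenario described in the commit message. The prefix-cache layout (a dict keyed by tuples of token ids), the function name `cached_logits_for_label`, and the token ids are illustrative assumptions rather than the project's actual implementation; only GPT2's EOS id (50256) and the '11' / '11111' example come from the commit message.

```python
# Sketch of the failure mode and the fix; cache layout, function name, and
# token ids are illustrative assumptions, not the project's actual code.

EOS_TOKEN_ID = 50256  # GPT2's EOS id; GPT2 does not append it when encoding


def cached_logits_for_label(label_ids, cache, eos_token_id=EOS_TOKEN_ID):
    """Collect cached logits for `label_ids`, one entry per label token.

    `cache` maps prefix tuples of token ids to the logits recorded after
    generating that prefix.
    """
    logits = []
    prefix = ()
    pos = 0
    # Keep walking as long as the cache holds a continuation of the prefix.
    while any(k[: len(prefix)] == prefix and len(k) > len(prefix) for k in cache):
        # The fix: also stop once the label itself is fully consumed.
        # Previously only the EOS check below ended the loop, so with a
        # tokenizer that appends no EOS (e.g. GPT2), label_ids[pos] raised
        # an IndexError when a longer cached label ('11111') shared this
        # label's prefix ('11').
        if pos >= len(label_ids):
            break
        token_id = label_ids[pos]
        if token_id == eos_token_id:
            break
        prefix = prefix + (token_id,)
        logits.append(cache[prefix])
        pos += 1
    return logits


# '11' encodes to one token and '11111' to two ('11' + '111'); the ids below
# are placeholders rather than real GPT2 token ids.
cache = {
    (101,): "logits after '11'",
    (101, 202): "logits after '11' + '111'",
}
print(cached_logits_for_label([101], cache))       # label '11'    -> 1 entry
print(cached_logits_for_label([101, 202], cache))  # label '11111' -> 2 entries
```

Without the `pos >= len(label_ids)` guard, the first call would index past the end of `[101]` because the cache still offers a continuation for the shared prefix and no EOS token is there to stop the walk.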