Beginning of response is cut out / missing #503

Closed
1 task done
onusai opened this issue Mar 22, 2023 · 4 comments
Labels
bug Something isn't working

Comments


onusai commented Mar 22, 2023

Describe the bug

I've tested several Alpaca models and I've noticed that the start of the response is almost always missing.

Might be similar to this issue: #300

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Load the Alpaca model https://huggingface.co/elinas/alpaca-30b-lora-int4/tree/main and generate using this prompt:

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Write a python function that adds two numbers

### Response:

Screenshot

[screenshot of the truncated response]

Logs

Output generated in 4.61 seconds (4.12 tokens/s, 19 tokens)

System Info

Windows 11, RTX3090
onusai added the bug label Mar 22, 2023

onusai commented Mar 22, 2023

After further testing, it appears that the issue occurs when max_new_tokens is >= 1992. The issue does not occur when max_new_tokens is less than 1992, but the last token of the input still gets duplicated.

oobabooga commented Mar 22, 2023

This is being caused by your prompt being truncated. Try reducing max_new_tokens.
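
For context, assuming LLaMA's usual 2048-token context window (an assumption, since the issue doesn't state the model's context size), the 1992 threshold brackets the example prompt's length at roughly 56-57 tokens:

```python
# Back-of-the-envelope check, assuming a 2048-token context window
# (an assumption; the issue does not state the model's context size).
context_size = 2048

# Truncation appears at max_new_tokens >= 1992 but not at 1991,
# so the prompt must need more than 2048 - 1992 = 56 tokens and
# at most 2048 - 1991 = 57 tokens.
print(context_size - 1992)  # 56
print(context_size - 1991)  # 57
```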


onusai commented Mar 22, 2023

Yep, that seems to fix it. Is there a reason this isn't calculated automatically? Something like max_new_tokens = context_size - len(prompt)?
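
For illustration, a minimal sketch of that suggested auto-calculation, assuming a Hugging Face-style tokenizer; the names auto_max_new_tokens, context_size, and prompt are placeholders, not variables from the webui itself:

```python
# Hypothetical sketch of auto-sizing max_new_tokens; the function and
# variable names are illustrative, not text-generation-webui's own code.
def auto_max_new_tokens(tokenizer, prompt: str, context_size: int = 2048) -> int:
    # Count the tokens the prompt occupies (len(prompt) in characters
    # would undercount, since one token is usually several characters),
    # then give the rest of the context window to generation.
    prompt_tokens = len(tokenizer.encode(prompt))
    return max(context_size - prompt_tokens, 0)
```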


oobabooga commented Mar 22, 2023

For generating long text (like a long novel) in several iterations (clicking on Continue multiple times), it is convenient to truncate the beginning of the text to make room for more tokens. This is related to #498 btw
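
A rough sketch of that left-truncation behavior, assuming token IDs as input and a 2048-token window; this is illustrative, not the webui's actual implementation:

```python
# Illustrative left-truncation, not text-generation-webui's actual code.
def truncate_prompt(input_ids: list[int], max_new_tokens: int,
                    context_size: int = 2048) -> list[int]:
    # Keep only the most recent tokens so that the prompt plus the
    # requested generation budget fits in the context window; the
    # oldest tokens are what get dropped when max_new_tokens is set
    # too high.
    budget = context_size - max_new_tokens
    return input_ids[-budget:] if budget > 0 else []
```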
