
(Anthropic LLMs) First two characters of a response are lost #233

Closed
jwr opened this issue Mar 6, 2024 · 11 comments
Labels
bug Something isn't working

Comments

jwr commented Mar 6, 2024

I'm trying out gptel with Anthropic's Claude, and it seems that with many (though curiously not all?) responses, the first two characters are lost somewhere.

For example, with an empty prompt in a conversation:

### 

 seems like you haven't provided any additional context or code. The previous two functions we discussed were:

Or when working in another buffer:

How do I update data in an atom in Clojure?

 Clojure, you can update the of an atom using the `swap! or `reset!` functions.. **Using `swap!`**:
The `swap!` function takes an atom and a function as arguments. It applies the function to the current value of the atom, and sets the atom's value to the result of the function. This is useful when you want to perform a transformation on current value of the atom.
[...]

Note how in the first example the word "It" is missing, and in the second one "In" is missing.

I'm using gptel 0d6264f in "GNU Emacs 29.1 (build 1, aarch64-apple-darwin21.6.0, Carbon Version 165 AppKit 2113.6) of 2023-08-10" (emacs-mac).

karthink (Owner) commented Mar 6, 2024

  1. Are you using gptel-mode/a dedicated chat buffer?
  2. Is this an issue only with the Claude model?
  3. If possible, could you try the previous commit? (If you installed gptel using straight/elpaca, this should be easy to do. Otherwise don't worry about it)

jwr (Author) commented Mar 7, 2024

  1. This happens with output to normal buffers as well as in a dedicated gptel chat buffer.
  2. I cannot reproduce this problem with eb088f2 and gpt-4-turbo-preview from OpenAI.
  3. I went back to eb088f2 and initially responses in buffers were not truncated, but then I tried the dedicated gptel chat buffer and got this:
### Are you familiar with Clojure?

, I'm familiar with Clojure. It's a modern Lisp dialect that runs on the Java Virtual Machine (JVM) and JavaScript engines. Clojure emphasizes functional programming, immutable data structures, and concurrency support through software transactional memory. It has a rich set of data structures and a focus on simplicity and consistency.

### 

But this time it seems that 3 characters ("Yes") are missing?

This is fairly consistent:

### Can you write Clojure code?

, I can write Clojure code. Here's a simple example that defines a function to calculate the factorial of a number:

jwr (Author) commented Mar 7, 2024

I was wrong: it's not just two characters at the front, but some characters missing in multiple places in the response.

I am looking at a gptel chat buffer where there are several places where 1-5 characters are missing (streaming chunk boundaries?), not just at the beginning, but in the middle of the response as well. I see this with Anthropic only, not with OpenAI or ollama.

(I am back to current head, or 0d6264f)

karthink (Owner) commented Mar 7, 2024

> I am looking at a gptel chat buffer where there are several places where 1-5 characters are missing (streaming chunk boundaries?), not just at the beginning, but in the middle of the response as well. I see this with Anthropic only, not with OpenAI or ollama.

Thank you for the thorough testing, this is very helpful. It's probably a bug in the parser for Anthropic API responses. Could you do one more thing to help me track it down?

  1. Run (setq gptel-log-level 'info)
  2. Use Claude, generate some responses with missing chunks.
  3. Paste the contents of the *gptel-log* buffer here. (Check to ensure that the buffer does not contain your API key. At the info log level it shouldn't, but please check anyway)
  4. Paste the chat here as well so I can compare the log and the responses as printed in your working buffer.

jwr (Author) commented Mar 7, 2024

Unfortunately, I currently can't: Anthropic locked me out when my $5 credit ran out, and even though I recharged the account and it shows a significant balance in the panel, their API responds with 400 errors. And their support says they'll respond after 5 days 🤣

Unless somebody else can help in the meantime, this will have to wait until they tie their shoelaces and get their act together; then I'll be able to get back to it!

solodov commented Mar 7, 2024

I have the same problem, here's my data.

ChatGPT buffer:

### is this on?

, I'm here and ready to assist you. How can I help?

### 

*gptel-log* buffer:

{
  "gptel": "request body",
  "timestamp": "2024-03-07 17:47:00"
}
{
  "model": "claude-3-opus-20240229",
  "messages": [
    {
      "role": "user",
      "content": "is this on?"
    }
  ],
  "system": "You are a large language model living in Emacs and a helpful assistant. Respond concisely.",
  "stream": true,
  "max_tokens": 1024,
  "temperature": 1.0
}
{
  "gptel": "response body",
  "timestamp": "2024-03-07 17:47:05"
}
event: message_start
data: {"type":"message_start","message":{"id":"msg_01CXFfSUxbDeyrqJkwPj1UnU","type":"message","role":"assistant","content":[],"model":"claude-3-opus-20240229","stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":33,"output_tokens":1}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Yes"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" I"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"'m"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" here"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" and"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" ready"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" to"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" assist"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" you"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"."}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" How"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" can"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" I"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" help"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"?"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":19}}

event: message_stop
data: {"type":"message_stop"}

The log does show the very first content chunk ("Yes") arriving from the API, yet it is dropped from the chat.
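For reference, the stream in that log can be reassembled mechanically. Here is a minimal Python sketch (not gptel's actual elisp parser) that concatenates the `text_delta` payloads; a parser that skips or mis-positions over the first `data:` line would drop exactly the leading "Yes" chunk missing from the chat:

```python
import json

def text_from_sse_log(raw: str) -> str:
    """Concatenate text_delta payloads from an Anthropic-style SSE log."""
    parts = []
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue
        payload = json.loads(line[len("data:"):])
        # Only content_block_delta events carry response text.
        if payload.get("type") == "content_block_delta":
            parts.append(payload["delta"]["text"])
    return "".join(parts)

# Abbreviated excerpt of the logged stream:
log = """\
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Yes"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" I"}}

event: message_stop
data: {"type":"message_stop"}
"""

print(text_from_sse_log(log))  # Yes, I
```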

karthink (Owner) commented Mar 7, 2024

@solodov Thank you.

It's strange, I've tried all the prompts suggested in this thread so far and I'm not able to reproduce the missing chunk problem. It works fine on the test case. I'm trying to guess the cause from staring at the parser code now.

karthink added a commit that referenced this issue Mar 8, 2024
gptel-anthropic.el (gptel-curl--parse-stream): Reset point
explicitly when parsing streaming responses returned by the
Anthropic API.  Try to address #233.
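The commit above resets the parser's position explicitly. As an illustration of why that matters (a Python sketch of the same idea, not gptel's elisp code): network chunks can split an SSE event anywhere, so an incremental reader has to track the offset of the first unparsed byte itself and rescan from there on every new chunk, never from wherever the previous scan happened to stop:

```python
import json

class SSEAccumulator:
    """Minimal incremental SSE reader (illustration only).

    self.pos is the offset of the first unparsed byte; feed() parses
    only complete events (terminated by a blank line) and leaves any
    partial tail in the buffer for the next chunk.
    """

    def __init__(self):
        self.buf = ""
        self.pos = 0          # offset of the first unparsed byte
        self.parts = []       # accumulated text_delta payloads

    def feed(self, chunk: str):
        self.buf += chunk
        while True:
            end = self.buf.find("\n\n", self.pos)  # events end with a blank line
            if end == -1:
                return        # wait for the rest of the event
            event = self.buf[self.pos:end]
            self.pos = end + 2
            for line in event.splitlines():
                if line.startswith("data:"):
                    payload = json.loads(line[len("data:"):])
                    if payload.get("type") == "content_block_delta":
                        self.parts.append(payload["delta"]["text"])

# Two chunks that split the second event mid-line:
acc = SSEAccumulator()
acc.feed('event: content_block_delta\n'
         'data: {"type":"content_block_delta","index":0,'
         '"delta":{"type":"text_delta","text":"Yes"}}\n\nevent: content_')
acc.feed('block_delta\ndata: {"type":"content_block_delta","index":0,'
         '"delta":{"type":"text_delta","text":","}}\n\n')
print("".join(acc.parts))  # Yes,
```

If the reader instead resumed from a stale or implicit position (the analogue of trusting point in an elisp buffer that other code may have moved), it could silently skip the bytes of the first event in a chunk, which matches the missing 1-5 character runs reported above.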
karthink (Owner) commented Mar 8, 2024

Since I'm not sure what's causing the parsing problem, I've attempted a fix based on my best guess. Please let me know if it makes a difference.

karthink changed the title from "First two characters of a response are lost" to "(Anthropic LLMs) First two characters of a response are lost" Mar 8, 2024
solodov commented Mar 8, 2024 via email

karthink (Owner) commented Mar 8, 2024

Okay. I'll wait until @jwr can access Claude again and check if the bug still persists before closing this issue.

karthink added the bug label Mar 12, 2024
jwr (Author) commented Mar 13, 2024

Sorry, it took a while for Anthropic to figure out that I do have a positive balance in my account after all.

I can now confirm that the bug is gone, and I get full responses from Anthropic models (tested with fbb0ee2).

Thank you!
