
Anthropic system message fix #11301

Merged 12 commits from bagatur/anthropic_system_msg_fix into master on Oct 4, 2023

Conversation

@baskaryan (Collaborator)

The Bedrock Anthropic API enforces that Human and Assistant messages must alternate (the same type cannot appear twice in a row). We currently treat SystemMessages as human messages when converting a message list to a string prompt, so our validation in Bedrock/BedrockChat raises an error when this happens. For ChatAnthropic we don't validate this, so no error is raised, but the behavior is likely still suboptimal.

Would love input from folks more familiar with Anthropic's intended usage and the Bedrock API.
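For context, here is a minimal sketch (mine, not the code changed in this PR) of why rendering a SystemMessage like a human turn collides with that alternation rule:

from langchain.schema import AIMessage, HumanMessage, SystemMessage

HUMAN_PROMPT = "\n\nHuman:"
AI_PROMPT = "\n\nAssistant:"


def messages_to_prompt_treating_system_as_human(messages):
    # Illustrative only: mirrors the problematic behavior described above,
    # where a SystemMessage is rendered like a human turn.
    parts = []
    for m in messages:
        if isinstance(m, (SystemMessage, HumanMessage)):
            parts.append(f"{HUMAN_PROMPT} {m.content}")
        elif isinstance(m, AIMessage):
            parts.append(f"{AI_PROMPT} {m.content}")
    return "".join(parts) + AI_PROMPT


# A system message followed by a human message produces two consecutive
# "\n\nHuman:" turns, which Bedrock's alternation check rejects.
print(repr(messages_to_prompt_treating_system_as_human(
    [SystemMessage(content="You are an assistant."), HumanMessage(content="Hi")]
)))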

@baskaryan changed the title from "RFC: Anthropic system message fix" to "Anthropic system message fix" on Oct 2, 2023
@baskaryan marked this pull request as ready for review on October 2, 2023 18:42
@3coins (Contributor) commented Oct 2, 2023

@baskaryan
Thanks for working on this fix. Looking at the Bedrock playground, it tends to send the system message as Instructions: <system_prompt>, rather than the tags you have specified here. Here is an example:

Instructions: You are an assistant.
The following is a friendly conversation between you and a human.


Human: What is the capitol of France?

Assistant:

And here is with the contextual history.

  What is the capitol of France?
 The capital of France is Paris.
Instructions: You are an assistant.
The following is a friendly conversation between you and a human.


Human: How about Mexico?

Assistant:
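For illustration, a rough helper (my own naming, not a Bedrock API) that renders a prompt in that playground style, following the first example above:

def render_playground_style(system_prompt: str, user_input: str) -> str:
    # "Instructions:" carries the system prompt, followed by the Human turn
    # and an empty Assistant turn for the model to complete.
    return (
        f"Instructions: {system_prompt}\n\n\n"
        f"Human: {user_input}\n\n"
        f"Assistant:"
    )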

@3coins (Contributor) commented Oct 2, 2023

This tends to work in the Bedrock playground too; I haven't tried it via the API yet.

<admin>You are a friendly assistant</admin>

Human: How do decorators work in python?

Assistant:

@dosubot added the Ɑ: models (Related to LLMs or chat model modules) and 🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature) labels on Oct 2, 2023
@3coins (Contributor) commented Oct 3, 2023

@baskaryan
Also, Bedrock Anthropic only seems to allow Human/Assistant roles; anything else errors out at the moment. This means that the chat_model, which formats the history with AI, breaks any chats with memory/history. Is there a way to get this working?

@baskaryan (Collaborator, Author)

@baskaryan
Also, Bedrock Anthropic only seems to allow Human/Assistant roles; anything else errors out at the moment. This means that the chat_model, which formats the history with AI, breaks any chats with memory/history. Is there a way to get this working?

Where do you see it formats with AI? I believe everything Anthropic-related uses the formatting function being edited here (which uses Assistant):

[
    SystemMessage(content="You're an assistant"),
    AIMessage(content="Answer:"),
],
"You're an assistant\n\nAssistant: Answer:",


I believe this test case would error out if you passed it to our API because there's no \n\nHuman:
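For comparison, a hypothetical variant of that case (not in the test file) that includes a Human turn and, if I'm reading the new formatting right, would satisfy the alternation rule:

[
    SystemMessage(content="You're an assistant"),
    HumanMessage(content="What is 1 + 1?"),
    AIMessage(content="Answer:"),
],
"You're an assistant\n\nHuman: What is 1 + 1?\n\nAssistant: Answer:",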

@3coins (Contributor) commented Oct 3, 2023

Where do you see it formats with AI? I believe everything Anthropic-related uses the formatting function being edited here (which uses Assistant).

For Bedrock Anthropic, which doesn't inherit from Anthropic, the following example will produce AI:, even if you explicitly use an Assistant role in the chat prompt template. I had to use a custom placeholder template to convert messages to Assistant:.

# Imports added for completeness (HistoryPlaceholderTemplate is a custom
# placeholder template from Jupyter AI, not a LangChain class).
from langchain.chains import ConversationChain
from langchain.chat_models import BedrockChat
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate
from langchain.schema import ChatMessage

SYSTEM_PROMPT = "You are a friendly assistant. This is a conversation between a human and an assistant."
prompt_template = ChatPromptTemplate.from_messages(
    [
        ChatMessage(
            role="Instructions",
            content=SYSTEM_PROMPT
        ),
        HistoryPlaceholderTemplate(variable_name="history"),
        HumanMessagePromptTemplate.from_template("{input}"),
        ChatMessage(role="Assistant", content=""),
    ]
)
llm = BedrockChat(model_id="")  # model id elided in the original
memory = ConversationBufferWindowMemory(return_messages=True, k=2)
llm_chain = ConversationChain(
    llm=llm, prompt=prompt_template, memory=memory, verbose=True
)

See my PR here to fix this in Jupyter AI. Ideally, we will want this function from Anthropic in BedrockChat, which formats the AI messages correctly:
https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chat_models/anthropic.py#L43

@baskaryan (Collaborator, Author)

Where do you see it formats with AI? I believe everything Anthropic-related uses the formatting function being edited here (which uses Assistant).

For Bedrock Anthropic, which doesn't inherit from Anthropic, the following example will produce AI:, even if you explicitly use an Assistant role in the chat prompt template. I had to use a custom placeholder template to convert messages to Assistant:.

See my PR here to fix this in Jupyter AI. Ideally, we will want this function from Anthropic in BedrockChat, which formats the AI messages correctly: https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chat_models/anthropic.py#L43

BedrockChat does use that function:

prompt = convert_messages_to_prompt_anthropic(messages=messages)

Looking at the trace from your example, it seems "Assistant" is being appended twice in a row at the end. Taking a closer look now.
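As an aside, one way to eyeball the string that formatter produces (assuming the import path from the file linked above) is to call it directly on a message list:

from langchain.chat_models.anthropic import convert_messages_to_prompt_anthropic
from langchain.schema import AIMessage, HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You are a friendly assistant."),
    HumanMessage(content="1 + 1 ="),
    AIMessage(content="2"),
    HumanMessage(content="+ 3 ="),
]
# Prints the raw prompt string the chat model would hand to Bedrock/Anthropic.
print(repr(convert_messages_to_prompt_anthropic(messages=messages)))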

@baskaryan (Collaborator, Author)

Looking at the trace from your example, it seems "Assistant" is being appended twice in a row at the end. Taking a closer look now.

Ah, the error is because we use a ChatMessage at the end; it should be an AIMessage. But really there's no need to append either at the end anyway, since that's handled for you by the Anthropic prompt formatting logic.
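In other words, a sketch of the corrected template from the earlier example (still using the custom HistoryPlaceholderTemplate from Jupyter AI; not code from this PR):

prompt_template = ChatPromptTemplate.from_messages(
    [
        ChatMessage(role="Instructions", content=SYSTEM_PROMPT),
        HistoryPlaceholderTemplate(variable_name="history"),
        HumanMessagePromptTemplate.from_template("{input}"),
        # No trailing ChatMessage/AIMessage needed; the Anthropic prompt
        # formatting appends the final "\n\nAssistant:" for you.
    ]
)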

@rsgrewal-aws

Could we throw a warning here instead of an error?

raise ValueError(ALTERNATION_ERROR)
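For illustration, a rough sketch (my own, not what this PR does) of a warn-by-default variant of that check:

import re
import warnings

ALTERNATION_ERROR = (
    "Error: Prompt must alternate between '\n\nHuman:' and '\n\nAssistant:'."
)


def check_alternation(prompt: str, strict: bool = False) -> None:
    # Hypothetical helper, not the existing code in llms/bedrock.py: collect
    # the role markers in order and flag two identical markers in a row.
    roles = re.findall(r"\n\nHuman:|\n\nAssistant:", prompt)
    for prev, curr in zip(roles, roles[1:]):
        if prev == curr:
            if strict:
                raise ValueError(ALTERNATION_ERROR)
            warnings.warn(ALTERNATION_ERROR)
            return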

@3coins (Contributor) commented Oct 3, 2023

@baskaryan
I checked out this branch, and tested the example I had added before (I removed the ChatMessage at the end). Here is the log output.


> Entering new ConversationChain chain...
Prompt after formatting:
System: You are a friendly assistant. This is a conversation between a human and an assistant.
Human: 1 + 1 =

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
System: You are a friendly assistant. This is a conversation between a human and an assistant.
Human: 1 + 1 =
AI:  2
Human: + 3 =

It seems to work with Bedrock Anthropic, but I am not able to see the raw messages (LangSmith also shows the formatted messages). Is there a way to print out exactly what is sent to the LLM?

@3coins (Contributor) commented Oct 3, 2023

The setup errors out with the regular Bedrock LLM.

from langchain.chains import ConversationChain
from langchain.chat_models import BedrockChat
from langchain.llms import Bedrock
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, MessagesPlaceholder, HumanMessagePromptTemplate
from langchain.schema import ChatMessage

SYSTEM_PROMPT = "You are a friendly assistant. This is a conversation between a human and an assistant."


def create_chain():
    prompt_template = ChatPromptTemplate.from_messages(
        [
            SystemMessagePromptTemplate.from_template(SYSTEM_PROMPT),
            MessagesPlaceholder(variable_name="history"),
            HumanMessagePromptTemplate.from_template("{input}")
        ]
    )
    #llm = BedrockChat(model_id="anthropic.claude-v1", region_name="us-west-2")
    llm = Bedrock(model_id="anthropic.claude-v1", region_name="us-west-2")
    memory = ConversationBufferWindowMemory(return_messages=True, k=2)
    llm_chain = ConversationChain(llm=llm, prompt=prompt_template, memory=memory, verbose=True)

    return llm_chain

if __name__ == "__main__":
    chain = create_chain()
    chain.predict(input="1 + 1 =")
    chain.predict(input="+ 3 =")

Output

(langchain-py3.9) (base) ➜  test-bedrock-anthropic python test_system_fix.py


> Entering new ConversationChain chain...
Prompt after formatting:
System: You are a friendly assistant. This is a conversation between a human and an assistant.
Human: 1 + 1 =

> Finished chain.


> Entering new ConversationChain chain...
Prompt after formatting:
System: You are a friendly assistant. This is a conversation between a human and an assistant.
Human: 1 + 1 =
AI:  2
Human: + 3 =
Traceback (most recent call last):
  File "/Users/pijain/projects/test-bedrock-anthropic/test_system_fix.py", line 34, in <module>
    chain.predict(input="+ 3 =")
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/chains/llm.py", line 257, in predict
    return self(kwargs, callbacks=callbacks)[self.output_key]
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/chains/base.py", line 311, in __call__
    raise e
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/chains/base.py", line 305, in __call__
    self._call(inputs, run_manager=run_manager)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/chains/llm.py", line 93, in _call
    response = self.generate([inputs], run_manager=run_manager)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/chains/llm.py", line 103, in generate
    return self.llm.generate_prompt(
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/base.py", line 509, in generate_prompt
    return self.generate(prompt_strings, stop=stop, callbacks=callbacks, **kwargs)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/base.py", line 658, in generate
    output = self._generate_helper(
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/base.py", line 546, in _generate_helper
    raise e
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/base.py", line 533, in _generate_helper
    self._generate(
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/base.py", line 1053, in _generate
    self._call(prompt, stop=stop, run_manager=run_manager, **kwargs)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/bedrock.py", line 383, in _call
    return self._prepare_input_and_invoke(prompt=prompt, stop=stop, **kwargs)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/bedrock.py", line 225, in _prepare_input_and_invoke
    input_body = LLMInputOutputAdapter.prepare_input(provider, prompt, params)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/bedrock.py", line 76, in prepare_input
    input_body["prompt"] = _human_assistant_format(prompt)
  File "/Users/pijain/projects/langchain-dev/langchain/libs/langchain/langchain/llms/bedrock.py", line 45, in _human_assistant_format
    raise ValueError(ALTERNATION_ERROR)
ValueError: Error: Prompt must alternate between '

Human:' and '

Assistant:'.

@baskaryan (Collaborator, Author)

The setup errors out with the regular Bedrock LLM.

I think this happens because of the mix of a ChatPromptTemplate with an LLM. If you use BedrockChat (which is probably the way Claude should be used) it works, or using a PromptTemplate should work too (see below).

# Imports added for completeness; assumes `aws_client` is a pre-configured
# boto3 Bedrock client and SYSTEM_PROMPT is defined as in the example above.
from langchain.chains import ConversationChain
from langchain.llms import Bedrock
from langchain.memory import ConversationBufferWindowMemory
from langchain.prompts import PromptTemplate


def create_chain():
    prompt_template = PromptTemplate.from_template(
        SYSTEM_PROMPT + "\n\n{history}\n\n{input}"
    )
    llm = Bedrock(model_id="anthropic.claude-v1", client=aws_client)
    memory = ConversationBufferWindowMemory(k=2, ai_prefix="Assistant")
    llm_chain = ConversationChain(llm=llm, prompt=prompt_template, memory=memory, verbose=True)

    return llm_chain

We should update ChatPromptTemplate to make the AIMessage prefix configurable when a chat prompt is converted to a string, but that's probably a separate PR.

@baskaryan (Collaborator, Author)

It seems to work with Bedrock Anthropic, but I am not able to see the raw messages (LangSmith also shows the formatted messages). Is there a way to print out exactly what is sent to the LLM?

No easy way to see the raw string just yet, but we're working to add that (making it easy to see the final request made to an LLM)!

@baskaryan (Collaborator, Author)

Could we throw a warning here instead of an error?

raise ValueError(ALTERNATION_ERROR)

I like that idea; generally I think we should let the integration itself raise validation errors. But that's maybe also a bigger discussion for a separate PR. Thoughts, @zack-anthropic @hwchase17?

@baskaryan merged commit b499de2 into master on Oct 4, 2023
32 checks passed
@baskaryan deleted the bagatur/anthropic_system_msg_fix branch on October 4, 2023 15:32
@rsgrewal-aws

The setup errors out with the regular Bedrock LLM.

I think this happens because of the mix of a ChatPromptTemplate with an LLM. If you use BedrockChat (which is probably the way Claude should be used) it works, or using a PromptTemplate should work too (see below).

We should update ChatPromptTemplate to make the AIMessage prefix configurable when a chat prompt is converted to a string, but that's probably a separate PR.

What would be the impact of this on end users? Do we always need to add these parameters in the prompt template?

@christopherwerner commented Feb 10, 2024

It seems to work with Bedrock Anthropic, but I am not able to see the raw messages (LangSmith also shows the formatted messages). Is there a way to print out exactly what is sent to the LLM?

No easy way to see the raw string just yet, but we're working to add that (making it easy to see the final request made to an LLM)!

Is there a PR where this is happening? I was debugging this with the StdOutCallbackHandler and was misled into thinking that prompts with "System:" were being sent to Claude v2 via Bedrock until I found this thread and started putting breakpoints in the client call.

The CallbackManager is passed right down to the place where the boto3 client is called with the final prompt in llms/bedrock.py; would it make sense to emit text with "Final Prompt: " on the callback handler?
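For anyone wanting something like that today, a sketch of a custom callback handler along those lines (the handler name is mine; on_llm_start receives the formatted prompt strings, which is still before the provider-specific massaging in llms/bedrock.py, so it won't show the exact body sent to boto3):

from typing import Any, Dict, List

from langchain.callbacks.base import BaseCallbackHandler


class PromptLoggingHandler(BaseCallbackHandler):
    """Hypothetical handler: print the prompt strings handed to the LLM."""

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        for p in prompts:
            print(f"Final prompt (pre-Bedrock adaptation): {p!r}")


# Usage (hypothetical):
# chain.predict(input="1 + 1 =", callbacks=[PromptLoggingHandler()])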
