How to use RunnableWithMessageHistory in a RAG pipeline #16582
-
Example Code

```python
def get_chain(vectorstore: Pinecone, stream_handler) -> RunnableParallel:
    streaming_llm = ChatOpenAI(
        model="gpt-4",
        streaming=True,
        callbacks=[stream_handler],
        verbose=True,
        temperature=0,
        openai_api_key=OPENAI_API_KEY,
    )
    # RAG prompt
    template = (
        variables["promptTemplate"]
        + """
        Context: {context}
        Question: {question}
        History: {history}
        """
    )
    prompt = ChatPromptTemplate.from_template(template)
    retriever = vectorstore.as_retriever(search_kwargs={"k": variables["numberOfKwargs"]})
    # Re-rank
    compressor = CohereRerank(cohere_api_key=COHERE_API_KEY, top_n=variables["numberOfReranks"])
    compression_retriever = ContextualCompressionRetriever(
        base_compressor=compressor, base_retriever=retriever
    )
    # RAG chain with source metadata
    testChain = (
        {
            "context": compression_retriever,
            "question": itemgetter("question"),
            "history": ???  # <- this is what I don't know how to fill in
        }
        | prompt
        | streaming_llm
        | StrOutputParser()
    )
    qa2 = RunnableWithMessageHistory(
        testChain,
        lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),
        input_messages_key="question",
        history_messages_key="history",
    )

    class Question(BaseModel):
        __root__: str

    qa2 = qa2.with_types(input_type=Question)
    print("qa", qa2)
    return qa2
```

Description

I've created a function that builds a chain. I've been using it without memory for some time, and it's been working great. Now I'm trying to add memory to it using Redis, following the examples in the LangChain docs. The example there only uses the memory, and testing that on its own works fine. However, I need to maintain my context as well. If I try to add context to `ChatPromptTemplate.from_messages`, it complains that context is not one of the accepted inputs. And if I try to add history to `ChatPromptTemplate.from_template`, something (I'm at a loss as to what) fails with `TypeError: expected string or buffer`. I think I'm just passing the history object in the wrong way in the `testChain` object, but I barely even know if I'm on the right track.

System Info

langchain==0.1.3
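A minimal sketch of the direction the replies below converge on, assuming the question's variable names (`compression_retriever`, `streaming_llm`, etc.): build the prompt with `from_messages` and a `MessagesPlaceholder` for the history rather than a `{history}` slot in a plain string template, since the injected history is a list of messages, not a string. The system message text here is illustrative.

```python
from operator import itemgetter
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# history arrives as a list of BaseMessages, so it needs a MessagesPlaceholder,
# not a string slot in from_template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Use this context to answer.\n\nContext: {context}"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

testChain = (
    {
        "context": itemgetter("question") | compression_retriever,
        "question": itemgetter("question"),
        "history": itemgetter("history"),
    }
    | prompt
    | streaming_llm
    | StrOutputParser()
)
```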
Replies: 8 comments 12 replies
-
🤖 From your code and description, it seems like you're trying to incorporate both context and chat history into your RAG pipeline. The issue you're facing is likely due to the way you're passing the history object in `testChain`.

In LangChain, `RunnableWithMessageHistory` wraps another runnable and manages its chat message history: it loads the stored messages before invoking the wrapped runnable and saves the new exchange afterwards. When you're creating your `RunnableWithMessageHistory`, the `input_messages_key` and `history_messages_key` arguments tell it which keys of the input dictionary hold the question and the history. However, in your `testChain`, the `"history"` slot is never populated from the input, so the history that gets injected never reaches the prompt.

To fix this, you need to ensure that the input messages are being passed through, for example with `itemgetter("history")`:

```python
testChain = (
    {
        "context": compression_retriever,
        "question": itemgetter("question"),
        "history": itemgetter("history"),
    }
    | prompt
    | streaming_llm
    | StrOutputParser()
)
```

Here, `itemgetter("history")` pulls out the history that `RunnableWithMessageHistory` injects into the input dictionary under the `history_messages_key`. I hope this helps! If you're still having issues, could you provide more information about the input you're passing to the chain?
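As a hedged usage sketch (the question text and session id here are made up), the wrapped chain from the question would then be invoked with the question in a dict and the session in the config:

```python
# RunnableWithMessageHistory loads the Redis history for this session_id,
# injects it under "history", and appends the new exchange afterwards
qa2 = get_chain(vectorstore, stream_handler)
answer = qa2.invoke(
    {"question": "What does the context say about pricing?"},
    config={"configurable": {"session_id": "user-123"}},
)
```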
-
@ChristianEvc Here's a reference implementation. This code is broken down into steps to make it easier to invoke each step and see the inputs/outputs. You can always debug by sprinkling a pass-through printer through the LCEL pipeline:

```python
def print_me(inputs):
    print(inputs)
    return inputs
```

The code contains an in-memory implementation of chat history to make it easier to run the code without any complex setup and understand what's happening under the hood.

```python
from operator import itemgetter
from typing import List

from langchain_openai.chat_models import ChatOpenAI
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.documents import Document
from langchain_core.messages import BaseMessage, AIMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.runnables import (
    RunnableLambda,
    ConfigurableFieldSpec,
    RunnablePassthrough,
)
from langchain_core.runnables.history import RunnableWithMessageHistory


class InMemoryHistory(BaseChatMessageHistory, BaseModel):
    """In memory implementation of chat message history."""

    messages: List[BaseMessage] = Field(default_factory=list)

    def add_message(self, message: BaseMessage) -> None:
        """Add a self-created message to the store."""
        self.messages.append(message)

    def clear(self) -> None:
        self.messages = []


store = {}


def get_session_history(user_id: str, conversation_id: str) -> BaseChatMessageHistory:
    if (user_id, conversation_id) not in store:
        store[(user_id, conversation_id)] = InMemoryHistory()
    return store[(user_id, conversation_id)]


history = get_session_history("1", "1")
history.add_message(AIMessage(content="hello"))
print(store)


def fake_retriever(query):
    assert isinstance(query, str)
    return [
        Document(page_content="cats are the answer"),
        Document(page_content="CAT POWERS"),
    ]


fake_retriever = RunnableLambda(fake_retriever)
print(fake_retriever.invoke("hello"))

model = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're an assistant who's good at {ability}. Here is some {context}",
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


context = itemgetter("question") | fake_retriever | format_docs
first_step = RunnablePassthrough.assign(context=context)
chain = first_step | prompt | model

with_message_history = RunnableWithMessageHistory(
    chain,
    get_session_history=get_session_history,
    input_messages_key="question",
    history_messages_key="history",
    history_factory_config=[
        ConfigurableFieldSpec(
            id="user_id",
            annotation=str,
            name="User ID",
            description="Unique identifier for the user.",
            default="",
            is_shared=True,
        ),
        ConfigurableFieldSpec(
            id="conversation_id",
            annotation=str,
            name="Conversation ID",
            description="Unique identifier for the conversation.",
            default="",
            is_shared=True,
        ),
    ],
)

print(context.invoke({"question": "hello", "other_stuff": "ignored"}))
print(first_step.invoke({"question": "hello", "history": "boom"}))
print(chain.invoke({"question": "hello", "history": [], "ability": "math"}))
print(
    with_message_history.invoke(
        {"ability": "math", "question": "What does cosine mean?"},
        config={
            "configurable": {"user_id": "user_id", "conversation_id": "conversation_id"}
        },
    )
)
print(store)
```
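For example, the `print_me` helper above can be spliced between any two steps of the pipeline. A debugging sketch, assuming the names from the reference implementation; it just echoes and forwards its input:

```python
# print whatever flows out of first_step, then continue unchanged
debug_chain = first_step | RunnableLambda(print_me) | prompt | model
debug_chain.invoke({"question": "hello", "history": [], "ability": "math"})
```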
-
@eyurtsev Thanks for this amazing code. Is it possible to use it with Agents?
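A hedged sketch of one way this might look with agents, following the standard `create_openai_functions_agent` pattern: an `AgentExecutor` is itself a Runnable, so it wraps the same way as a chain. Here `tools` is assumed to be your list of Tool objects, and the in-memory store is illustrative.

```python
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

# the agent prompt needs a history placeholder plus the agent scratchpad
agent_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant."),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
agent = create_openai_functions_agent(ChatOpenAI(), tools, agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

store = {}

def get_session_history(session_id: str) -> ChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]

# wrap the executor exactly like a chain; history is injected under "chat_history"
agent_with_history = RunnableWithMessageHistory(
    agent_executor,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

agent_with_history.invoke(
    {"input": "hello"},
    config={"configurable": {"session_id": "demo"}},
)
```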
-
I can run the code with MongoDBChatMessageHistory. This is the example:

```python
from operator import itemgetter

from langchain_community.chat_message_histories import MongoDBChatMessageHistory
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI


def fake_retriever(query):
    assert isinstance(query, str)
    return [
        Document(page_content="Artificial intelligence is developing rapidly"),
        Document(page_content="Machine learning deserves much of the credit"),
    ]


fake_retriever = RunnableLambda(fake_retriever)

model = ChatOpenAI()
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You're an assistant who's good at something. Here is some {context}",
        ),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


context = itemgetter("question") | fake_retriever | format_docs
first_step = RunnablePassthrough.assign(context=context)
chain = first_step | prompt | ChatOpenAI()

with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: MongoDBChatMessageHistory(
        session_id=session_id,
        connection_string="mongodb://root:password@your_ip:27017",
        database_name="my_dbxxx",
        collection_name="tttttt",
    ),
    input_messages_key="question",
    history_messages_key="history",
)

if __name__ == '__main__':
    config = {"configurable": {"session_id": "test_session_id"}}
    result = with_message_history.invoke(
        {"question": "Please generate a title for the content"}, config
    )
    print(result)
```
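A quick hedged follow-up sketch, not in the original reply: invoking again with the same `session_id` should load the prior exchange from MongoDB into `history`, which is an easy way to verify persistence.

```python
# second turn in the same session; the previous question/answer pair is
# read back from MongoDB and injected via the "history" placeholder
follow_up = with_message_history.invoke({"question": "Make the title shorter"}, config)
print(follow_up)
```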
-
Hi everybody, I have a follow-up question for roughly the same scenario, however with one small difference: I would also like to add the retrieved context to the history. In the examples here, this is not happening, because only the raw question (via `input_messages_key`) is recorded, not the full prompt containing the context.

Some minimal code to get the idea:

```python
placeholder_prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="history"),
    MessagesPlaceholder(variable_name="input"),
])

llm = <SOME_LLM>

chain = placeholder_prompt | llm

chain_with_history = RunnableWithMessageHistory(
    chain,
    <SOME_SESSION_STORE>,
    input_messages_key="input",
    history_messages_key="history",
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Context: {context}"),
    ("human", "{question}"),
])

complete_chain_with_history = {"input": prompt} | chain_with_history
# That's where the type mismatch happens.
# How can I extract the messages from the ChatPromptTemplate?
# Can I replace the ChatPromptTemplate with something better?
```

EDIT: I think I found a way using a …

EDIT: Another (better) solution I found was to use a RunnableLambda to extract the list of messages:

```python
complete_chain_with_history = (
    prompt
    | RunnableLambda(lambda v: {"input": v.messages})
    | chain_with_history
)
```

Maybe there are better (native) solutions?
-
If I have this code:

```python
from langchain import hub

def format_docs(docs):
    ...

from langchain_core.runnables import RunnableParallel, RunnablePassthrough

rag_chain_from_docs = (
    ...
)

rag_chain_with_source = RunnableParallel(
    ...
)

response = rag_chain_with_source.invoke("What is LSTM?")
```

how do I convert this kind of RunnableParallel chain to include a history component? I got this reference from the LangChain docs:

```python
def get_session_history(session_id: str) -> BaseChatMessageHistory:
    ...

with_message_history = RunnableWithMessageHistory(
    ...
)

with_message_history.invoke(
    ...
)
```

How can we integrate RunnableWithMessageHistory with RunnableParallel?
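A hedged sketch of one way to do this, assuming `retriever`, `llm`, `format_docs`, and `get_session_history` exist as in the examples above, and that `prompt` contains a `MessagesPlaceholder` named `history`. The key changes are giving the chain dict input so `RunnableWithMessageHistory` can find the question by key, and telling it which output key holds the answer:

```python
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.runnables.history import RunnableWithMessageHistory

rag_chain_from_docs = (
    RunnablePassthrough.assign(context=lambda x: format_docs(x["context"]))
    | prompt
    | llm
    | StrOutputParser()
)

# dict input instead of a bare string, so the question is addressable by key
rag_chain_with_source = RunnableParallel(
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
        "history": itemgetter("history"),
    }
).assign(answer=rag_chain_from_docs)

chain_with_history = RunnableWithMessageHistory(
    rag_chain_with_source,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
    output_messages_key="answer",  # record the "answer" field of the dict output
)

response = chain_with_history.invoke(
    {"question": "What is LSTM?"},
    config={"configurable": {"session_id": "demo"}},
)
```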
-
@eyurtsev, thanks for the amazing code. I want to add create_history_aware_retriever and create_stuff_documents_chain to your code. How would I rewrite it?
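A hedged sketch based on the standard docs pattern for these two helpers: `retriever` is assumed (e.g. a vectorstore retriever), the prompt texts are illustrative, and `get_session_history` is assumed to be a session-keyed history factory like the ones earlier in the thread.

```python
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

# rewrites the user's question into a standalone query using the chat history
contextualize_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Given the chat history and the latest question, "
                   "formulate a standalone question."),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_prompt
)

# stuffs the retrieved documents into {context} and answers the question
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Answer using this context:\n\n{context}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",  # create_retrieval_chain returns {"answer": ...}
)
```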
-
I found a similar discussion that might help you: How to input multiple inputs in RunnableWithMessageHistory of LangChain?. It provides an example of how to correctly handle multiple inputs with `RunnableWithMessageHistory`. Here's how you can modify your code to include the history correctly (see the sketch below).
This setup ensures that the history injected by `RunnableWithMessageHistory` actually reaches your prompt alongside the retrieved context. For more detailed information, you can refer to the LangChain documentation on message history.
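A minimal sketch, written in terms of the variables from the original question; the `itemgetter("question") | compression_retriever` piece is an assumption, so the retriever receives just the question string rather than the whole input dict:

```python
from operator import itemgetter

testChain = (
    {
        # pass only the question string to the retriever
        "context": itemgetter("question") | compression_retriever,
        "question": itemgetter("question"),
        # pull the messages RunnableWithMessageHistory injects under "history"
        "history": itemgetter("history"),
    }
    | prompt
    | streaming_llm
    | StrOutputParser()
)

qa2 = RunnableWithMessageHistory(
    testChain,
    lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),
    input_messages_key="question",
    history_messages_key="history",
)
```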