Patch for using LCEL to stream from LLM #5873

Closed
5 tasks done
welljsjs opened this issue Jun 22, 2024 · 4 comments · Fixed by #5874
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@welljsjs
Contributor

welljsjs commented Jun 22, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

// Imports (module paths assume LangChain.js 0.2.x); `config` below is application-specific.
import { TextLoader } from "langchain/document_loaders/fs/text";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { HuggingFaceTransformersEmbeddings } from "@langchain/community/embeddings/hf_transformers";
import { Chroma } from "@langchain/community/vectorstores/chroma";
import { HuggingFaceInference } from "@langchain/community/llms/hf";
import { HumanMessagePromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { RunnableLambda, RunnableMap, RunnablePassthrough } from "@langchain/core/runnables";
import type { Document } from "@langchain/core/documents";

  const loader = new TextLoader("data/example2.txt");
  const docs = await loader.load();
  const textSplitter = new RecursiveCharacterTextSplitter({
    chunkSize: 400,
    chunkOverlap: 200,
    separators: ["\n\n", "\n", "."],
    keepSeparator: false,
  });

  const simpleHash = (str: string) => {
    let hash = 0;
    for (let i = 0; i < str.length; i++) {
      const char = str.charCodeAt(i);
      hash = (hash << 5) - hash + char;
    }
    // Convert to 32bit unsigned integer in base 36 and pad with "0" to ensure length is 7.
    return (hash >>> 0).toString(36).padStart(7, "0");
  };

  const splits = await textSplitter.splitDocuments(docs);

  const embeddings = new HuggingFaceTransformersEmbeddings({
    model: "Xenova/all-MiniLM-L6-v2", 
  });

  const vectorStore = new Chroma(embeddings, {
    collectionName: config.chroma.collectionName,
    url: `http://${config.chroma.host}:${config.chroma.port}`,
    collectionMetadata: {
      "hnsw:space": "cosine",
    }, 
  });
  await vectorStore.ensureCollection();

  // Use a hash of the chunk content as a stable id for each split.
  splits.forEach(
    (split) => (split.metadata.id = simpleHash(split.pageContent))
  );
  await vectorStore.addDocuments(splits, {
    ids: splits.map((split) => split.metadata.id),
  });

  // Retrieve and generate using the relevant snippets of the loaded document.
  const retriever = vectorStore.asRetriever({ k: 5, searchType: "similarity" });
  const prompt = HumanMessagePromptTemplate.fromTemplate(
    `You're a virtual assistant to help with technical queries. Answer the following question referring to the given context below.


    Context: {context}
    

    Question: {question}`
  );

  const model = new HuggingFaceInference({
    model: "mistralai/Mistral-7B-Instruct-v0.3",
    apiKey: config.huggingface.apiToken,
    maxRetries: 1,
    maxTokens: config.huggingface.llmMaxTokens,
    verbose: true,
  });

  const outputParser = new StringOutputParser();

  const mergeDocsToString = (docs: Document<Record<string, any>>[]) =>
    docs.map((doc) => doc.pageContent).join("\n\n");

  const setupAndRetrieval = RunnableMap.from({
    context: new RunnableLambda({
      func: (input: string) => retriever.invoke(input).then(mergeDocsToString),
    }).withConfig({ runName: "contextRetriever" }),
    question: new RunnablePassthrough(),
  });

  // Construct our RAG chain.
  const chain = setupAndRetrieval.pipe(prompt).pipe(model).pipe(outputParser);
  const stream = await chain.stream("Hello, why is too much sugar bad for the human body?");
  // Print each streamed chunk as it arrives.
  for await (const chunk of stream) {
    process.stdout.write(chunk);
  }
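
For reference, here is a minimal sketch of the comparison I'd expect to pass (it reuses the chain built above; the question string is just an example):

  // Sketch: compare the final output of invoke with the accumulated stream output.
  const question = "Hello, why is too much sugar bad for the human body?";

  const invoked = await chain.invoke(question);

  let streamed = "";
  for await (const chunk of await chain.stream(question)) {
    streamed += chunk;
  }

  // With the bug, `streamed` is unrelated to `invoked`; with the fix applied,
  // both should contain essentially the same completion (modulo model nondeterminism).
  console.log("invoke:", invoked);
  console.log("stream:", streamed);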

Error Message and Stack Trace (if applicable)

No response

Description

I'm using LangChain.js to set up a simple RAG pipeline with an LLM, and I'm calling "stream" on the chain rather than "invoke" to stream the output of the LLM.

Expected behaviour is that the final outputs of calling invoke and stream on the chain are the same, given that the input is the same.

Instead, the outputs differ: the output produced when calling stream on the chain makes no sense at all and is unrelated to the input. This is caused by a bug in the llms.js file. Within the implementation of the "async *_streamIterator" method of the BaseLLM class (line 65 of the generated JS), the prompt value should be passed as the first argument to "this._streamResponseChunks" as a string, i.e. "prompt.toString()". Instead, "input.toString()" is passed, which always evaluates to the string "[object Object]" rather than the content of the prompt. I've managed to fix the problem; see the patch below.
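
To illustrate what goes wrong, here is a minimal standalone sketch (FakePromptValue is a made-up stand-in for the prompt value object the prompt template produces):

// Hypothetical stand-in for the prompt value object produced by the prompt template.
class FakePromptValue {
  constructor(private readonly value: string) {}
  toString(): string {
    return this.value;
  }
}

const input = { question: "Why is too much sugar bad?" }; // raw chain input object
const prompt = new FakePromptValue("Question: Why is too much sugar bad?");

console.log(input.toString());  // "[object Object]" -- what the model currently receives
console.log(prompt.toString()); // "Question: Why is too much sugar bad?" -- what it should receive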

Note that my patch applies to the generated JS, not the TS source. For TS, this line: https://github.com/langchain-ai/langchainjs/blob/b311ec5c19cd4ab7aad116e81fb1ea33c5d71a8d/langchain-core/src/language_models/llms.ts#L159C11-L159C25 should be changed to use prompt.toString() instead of input.toString().

diff --git a/node_modules/@langchain/core/dist/language_models/llms.js b/node_modules/@langchain/core/dist/language_models/llms.js
index 70466ae..06e0349 100644
--- a/node_modules/@langchain/core/dist/language_models/llms.js
+++ b/node_modules/@langchain/core/dist/language_models/llms.js
@@ -62,7 +62,7 @@ export class BaseLLM extends BaseLanguageModel {
                 text: "",
             });
             try {
-                for await (const chunk of this._streamResponseChunks(input.toString(), callOptions, runManagers?.[0])) {
+                for await (const chunk of this._streamResponseChunks(prompt.toString(), callOptions, runManagers?.[0])) {
                     if (!generation) {
                         generation = chunk;
                     }

System Info

ProductName: macOS
ProductVersion: 12.7.5
BuildVersion: 21H1222
NodeVersion: v22.3.0
NPMVersion: 10.8.1
LangChainVersion: 0.2.6

@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Jun 22, 2024
welljsjs added a commit to welljsjs/langchainjs that referenced this issue Jun 22, 2024

dosubot bot commented Jun 22, 2024

Hey there, @welljsjs! 👋 I'm here to help you with any bugs, questions, or contributions you may have. Let's squash those bugs together!

To address the issue with the stream method in the llms.js file, you need to ensure that the correct prompt value is passed to the _streamResponseChunks method. Your patch correctly identifies the problem and provides a solution. Here is the relevant part of the patch:

diff --git a/node_modules/@langchain/core/dist/language_models/llms.js b/node_modules/@langchain/core/dist/language_models/llms.js
index 70466ae..06e0349 100644
--- a/node_modules/@langchain/core/dist/language_models/llms.js
+++ b/node_modules/@langchain/core/dist/language_models/llms.js
@@ -62,7 +62,7 @@ export class BaseLLM extends BaseLanguageModel {
                 text: "",
             });
             try {
-                for await (const chunk of this._streamResponseChunks(input.toString(), callOptions, runManagers?.[0])) {
+                for await (const chunk of this._streamResponseChunks(prompt.toString(), callOptions, runManagers?.[0])) {
                     if (!generation) {
                         generation = chunk;
                     }

This change ensures that the prompt value is correctly passed to the _streamResponseChunks method, which should resolve the issue with the stream method not producing the correct output [1].

To continue talking to Dosu, mention @dosu.

@welljsjs
Contributor Author

@dosu I've created a PR #5874 which fixes this issue. Can you merge it ASAP?


dosubot bot commented Jun 22, 2024

@jacoblee93, this one's a bit out of my wheelhouse, could you step in and provide your guidance?

To continue talking to Dosu, mention @dosu.

@jacoblee93
Collaborator

Thanks for reporting and for the fix! It seems specific to HumanMessagePromptTemplate and the other message prompt templates, but this should still definitely work.
