langgraph/tutorials/sql-agent/ #879
Replies: 22 comments 51 replies
-
Hello, thanks for this amazing explanation. I'm trying to convert this SQL agent to a Gemini LLM and BigQuery, but in the following step I'm receiving an error:
query_check_system = """You are a SQL expert with a strong attention to detail.
Double check the BigQuery query for common mistakes, including:
- Using NOT IN with NULL values
- Using UNION when UNION ALL should have been used
- Using BETWEEN for exclusive ranges
- Data type mismatch in predicates
- Properly quoting identifiers
- Using the correct number of arguments for functions
- Casting to the correct data type
- Using the proper columns for joins
If there are any of the above mistakes, rewrite the query. If there are no mistakes, just reproduce the original query.
You will call the appropriate tool to execute the query after running this check."""
query_check_prompt = ChatPromptTemplate.from_messages(
[("system", query_check_system), ("placeholder", "{messages}")]
)
query_check = query_check_prompt | ChatGoogleGenerativeAI(temperature=0, model="gemini-pro").bind_tools(
[db_query_tool], tool_choice="required"
)
query_check.invoke({"messages": [("user", "SELECT COUNT(id_cliente) FROM my_project.my_dataset.cliente;")]}) Produces this error: Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/proto/message.py", line 729, in __init__
pb_value = marshal.to_proto(pb_type, value)
File "/usr/local/lib/python3.10/dist-packages/proto/marshal/marshal.py", line 235, in to_proto
pb_value = self.get_rule(proto_type=proto_type).to_proto(value)
File "/usr/local/lib/python3.10/dist-packages/proto/marshal/rules/message.py", line 36, in to_proto
return self._descriptor(**value)
ValueError: unknown enum label "any"
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/sqlchain/app_langgraph_base.py", line 112, in <module>
query_check.invoke({"messages": [("user", "SELECT COUNT(id_cliente) FROM my_project.my_dataset.cliente;")]})
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 2507, in invoke
input = step.invoke(input, config)
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/runnables/base.py", line 4588, in invoke
return self.bound.invoke(
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 248, in invoke
self.generate_prompt(
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 677, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 534, in generate
raise e
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 524, in generate
self._generate_with_cache(
File "/home/user/.local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 749, in _generate_with_cache
result = self._generate(
File "/home/user/.local/lib/python3.10/site-packages/langchain_google_genai/chat_models.py", line 758, in _generate
request = self._prepare_request(
File "/home/user/.local/lib/python3.10/site-packages/langchain_google_genai/chat_models.py", line 907, in _prepare_request
formatted_tool_config = ToolConfig(
File "/usr/local/lib/python3.10/dist-packages/proto/message.py", line 757, in __init__
pb_value = marshal.to_proto(pb_type, value)
File "/usr/local/lib/python3.10/dist-packages/proto/marshal/marshal.py", line 235, in to_proto
pb_value = self.get_rule(proto_type=proto_type).to_proto(value)
File "/usr/local/lib/python3.10/dist-packages/proto/marshal/rules/message.py", line 36, in to_proto
return self._descriptor(**value)
ValueError: unknown enum label "any" What am I missing? Thanks, |
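A likely cause (this is an assumption, not verified against the versions used above): the traceback points at the ToolConfig built from tool_choice, and older langchain-google-genai releases translate tool_choice="required" into a lowercase "any" mode string that the protobuf enum rejects. Upgrading langchain-google-genai may fix it; a minimal workaround sketch is to bind the tool without forcing a choice and let the system prompt drive the call:
from langchain_google_genai import ChatGoogleGenerativeAI
# Workaround sketch: omit tool_choice so the request never sets the problematic enum value.
query_check = query_check_prompt | ChatGoogleGenerativeAI(
    temperature=0, model="gemini-pro"
).bind_tools([db_query_tool])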
-
Can't we just directly import the tools from the toolkit?
-
I was experiencing a slight problem while using this code: the query_gen node returned results from get_schema_tool without ever creating a query. The solution was adding db_query_tool as a tool bound to query_gen! 🤖🤖
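For reference, that change would look roughly like this (a sketch; llm and SubmitFinalAnswer are the names used in the tutorial, your setup may differ):
# Bind the execution tool alongside SubmitFinalAnswer so query_gen can run the query itself.
query_gen = query_gen_prompt | llm.bind_tools([SubmitFinalAnswer, db_query_tool])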
-
I think the approach advised in the tutorial in the LangChain docs is more straightforward and exploits the agent/graph architecture more efficiently. Using create_react_agent makes it possible to handle more complex DB schemas and tool calls in an efficient way. It also relies on a framework described in the scientific literature, and it's easier to integrate into multi-agent applications. While I can see why someone would choose a more fixed implementation like this one for some use cases, I don't think it should be the go-to, and I don't understand why the tutorials for SQL agents change so often.
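For anyone comparing, the prebuilt-agent version is only a few lines. This is a sketch, not the official tutorial code: it assumes an existing SQLDatabase instance named db, and the keyword for the system prompt has changed across langgraph releases (older versions use state_modifier instead of prompt).
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

llm = ChatOpenAI(model="gpt-4o")
toolkit = SQLDatabaseToolkit(db=db, llm=llm)  # list-tables, schema, query-check and query tools
agent = create_react_agent(
    llm,
    toolkit.get_tools(),
    prompt="You are a careful SQL analyst. Inspect the schema before writing queries.",
)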
-
I NEVER get the right results. I have tried more than 30 times and optimized the prompts for all nodes. What can I do to fix it?
-
I am using the code from the https://langchain-ai.github.io/langgraph/tutorials/sql-agent/ tutorial as-is with my SQL database, but I am facing an error. In the edge definitions below, it is able to list all the tables and also picks the right table for my prompt. How do I fix this? Let me know if any other inputs or details are needed. Thank you in advance!
Below is the event(printed) when
-
BadRequestError: Error code: 400 - {'error': {'message': 'invalid role: tool, role must be one of [system, user, assistant]', 'type': 'api_error', 'param': None, 'code': None}}
Has anyone encountered this error and solved it? Thanks.
-
Some questions about the example: why do you need to return there? Is execute_query necessary? Maybe correct_query and execute_query could be combined. Please help me clarify my ideas :3
-
Guys, I'm having a problem: the agent creates the correct query (I confirmed it in DBeaver), but I keep receiving this error: content: "Error: Query failed. Please rewrite your query and try again.", and it starts looping again. Can somebody help?
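If this is the db_query_tool from the tutorial, the generic message hides the real database error, which makes the loop hard to debug. A sketch of a more informative variant (assuming the same db SQLDatabase object as in the tutorial):
from langchain_core.tools import tool

@tool
def db_query_tool(query: str) -> str:
    """Execute a SQL query against the database and return the result or the underlying error."""
    try:
        result = db.run(query)  # raises on SQL errors instead of swallowing them
    except Exception as exc:
        return f"Error: {exc}. Please rewrite your query and try again."
    return result if result else "Error: the query returned no rows."
With the real error text in the ToolMessage, the query_gen node has something concrete to correct instead of looping blindly.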
-
Has anybody tested it with 4o-mini? When I use it, the agent doesn't execute the query; it only answers with the query itself.
-
Hey, so I found the issue in this one and was able to correct it. In query_gen_node, when it tries to create a query, it notices that it doesn't have the schema for the other tables it needs in order to join and fetch the results. So I simply did this while binding the tools to the model:
llm_gen_tools = llm.bind_tools( )
query_gen = query_gen_prompt | llm_gen_tools
And in the system instruction I changed that part to "Call SubmitFinalAnswer to submit the final answer." It worked for me! Hope it helps you as well!
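The arguments of that bind_tools call didn't survive formatting; as an assumption based on the description above (the generator needing to fetch missing schemas itself), the binding was presumably something like:
# Assumption: give the query generator both the schema tool and the final-answer tool.
llm_gen_tools = llm.bind_tools([get_schema_tool, SubmitFinalAnswer])
query_gen = query_gen_prompt | llm_gen_tools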
-
UnprocessableEntityError: status_code: 422, body: data=None message='No valid tool call or response generated'
{'first_tool_call': {'messages': [AIMessage(content='', additional_kwargs={}, response_metadata={}, id='056cef8e-59c8-4b41-bf40-c8493d756c66', tool_calls=[{'name': 'sql_db_list_tables', 'args': {}, 'id': 'tool_abcd123', 'type': 'tool_call'}])]}}
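That 422 looks like the provider refusing the hand-written AIMessage with a synthetic tool call (id 'tool_abcd123') that the tutorial's first_tool_call node injects — an assumption, since the provider isn't named. One hedged workaround sketch is to run the list-tables tool directly instead of faking a tool call, and wire the edge straight to the schema-selection node:
from langchain_core.messages import AIMessage

def first_tool_call(state):
    # Run the tool ourselves instead of emitting a synthetic tool_call,
    # which some chat APIs reject when the history is replayed to them.
    tables = list_tables_tool.invoke("")
    return {"messages": [AIMessage(content=f"Available tables: {tables}")]}

# With this change the separate list_tables_tool node is no longer needed:
# workflow.add_edge("first_tool_call", "model_get_schema")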
-
Does this not work with the Groq API? I face this error:
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langgraph\pregel\__init__.py:1551, in Pregel.invoke(self, input, config, stream_mode, output_keys, interrupt_before, interrupt_after, debug, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langgraph\pregel\__init__.py:1290, in Pregel.stream(self, input, config, stream_mode, output_keys, interrupt_before, interrupt_after, debug, subgraphs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langgraph\pregel\runner.py:56, in PregelRunner.tick(self, tasks, reraise, timeout, retry_policy, get_waiter)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langgraph\pregel\retry.py:29, in run_with_retry(task, retry_policy)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langgraph\utils\runnable.py:385, in RunnableSeq.invoke(self, input, config, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langgraph\utils\runnable.py:167, in RunnableCallable.invoke(self, input, config, **kwargs)
Cell In[52], line 12, in (state)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_core\runnables\base.py:5354, in RunnableBindingBase.invoke(self, input, config, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_core\language_models\chat_models.py:286, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_core\language_models\chat_models.py:786, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_core\language_models\chat_models.py:643, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_core\language_models\chat_models.py:633, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_core\language_models\chat_models.py:851, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\langchain_groq\chat_models.py:473, in ChatGroq._generate(self, messages, stop, run_manager, **kwargs)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\groq\resources\chat\completions.py:287, in Completions.create(self, messages, model, frequency_penalty, function_call, functions, logit_bias, logprobs, max_tokens, n, parallel_tool_calls, presence_penalty, response_format, seed, stop, stream, temperature, tool_choice, tools, top_logprobs, top_p, user, extra_headers, extra_query, extra_body, timeout)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\groq\_base_client.py:1244, in SyncAPIClient.post(self, path, cast_to, body, options, files, stream, stream_cls)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\groq\_base_client.py:936, in SyncAPIClient.request(self, cast_to, options, remaining_retries, stream, stream_cls)
File c:\Users\VaishakBhuvanMR\Desktop\Comparechat\compare\Lib\site-packages\groq\_base_client.py:1039, in SyncAPIClient._request(self, cast_to, options, remaining_retries, stream, stream_cls)
BadRequestError: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': '{"tool_call":{"id":"pending","type":"function","function":{"name":"sql_db_schema"},"parameters":{"table_names":"Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track"}}}'}}
-
I was getting the same error about calling the wrong tool, so I rewrote the graph slightly to make it easier to debug. (The relevant code is shown as the last code block in this message.) While looking at the message generated by the query-generator chain that is bound with the SubmitFinalAnswer tool, I'm starting to wonder if there is a problem with the way LangChain handles tool bindings: the chat model is clearly calling a tool that is not bound to it. Btw, I'm using the Anthropic model Claude 3.5 Sonnet v2. I have my breakpoint set at the statement right after the line where the query-generating chain is invoked (marked with a breakpoint comment in the code below).
I can tell there was only one tool bound to the chat model in the chain stored in query_generator by printing the kwargs field on the second step in the chain, as follows:
print((query_generator.steps[1]).kwargs)
Output:
However, the message that was output from query_generator.invoke() contains a totally different tool call which is not mentioned anywhere in my code:
To make it easier to see, here is the tool_calls property of that AIMessage:
pprint.pprint(message.tool_calls)
Output:
How is it possible this tool is being called at all?
# Configure a chat model instance with the ability to call a tool that gets the schema of a table.
model_get_schema = ChatBedrock(model_id=awsutil.default_chat_model_id, client=awsutil.get_client('bedrock-runtime')).bind_tools([get_schema_tool])
# Configure a chat model instance with ability to call a tool that checks and runs a query.
chat_model_with_query_tool = ChatBedrock(model_id=awsutil.default_chat_model_id, client=awsutil.get_client('bedrock-runtime')).bind_tools([db_query_tool], tool_choice="db_query_tool")
# Assemble query-checking chain to be used as a Tool
query_checker = p.query_check_prompt | chat_model_with_query_tool
#test_query_checker(query_checker)
def model_check_query(state: ChatQueryState) -> dict[str, list[AIMessage]]:
"""
Use this tool to double-check the correctness of your query and execute it when deemed correct.
"""
return {
"messages": [
query_checker.invoke(
{
"messages": [state["messages"][-1]],
"database_service": "AWS Athena",
"dialect": "Trino"
}
)
]
}
# Assemble query-generating chain.
chat_model_with_end_tool = ChatBedrock(model_id=awsutil.default_chat_model_id, client=awsutil.get_client('bedrock-runtime')).bind_tools(
[tool.SubmitFinalAnswer]
)
query_generator = p.query_gen_prompt | chat_model_with_end_tool
def query_gen_node(state: ChatQueryState):
message = query_generator.invoke(
{
'messages': state_messages_excluding_type(state, SystemMessage)['messages'],
'database_service': "AWS Athena",
'dialect': "Trino"
}
)
# Sometimes the LLM will hallucinate and call the wrong tool. We need to catch this and return an error message.
tool_messages = [] # <<<BREAKPOINT SET HERE>>>
if message.tool_calls:
for tc in message.tool_calls:
if tc["name"] != "SubmitFinalAnswer":
tool_messages.append(
ToolMessage(
content=f"Error: The wrong tool was called: {tc['name']}. Please fix your mistakes. Remember to only call SubmitFinalAnswer to submit the final answer. Generated queries should be outputted WITHOUT a tool call.",
tool_call_id=tc["id"],
)
)
else:
tool_messages = []
return {"messages": [message] + tool_messages}
workflow.add_node("first_tool_call", first_tool_call)
# Add nodes for the first two tools
workflow.add_node("list_tables_tool", tool_util.create_tool_node_with_fallback([list_tables_tool]))
# Add a node for a model to choose the relevant tables based on the question and available tables
workflow.add_node(
"model_get_schema",
lambda state: {
"messages": [model_get_schema.invoke(state["messages"])]
},
)
# A node to run the get_schema_tool if requested.
workflow.add_node("get_schema_tool", tool_util.create_tool_node_with_fallback([get_schema_tool]))
workflow.add_node("query_gen", query_gen_node)
# # Add a node for the model to check the query before executing it
# workflow.add_node("correct_query", model_check_query)
# # Add node for executing the query
# workflow.add_node("execute_query", tool_util.create_tool_node_with_fallback([db_query_tool]))
# Specify the edges between the nodes
workflow.add_edge(START, "first_tool_call")
workflow.add_edge("first_tool_call", "list_tables_tool")
workflow.add_edge("list_tables_tool", "model_get_schema")
workflow.add_edge("model_get_schema", "get_schema_tool")
workflow.add_edge("get_schema_tool", "query_gen")
# workflow.add_conditional_edges(
# "query_gen",
# should_continue,
# )
# workflow.add_edge("correct_query", "execute_query")
# workflow.add_edge("execute_query", "query_gen")
# Compile the workflow into a runnable
compiled_graph: "CompiledStateGraph" = workflow.compile() |
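One thing that may help here (an assumption, not something confirmed in this thread): because the state already contains sql_db_list_tables and get_schema_tool calls from earlier nodes, the model can imitate those names even though only SubmitFinalAnswer is bound. Passing the generator a trimmed history, e.g. the original question plus the latest schema output, reduces that tendency. A minimal sketch:
from langchain_core.messages import HumanMessage, ToolMessage

def trimmed_history(state: ChatQueryState) -> list:
    # Keep only the user's question and the most recent tool output (the schema),
    # so earlier tool-call examples don't tempt the model into repeating them.
    msgs = state["messages"]
    humans = [m for m in msgs if isinstance(m, HumanMessage)]
    tools = [m for m in msgs if isinstance(m, ToolMessage)]
    return humans[:1] + tools[-1:]
query_gen_node would then pass trimmed_history(state) as the 'messages' value instead of the full state history.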
-
I ran into an issue with the proposed approach, so here is my implementation using a simple ReAct agent design to call the tools. This could be a good start for simpler use cases.
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from dotenv import load_dotenv
from langchain_community.utilities import SQLDatabase
from sqlalchemy import create_engine
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import ToolNode, tools_condition
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from typing_extensions import TypedDict
from typing import Annotated
from langgraph.graph import END, StateGraph, START
from langgraph.graph.message import add_messages
# Load environment variables
load_dotenv()
db_file = "chinook.db"
engine = create_engine(f"sqlite:///{db_file}")
db = SQLDatabase(engine=engine)
# SQL Manipulation Tools
toolkit = SQLDatabaseToolkit(db=db, llm=ChatOpenAI(model="gpt-4o"))
sql_db_toolkit_tools = toolkit.get_tools()
query_gen_system = """
You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct SQLite query to run, then look at the results of the query and return the answer.
<instructions>
1. Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 10 results.
2. You can order the results by a relevant column to return the most interesting examples in the database. Never query for all the columns from a specific table, only ask for the relevant columns given the question.
3. You have access to tools for interacting with the database. Use your tools to fetch the database schema, so that you can generate the query based on the schema. Only use the information returned by the tools to construct your final answer.
4. You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.
5. Once you are able to provide an answer from the data fetched from the database, don't call any tools again.
</instructions>
<Restrictions>
1. DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.
2. DO NOT MAKE UP ANSWER.
</Restrictions>
"""
query_gen_prompt = ChatPromptTemplate.from_messages(
[
("system",
query_gen_system
),
MessagesPlaceholder(
variable_name="messages"),
]
)
query_gen_model = query_gen_prompt | ChatOpenAI(
model="gpt-4o", temperature=0).bind_tools(tools=sql_db_toolkit_tools)
class State(TypedDict):
messages: Annotated[list, add_messages]
graph_builder = StateGraph(State)
def query_gen_node(state: State):
return {"messages": [query_gen_model.invoke(state["messages"])]}
graph_builder.add_node("query_gen", query_gen_node)
query_gen_tools_node = ToolNode(tools=sql_db_toolkit_tools)
graph_builder.add_node("query_gen_tools", query_gen_tools_node)
graph_builder.add_conditional_edges(
"query_gen",
tools_condition,
{"tools": "query_gen_tools", END: END},
)
graph_builder.add_edge("query_gen_tools", "query_gen")
graph_builder.set_entry_point("query_gen")
graph = graph_builder.compile()
# print(graph.get_graph().draw_mermaid())
initial_query = str("Which sales agent made the most in sales in 2009?")
for event in graph.stream(
{"messages": [("user", initial_query)]}
):
print(event)
Results:
-
If a database has 100 tables, then this approach isn't the right one: you don't want to pass the details of 100 tables in the prompt. We are also relying purely on table names or column names, which isn't always enough in real scenarios, since names can vary. So instead of using these pre-built tools, use an agent with RAG to fetch the relevant schema pieces.
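A sketch of what that can look like (names such as schema_docs and relevant_schemas are illustrative, and it assumes a SQLDatabase object db plus OpenAI embeddings): embed one document per table and retrieve only the relevant ones before query generation.
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

# One document per table: name plus its DDL / sample rows.
schema_docs = [f"{t}:\n{db.get_table_info([t])}" for t in db.get_usable_table_names()]
store = InMemoryVectorStore.from_texts(schema_docs, OpenAIEmbeddings())

def relevant_schemas(question: str, k: int = 5) -> str:
    # Retrieve only the k most relevant table schemas to keep the prompt small.
    docs = store.similarity_search(question, k=k)
    return "\n\n".join(d.page_content for d in docs)
The retrieved text can then be injected into the query-generation prompt instead of the full schema dump.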
-
@fnusatvik07 What do you suggest then?
-
Great example, thanks. It worked fine with OpenAI, but when I switched to Ollama I couldn't get it working. For some reason it doesn't seem to make the schema call at all. I'm not sure of the best way to debug this; do I need different prompts for Ollama (llama3.2)? I only have one table, promo_prices, which contains a column store. Here's what LangSmith shows me at that point: AI {} TOOL Rendered Output
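Small local models often skip optional tool calls entirely. One hedged workaround (a sketch, mirroring how the tutorial forces the list-tables call, and assuming the tutorial's get_schema_tool) is to fetch the schema deterministically in a node instead of relying on llama3.2 to request it:
from langchain_core.messages import AIMessage

def get_schema_node(state):
    # There is only one table here, so fetch its schema directly rather than via a model tool call.
    schema = get_schema_tool.invoke("promo_prices")
    return {"messages": [AIMessage(content=f"Schema for promo_prices:\n{schema}")]}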
-
Guys, I want to add memory to this agent, but I am not able to. Can you please suggest a way to move forward? I am using an OpenAI API key for my model.
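In LangGraph, conversational memory is usually added with a checkpointer plus a thread_id rather than anything model-specific. A minimal sketch for this graph (workflow is the StateGraph built in the tutorial):
from langgraph.checkpoint.memory import MemorySaver

app = workflow.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "user-1"}}

app.invoke({"messages": [("user", "Which sales agent made the most in sales in 2009?")]}, config)
app.invoke({"messages": [("user", "And in 2010?")]}, config)  # same thread, so prior messages are remembered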
-
Can someone please explain how the SubmitFinalAnswer tool works? I don't understand how it can be a tool when it's not a function that returns anything.
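In the tutorial, SubmitFinalAnswer isn't a function at all: it's a Pydantic model bound as a tool, so "calling" it just means the LLM emits structured arguments that the graph reads back as the final answer. Roughly (paraphrasing the tutorial's definition):
from pydantic import BaseModel, Field

class SubmitFinalAnswer(BaseModel):
    """Submit the final answer to the user based on the query results."""
    final_answer: str = Field(..., description="The final answer to the user")
The graph never executes it; the query_gen node just inspects message.tool_calls[0]["args"]["final_answer"] to know the run is finished.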
-
Thanks for the example; I am working with something similar:
-
I am getting this error:
BadRequestError Traceback (most recent call last)
17 frames
BadRequestError: Error code: 400 - {'error': {'message': "Failed to call a function. Please adjust your prompt. See 'failed_generation' for more details.", 'type': 'invalid_request_error', 'code': 'tool_use_failed', 'failed_generation': 'db_query_tool{"query": "SELECT SUM(amount) FROM transactions"}'}}
I am using Groq llama-3.3-70b-versatile (which supports tool use): https://console.groq.com/docs/tool-use. Please help.
-
https://langchain-ai.github.io/langgraph/tutorials/sql-agent/