Lack of Guidance on Optimizing/Finetuning ReAct Agent with Few-shot Examples #703

Closed
DanielProkhorov opened this issue Mar 23, 2024 · 2 comments

@DanielProkhorov

The current ReAct documentation lacks clear instructions on optimizing or finetuning a ReAct agent using few-shot examples. Neither the main ReAct documentation (ReAct Docs) nor the examples documentation (Examples Docs) provides sufficient guidance on this. For a ReAct agent to learn effectively from few-shot examples, each example should contain the complete ReAct cycle (Question, Action, Action Input, Observation).
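To make the shape of such an example concrete, here is a minimal sketch in plain Python of a few-shot example that carries the whole cycle. The `Step` and `Trajectory` classes and the `render` function are hypothetical names invented for illustration and are not DSPy types; the `Thought` field is an assumption borrowed from the usual ReAct prompt layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    """One iteration of the ReAct loop (hypothetical structure, not a DSPy type)."""
    thought: str
    action: str
    action_input: str
    observation: str


@dataclass
class Trajectory:
    """A question, the intermediate steps, and the final answer."""
    question: str
    steps: List[Step] = field(default_factory=list)
    answer: str = ""


def render(traj: Trajectory) -> str:
    """Format a trajectory as plain text that could be placed in a prompt."""
    lines = [f"Question: {traj.question}"]
    for i, s in enumerate(traj.steps, start=1):
        lines.append(f"Thought {i}: {s.thought}")
        lines.append(f"Action {i}: {s.action}")
        lines.append(f"Action Input {i}: {s.action_input}")
        lines.append(f"Observation {i}: {s.observation}")
    lines.append(f"Answer: {traj.answer}")
    return "\n".join(lines)


# Placeholder content only; a real example would hold an actual tool call.
example = Trajectory(
    question="This is a question?",
    steps=[Step("I should look this up.", "Search", "a query", "a retrieved passage")],
    answer="This is an answer.",
)
print(render(example))
```

Compared with a bare question/answer pair, the rendered text shows the model which tool was called, with what input, and what came back.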

The provided example in the documentation, such as:

qa_pair = dspy.Example(question="This is a question?", answer="This is an answer.")

does not demonstrate the correct way to optimize or finetune a ReAct agent with few-shot examples.

Could someone please provide a clear example demonstrating the correct approach to optimizing or finetuning a ReAct agent, particularly with few-shot examples? This would greatly benefit users seeking to leverage ReAct effectively.

@okhat
Collaborator

okhat commented Mar 23, 2024

Agents have not been the priority. But they're no different to other programs:

import dspy

# Define some models.
gpt3 = dspy.OpenAI('gpt-3.5-turbo-0125', max_tokens=1000)
colbert = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.configure(lm=gpt3, rm=colbert)

# Declare the agent.
agent = dspy.ReAct("question -> answer", tools=[dspy.Retrieve(k=1)])

# Try it in zero-shot mode.
agent(question="what is 1+1?")

# See what happened in the final N prompts.
gpt3.inspect_history(n=1)

# Get some data to optimize.
from dspy.datasets import HotPotQA

dataset = HotPotQA(train_seed=1, train_size=200, eval_seed=2023, dev_size=500, test_size=0)
trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

# Let's optimize
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

tp = BootstrapFewShotWithRandomSearch(metric=dspy.evaluate.answer_exact_match, max_bootstrapped_demos=2, max_labeled_demos=0, num_candidate_programs=5, num_threads=8)
compiled_agent = tp.compile(agent, trainset=trainset[:50], valset=trainset[50:150])

# Now you can use the compiled_agent
compiled_agent(question="how many storeys are in the castle that David Gregory inherited?")

Hope this helps.

@DanielProkhorov
Author

Thanks for the quick response @okhat!

Perhaps I should first explain in more detail the ReAct agent that I intend to optimize.

The objective of this agent is to navigate within a mobile phone app (or any screen in general). As such, the agent integrates the following functionalities (tools):

  • GetScreenDescription: This utilizes another computer vision (CV) model capable of providing textual descriptions of user interface elements along with their bounding boxes.

  • PerformAction: Essentially, this function executes X,Y clicks using uiautomator, taking a natural language utterance related to screen information as an input parameter.
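As a rough illustration of the two tools above, the sketch below models them as plain Python functions over a fake screen. Everything here is a stand-in: `UIElement`, the string matching in `perform_action`, and the `TOOLS` registry are assumptions made for the example, and neither the real CV model nor uiautomator is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class UIElement:
    """A UI element as the CV model might report it (hypothetical shape)."""
    description: str
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def get_screen_description(screen: List[UIElement]) -> str:
    """Stand-in for the CV model: one line per element with its bounding box."""
    return "\n".join(f"{e.description} @ {e.bbox}" for e in screen)


def perform_action(screen: List[UIElement], utterance: str) -> Tuple[int, int]:
    """Stand-in for the uiautomator click.

    Picks the first element whose description occurs in the utterance and
    returns the center of its bounding box as the (x, y) click target.
    """
    for e in screen:
        if e.description.lower() in utterance.lower():
            left, top, right, bottom = e.bbox
            return ((left + right) // 2, (top + bottom) // 2)
    raise ValueError(f"no element matches: {utterance!r}")


# Name-to-function registry, so an agent loop could dispatch on the action name.
TOOLS: Dict[str, Callable] = {
    "GetScreenDescription": get_screen_description,
    "PerformAction": perform_action,
}

screen = [
    UIElement("Settings button", (0, 0, 100, 40)),
    UIElement("Search field", (0, 50, 300, 90)),
]
print(TOOLS["GetScreenDescription"](screen))
print(TOOLS["PerformAction"](screen, "Tap the search field"))
```

The point of the registry is only that each tool has a name the agent can emit as an Action and an input it can emit as an Action Input.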

How would the DSPy framework optimize for this specific task? The screen description and the proposed action are constructed dynamically. From my understanding, the LLM should see the whole ReAct cycle as few-shot examples rather than just a question and its answer (as in your HotPotQA example), so question-answer pairs alone won't be sufficient.
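For context, the general idea behind bootstrapping few-shot demonstrations is to run the program on training inputs, score each run with a metric, and keep the full traces of the runs that pass. The sketch below is a simplified, plain-Python illustration of that idea under stated assumptions; it is not DSPy's implementation, and `toy_agent`, `bootstrap`, and the task strings are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# One trace entry: (action, action_input, observation).
Trace = List[Tuple[str, str, str]]


def toy_agent(task: str) -> Tuple[Trace, str]:
    """Hypothetical agent that returns its full trace and a final result."""
    trace = [
        ("GetScreenDescription", "", f"screen for {task}"),
        ("PerformAction", f"tap {task}", "ok"),
    ]
    # Pretend the agent fails whenever the task mentions something hidden.
    result = "failed" if "hidden" in task else "done"
    return trace, result


def bootstrap(
    agent: Callable[[str], Tuple[Trace, str]],
    tasks: List[Tuple[str, str]],
    metric: Callable[[str, str], bool],
    max_demos: int,
) -> List[Dict]:
    """Keep the traces of runs that pass the metric, up to max_demos."""
    demos: List[Dict] = []
    for task, expected in tasks:
        trace, result = agent(task)
        if metric(expected, result):
            demos.append({"task": task, "trace": trace, "result": result})
        if len(demos) >= max_demos:
            break
    return demos


tasks = [
    ("open settings", "done"),
    ("open hidden menu", "done"),
    ("open wifi", "done"),
]
demos = bootstrap(toy_agent, tasks, lambda expected, got: expected == got, max_demos=2)
for d in demos:
    print(d["task"], len(d["trace"]))
```

In this sketch only the task and the expected final result are labeled by hand; the intermediate actions and observations in each kept demonstration come from the agent's own successful runs.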

Currently, I use LangChain and Mixtral8x7b for this purpose, with a customized ReAct prompt and a few hand-written trajectories. I therefore wonder whether I can switch to the DSPy framework for exactly the reasons you mention in the FAQ section (https://dspy-docs.vercel.app/docs/faqs).
