InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback
This repository implements InferAct, a preemptive evaluation approach for LLM agents, as described in InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback.
Abstract: A crucial requirement for deploying LLM-based agents in real-life applications is robustness against risky or even irreversible mistakes. However, existing research lacks a focus on the preemptive evaluation of reasoning trajectories performed by LLM agents, leaving a gap in ensuring safe and reliable operation. To explore better solutions, this paper introduces InferAct, a novel approach that leverages the Theory-of-Mind capability of LLMs to proactively detect potential errors before critical actions are executed (e.g., buy-now in automated online trading or web shopping). InferAct can also integrate human feedback to prevent irreversible risks and to enhance the actor agent's decision-making process. Experiments on three widely used tasks demonstrate the effectiveness of InferAct. The proposed solution presents a novel approach and concrete contributions towards developing LLM agents that can be safely deployed in different environments involving critical decision-making.
Contact person: Haishuo Fang
Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.
> python -m venv .inferact
> source ./.inferact/bin/activate
> pip install -r requirements.txt
- Install OpenJDK in the virtual environment.
import jdk
from jdk.enums import OperatingSystem, Architecture

# Download and install JDK 11 via the install-jdk package
jdk.install('11', operating_system=OperatingSystem.LINUX)

import os
jdk_version = 'jdk-11.0.19+7'  # change to your installed version
os.environ['JAVA_HOME'] = 'path/to/jdk'  # point this at the installed JDK directory
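As a quick sanity check after the step above, you can verify that JAVA_HOME points at a usable JDK. This helper is purely illustrative (it is not part of the InferAct codebase):

```python
import os

def check_java_home(env=None):
    """Return True if JAVA_HOME points at a directory containing bin/java.
    Hypothetical sanity-check helper, not part of this repository."""
    env = os.environ if env is None else env
    java_home = env.get('JAVA_HOME', '')
    return bool(java_home) and os.path.isfile(os.path.join(java_home, 'bin', 'java'))
```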
- Configure the environment
> cd ./actor/webshop
> ./setup.sh -d all
- Download env data: please refer to ALFWorld for the data download, then set
> export ALFWORLD_DATA="path/to/data"
We adapt code for ALFWorld and HotPotQA from the Reflexion repository.
The Actor agent is responsible for performing tasks in environments. --run_agents controls whether to run the Actor, and --task specifies the environment, e.g. --task webshop.
python main.py \
    --run_agents \
    --task webshop \
    --trial_num 0 \
    --feedback_type nl \
    --num_envs 300
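Conceptually, the actor run iterates over environments and records the reasoning trajectory for each one. The sketch below is hypothetical; the function and method names (`run_actor`, `env.step`, `agent.act`) are illustrative, not the repository's actual API:

```python
def run_actor(envs, agent):
    """Illustrative actor loop: collect the action sequence per environment
    until the episode ends. Names are assumptions, not the repo's API."""
    trajectories = []
    for env in envs:
        obs, steps, done = env.reset(), [], False
        while not done:
            action = agent.act(obs)       # the LLM actor proposes an action
            obs, done = env.step(action)  # environment transition
            steps.append(action)
        trajectories.append(steps)
    return trajectories
```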
The evaluator evaluates the Actor's trajectory before critical actions.
python main.py \
    --do_eval \
    --task webshop \
    --eval_method inferact \
    --trial_num 0 \
    --model_name gpt4-turbo \
    --feedback_type nl \
    --threshold 0.9
--eval_method specifies the evaluation method. --threshold specifies the F1-score threshold for the multi-step evaluation and inferact methods. --do_eval controls whether to evaluate the Actor's trajectory.
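The role of --threshold can be sketched as a simple gate. This is an assumption-laden illustration (the function name and the score direction are hypothetical; the actual scoring logic lives in the repository's evaluator code):

```python
def should_block(off_track_score, threshold=0.9):
    """Hypothetical gate illustrating --threshold: hold the critical action
    for review when the evaluator's off-track score reaches the threshold."""
    return off_track_score >= threshold
```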
After an off-track trajectory is detected by the Evaluator, binary or natural-language (NL) feedback is generated to prevent the critical action from being executed.
python main.py \
    --do_feedback_gen \
    --task webshop \
    --eval_method inferact \
    --trial_num 0 \
    --model_name gpt4-turbo \
    --threshold 0.9 \
    --feedback_type nl
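The distinction behind --feedback_type can be sketched as follows. This is illustrative only (function name and return strings are assumptions): binary feedback is a bare correct/incorrect signal, while NL feedback relays the evaluator's reasoning back to the Actor.

```python
def make_feedback(off_track, feedback_type='nl', reason=''):
    """Hypothetical sketch of feedback generation: 'binary' yields a
    correct/incorrect label; 'nl' passes the evaluator's reasoning through."""
    if feedback_type == 'binary':
        return 'incorrect' if off_track else 'correct'
    return reason if off_track else 'The trajectory looks correct.'
```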
To run different components in a pipeline, you can use
python main.py \
    --run_agents \
    --do_eval \
    --do_feedback_gen \
    --task webshop \
    --model_name gpt35-turbo \
    --num_envs 300 \
    --eval_method standard \
    --trial_num 0 \
    --threshold 0.0 \
    --feedback_type nl
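Conceptually, the combined run chains the three stages per trial: act, evaluate before the critical action, and generate feedback when the trajectory is judged off-track. The sketch below is hypothetical (the real control flow lives in main.py, and all names here are illustrative):

```python
def run_pipeline(actor, evaluator, feedback_gen, env):
    """Illustrative pipeline wiring the three flags together."""
    trajectory = actor(env)                    # --run_agents
    off_track, reason = evaluator(trajectory)  # --do_eval
    if off_track:
        return feedback_gen(reason)            # --do_feedback_gen
    return None  # safe to let the critical action execute
```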
Please use the following citation:
@article{fang2024inferact,
title={InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback},
author={Fang, Haishuo and Zhu, Xiaodan and Gurevych, Iryna},
journal={arXiv preprint arXiv:2407.11843},
year={2024}
}
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.