This is the repo for the paper "InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection". In this work, we develop a GUI agent based on a multimodal large language model (MLLM) that automates tasks on computing devices. The agent is trained with a two-stage supervised fine-tuning approach: the first stage builds fundamental GUI understanding skills, and the second stage instills advanced reasoning capabilities by integrating hierarchical reasoning and expectation-reflection reasoning, giving the agent native reasoning abilities in GUI interactions.
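To make the two reasoning components concrete, here is a minimal, hypothetical sketch of one agent step combining hierarchical reasoning (a strategic subgoal plus a tactical action) with expectation-reflection (checking the outcome against a predicted expectation). All names (`agent_step`, `policy`, the toy counter environment) are illustrative assumptions, not the repo's actual API.

```python
# Hedged sketch of the hierarchical + expectation-reflection loop described
# above. Every identifier here is hypothetical, not the released code.

def agent_step(observe, act, policy):
    obs = observe()
    # Hierarchical reasoning: the policy yields a high-level subgoal,
    # a concrete action, and an expectation of the action's outcome.
    subgoal, action, expectation = policy(obs)
    act(action)
    new_obs = observe()
    # Expectation-reflection: compare the new observation with the
    # expectation to judge whether the step succeeded.
    succeeded = (new_obs == expectation)
    return subgoal, action, succeeded

# Toy environment standing in for a GUI: state is just a counter.
state = {"count": 0}

def observe():
    return state["count"]

def act(action):
    if action == "increment":
        state["count"] += 1

def policy(obs):
    # Subgoal, action, and the expected post-action observation.
    return "reach target", "increment", obs + 1

subgoal, action, ok = agent_step(observe, act, policy)
print(ok)  # True: the observed outcome matched the expectation
```

In a real agent, `observe` would return a screenshot, `act` would issue a GUI action, and the reflection result would feed back into the next step's reasoning.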
- 🔥[2025/1/9] Our paper "InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection" was released.
- 🔥[2024/12/12] Our paper "OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use" was released.
- [2024/4/2] Our paper "InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks" was accepted by ICML 2024.
Data and code are coming soon.