
Reallm-Labs/InfiGUIAgent


InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

This repository accompanies the paper "InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection". InfiGUIAgent is a GUI agent built on a multimodal large language model (MLLM) that automates tasks on computing devices. It is trained with a two-stage supervised fine-tuning approach: the first stage builds fundamental GUI understanding skills, and the second stage develops advanced reasoning capabilities by integrating hierarchical reasoning and expectation-reflection reasoning, which gives the agent native reasoning abilities in GUI interactions.

🔥 News

InfiGUIAgent

Data and code are coming soon.
