Graphical User Interface Automation
Graphical user interface (GUI) automation aims to let computers carry out tasks inside software applications autonomously, boosting human productivity. Current research relies heavily on large language models (LLMs) and multimodal models, often combined with reinforcement learning and planning methods such as dynamic planning, to interpret user instructions and execute multi-step action sequences in changing GUI environments. The field matters because it promises to automate tedious, repetitive tasks across diverse software, from simple mobile apps to professional design tools. It still struggles, however, with complex, visually centric tasks and with achieving high accuracy across diverse settings.
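The instruction-to-action process described above can be pictured as an observe, decide, act loop. The following is a minimal sketch under stated assumptions, not any particular paper's method: `ToyGUI`, `stub_policy`, and `run_agent` are hypothetical names invented for illustration, and the rule-based policy merely stands in for the LLM or multimodal model a real system would query. Because the policy is re-invoked on a fresh observation at every step, the loop also hints at how a plan can be revised as the interface state changes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGUI:
    """Hypothetical stand-in for a GUI: one text field and a Save button."""
    text: str = ""
    saved: bool = False
    log: list = field(default_factory=list)

    def observe(self) -> dict:
        # A real agent would receive a screenshot or accessibility tree here.
        return {"text": self.text, "saved": self.saved}

    def step(self, action: tuple) -> None:
        # Apply one (kind, argument) action and record it.
        kind, arg = action
        if kind == "type":
            self.text += arg
        elif kind == "click" and arg == "save":
            self.saved = True
        self.log.append(action)


def stub_policy(instruction: str, obs: dict):
    """Placeholder for a learned model; assumes instructions like 'write <text>'."""
    target = instruction.split(" ", 1)[1]
    if obs["text"] != target:
        # Type whatever part of the target is still missing.
        return ("type", target[len(obs["text"]):])
    if not obs["saved"]:
        return ("click", "save")
    return None  # nothing left to do: task complete


def run_agent(instruction: str, env: ToyGUI, policy, max_steps: int = 10) -> bool:
    """Query the policy on a fresh observation each step until it signals completion."""
    for _ in range(max_steps):
        action = policy(instruction, env.observe())
        if action is None:
            return True
        env.step(action)
    return False  # step budget exhausted before the task finished
```

Calling `run_agent("write hello", ToyGUI(), stub_policy)` types the text, clicks Save, and then stops. In a realistic setting the policy call would be the expensive and error-prone step, which is why a step budget such as `max_steps` is a common safeguard.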