Human Instruction
Research on human instruction following develops AI models that accurately and reliably carry out complex tasks specified through diverse instructions, including text, images, and audio. Current work emphasizes improving model alignment through techniques such as instruction tuning and response tuning, often using large language models (LLMs) and diffusion transformers, and explores new evaluation metrics for multi-modal, multi-turn interactions. The field matters for human-computer interaction because it enables more intuitive and effective collaboration between humans and AI systems in domains ranging from robotics and manufacturing to healthcare and education.
Papers
InstructPipe: Building Visual Programming Pipelines with Human Instructions
Zhongyi Zhou, Jing Jin, Vrushank Phadnis, Xiuxiu Yuan, Jun Jiang, Xun Qian, Jingtao Zhou, Yiyi Huang, Zheng Xu, Yinda Zhang, Kristen Wright, Jason Mayes, Mark Sherwood, Johnny Lee, Alex Olwal, David Kim, Ram Iyengar, Na Li, Ruofei Du
Focus on Your Instruction: Fine-grained and Multi-instruction Image Editing by Attention Modulation
Qin Guo, Tianwei Lin
Multi-3D-Models Registration-Based Augmented Reality (AR) Instructions for Assembly
Seda Tuzun Canadinc, Wei Yan
GaussianEditor: Editing 3D Gaussians Delicately with Text Instructions
Jiemin Fang, Junjie Wang, Xiaopeng Zhang, Lingxi Xie, Qi Tian
Efficient Pre-training for Localized Instruction Generation of Videos
Anil Batra, Davide Moltisanti, Laura Sevilla-Lara, Marcus Rohrbach, Frank Keller
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Boxin Wang, Wei Ping, Lawrence McAfee, Peng Xu, Bo Li, Mohammad Shoeybi, Bryan Catanzaro
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models
Yuchong Sun, Che Liu, Kun Zhou, Jinwen Huang, Ruihua Song, Wayne Xin Zhao, Fuzheng Zhang, Di Zhang, Kun Gai