Human Editing
"Human editing" refers to generating and modifying data across modalities (images, audio, text, 3D models) from natural language instructions or other forms of user input, and it is reshaping human-computer interaction. Current research relies heavily on diffusion models and large language models (LLMs), often integrated within multimodal frameworks, to give users precise and flexible control over the editing process while addressing challenges such as hallucination and ambiguity. The field is significant for its potential to improve accessibility in creative fields, make content creation more efficient, and advance our understanding of how humans interact with and interpret AI-generated content.
Papers
ChatGarment: Garment Estimation, Generation and Editing via Large Language Models
Siyuan Bian, Chenghao Xu, Yuliang Xiu, Artur Grigorev, Zhen Liu, Cewu Lu, Michael J. Black, Yao Feng
Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance
Nicolas Devatine, Louis Abraham (see the code sketch after the paper list)
EVLM: Self-Reflective Multimodal Reasoning for Cross-Dimensional Visual Editing
Umar Khalid, Hasan Iqbal, Azib Farooq, Nazanin Rahnavard, Jing Hua, Chen Chen
BrushEdit: All-In-One Image Inpainting and Editing
Yaowei Li, Yuxuan Bian, Xuan Ju, Zhaoyang Zhang, Ying Shan, Qiang Xu
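Of the papers above, the compression-based edit distance of Devatine and Abraham is concrete enough to sketch in code. The snippet below is a minimal illustration of the general idea using the standard normalized compression distance (NCD) with zlib as the compressor; it is not the paper's exact metric, and the function names and example strings are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed byte string (compression level 9)."""
    return len(zlib.compress(data, 9))


def normalized_compression_distance(x: str, y: str) -> float:
    """Normalized compression distance between two texts.

    NCD(x, y) = (C(x + y) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed length. Values near 0 indicate the
    two texts share most of their content; values near 1 indicate
    little shared content.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    draft = "The quarterly report shows steady growth across all regions."
    lightly_edited = "The quarterly report shows steady growth in all regions."
    rewritten = "Revenue climbed modestly this quarter, though results varied by market."
    # A light edit should score lower (less "effort") than a full rewrite.
    print(f"light edit: {normalized_compression_distance(draft, lightly_edited):.3f}")
    print(f"heavy edit: {normalized_compression_distance(draft, rewritten):.3f}")
```

Under this proxy, a lower score between the LLM draft and the human-edited version suggests less editing effort; the paper itself defines and validates the precise distance it uses.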