Human Editing

"Human editing" refers to generating and modifying data in various modalities (images, audio, text, 3D models) from natural language instructions or other forms of user input, and advances in it are changing human-computer interaction. Current research relies heavily on diffusion models and large language models (LLMs), often integrated within multimodal frameworks, to give users precise and flexible control over the editing process while addressing challenges such as hallucination and ambiguity. The field matters for its potential to improve accessibility in creative fields, make content creation more efficient, and advance our understanding of how humans interact with and interpret AI-generated content.

Papers