Instruction Inversion

Instruction inversion focuses on recovering complex or nuanced editing instructions, whether textual or visual, in forms that AI models can act on directly, particularly for image manipulation and large language model (LLM) control. Current research emphasizes methods that extract and optimize these instructions from example image pairs, as well as from automatically generated datasets of toxic instructions used for safety-oriented training. This work matters for the robustness and safety of AI systems: it enables more precise control over model behavior and helps mitigate adversarial attacks that rely on indirect instruction injection.
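
The image-pair setting can be illustrated with a minimal sketch: a learnable instruction embedding is optimized so that a frozen instruction-conditioned editor maps a "before" image onto its "after" counterpart. Everything below is hypothetical — `ToyEditor`, the embedding size, and the synthetic image pair are stand-ins; published methods typically optimize against a frozen diffusion-based editing model instead.

```python
import torch
import torch.nn as nn

class ToyEditor(nn.Module):
    """Hypothetical stand-in for a frozen instruction-following image editor."""
    def __init__(self, embed_dim=32):
        super().__init__()
        # Maps the instruction embedding to a per-channel colour shift (toy behaviour).
        self.proj = nn.Linear(embed_dim, 3)

    def forward(self, image, instruction_embedding):
        shift = self.proj(instruction_embedding).view(1, 3, 1, 1)
        return image + shift

torch.manual_seed(0)
editor = ToyEditor().eval()
for p in editor.parameters():      # the editor stays frozen; only the instruction is learned
    p.requires_grad_(False)

# Synthetic "before"/"after" example pair (placeholder data).
source = torch.rand(1, 3, 64, 64)
target = source + torch.tensor([0.2, -0.1, 0.05]).view(1, 3, 1, 1)

# The inverted instruction: a learnable embedding initialised at zero.
instruction = torch.zeros(32, requires_grad=True)
optimizer = torch.optim.Adam([instruction], lr=1e-2)

for step in range(500):
    optimizer.zero_grad()
    edited = editor(source, instruction)
    loss = nn.functional.mse_loss(edited, target)  # match the demonstrated edit
    loss.backward()
    optimizer.step()

print(f"final reconstruction loss: {loss.item():.6f}")
# The optimised `instruction` can then be reused to apply the same edit to new images.
```

The key design point is that gradients flow only into the instruction embedding, not the editor, so the recovered instruction remains compatible with the original model and transfers to unseen source images.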

Papers