Text-to-Image Generation Models

Text-to-image generation models aim to create realistic images from textual descriptions, with research focused on image quality, accuracy, and user control. Current work emphasizes making outputs more faithful to the input text, reducing image hallucination and bias, and improving controllability through techniques such as sketch guidance and layout conditioning, often building on diffusion models and large language models. These advances matter for accessible communication, creative content generation, and other applications that synthesize images from text, and they also raise concerns about potential misuse and the need for robust evaluation metrics.

Papers