Text to Image Model
Text-to-image models generate images from textual descriptions, aiming for high fidelity, creativity, and safety. Current research focuses on improving image-text alignment, mitigating bias and safety problems (such as generating harmful content or vulnerability to jailbreaks), and improving generalizability and efficiency through techniques such as diffusion models, fine-tuning strategies, and vector quantization. These advances matter for fields including art, design, and medical imaging, but they also raise ethical concerns about bias, safety, and potential misuse, which call for continued investigation and robust mitigation strategies.
Papers
PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models
Lingzhi Yuan, Xinfeng Li, Chejian Xu, Guanhong Tao, Xiaojun Jia, Yihao Huang, Wei Dong, Yang Liu, XiaoFeng Wang, Bo Li
Textualize Visual Prompt for Image Editing via Diffusion Bridge
Pengcheng Xu, Qingnan Fan, Fei Kou, Shuai Qin, Hong Gu, Ruoyu Zhao, Charles Ling, Boyu Wang
ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction
Zhongjie Duan, Qianyi Zhao, Cen Chen, Daoyuan Chen, Wenmeng Zhou, Yaliang Li, Yingda Chen
A Framework for Critical Evaluation of Text-to-Image Models: Integrating Art Historical Analysis, Artistic Exploration, and Critical Prompt Engineering
Amalia Foka
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
Dongting Hu, Jierun Chen, Xijie Huang, Huseyin Coskun, Arpit Sahni, Aarush Gupta, Anujraaj Goyal, Dishani Lahiri, Rajesh Singh, Yerlan Idelbayev, Junli Cao, Yanyu Li, Kwang-Ting Cheng, S.-H. Gary Chan, Mingming Gong, Sergey Tulyakov, Anil Kag, Yanwu Xu, Jian Ren
Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG
Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag
StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements
Mingkun Lei, Xue Song, Beier Zhu, Hao Wang, Chi Zhang
Generate Any Scene: Evaluating and Improving Text-to-Vision Generation with Scene Graph Programming
Ziqi Gao, Weikai Huang, Jieyu Zhang, Aniruddha Kembhavi, Ranjay Krishna
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang, Wanggui He, Quanyu Long, Yandi Wang, Haoyuan Li, Zhelun Yu, Fangxun Shu, Long Chen, Hao Jiang, Leilei Gan
BodyMetric: Evaluating the Realism of Human Bodies in Text-to-Image Generation
Nefeli Andreou, Varsha Vivek, Ying Wang, Alex Vorobiov, Tiffany Deng, Raja Bala, Larry Davis, Betty Mohler Tesch