Text to Image
Text-to-image synthesis aims to generate realistic images from textual descriptions, building on advances in deep learning, particularly diffusion models and large language models. Current research focuses on improving image quality, addressing bias and safety concerns (e.g., the generation of inappropriate content), and giving users finer control over generated images through techniques such as prompt engineering and embedding optimization. The field has potential applications in creative design, 3D modeling, and content creation, and it raises ethical questions about bias and responsible AI development.
Papers
TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
Maitreya Patel, Abhiram Kusumba, Sheng Cheng, Changhoon Kim, Tejas Gokhale, Chitta Baral, Yezhou Yang
Training-free Regional Prompting for Diffusion Transformers
Anthony Chen, Jianjin Xu, Wenzhao Zheng, Gaole Dai, Yida Wang, Renrui Zhang, Haofan Wang, Shanghang Zhang
HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models
Shengkai Zhang, Nianhong Jiao, Tian Li, Chaojie Yang, Chenhui Xue, Boya Niu, Jun Gao
One Prompt to Verify Your Models: Black-Box Text-to-Image Models Verification via Non-Transferable Adversarial Attacks
Ji Guo, Wenbo Jiang, Rui Zhang, Guoming Lu, Hongwei Li, Weiren Wu
Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework
Vladimir Arkhipkin, Viacheslav Vasilev, Andrei Filatov, Igor Pavlov, Julia Agafonova, Nikolai Gerasimenko, Anna Averchenkova, Evelina Mironova, Anton Bukashkin, Konstantin Kulikov, Andrey Kuznetsov, Denis Dimitrov
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models
Weijian Luo, Colin Zhang, Debing Zhang, Zhengyang Geng
Copyright-Aware Incentive Scheme for Generative Art Models Using Hierarchical Reinforcement Learning
Zhuan Shi, Yifei Song, Xiaoli Tang, Lingjuan Lyu, Boi Faltings
Diff-CXR: Report-to-CXR generation through a disease-knowledge enhanced diffusion model
Peng Huang, Bowen Guo, Shuyu Liang, Junhu Fu, Yuanyuan Wang, Yi Guo
Multi-path Exploration and Feedback Adjustment for Text-to-Image Person Retrieval
Bin Kang, Bin Chen, Junjie Wang, Yong Xu