Text-to-Image Models
Text-to-image models generate images from textual descriptions, aiming for high fidelity, creativity, and safety. Current research focuses on improving image-text alignment, mitigating bias and safety issues (such as generating harmful content or being vulnerable to jailbreaks), and enhancing generalizability and efficiency through techniques such as diffusion models, fine-tuning strategies, and vector quantization. These advances have significant implications for fields including art, design, and medical imaging, but they also raise ethical concerns about bias, safety, and potential misuse, requiring ongoing investigation and the development of robust mitigation strategies.
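As a concrete illustration of the text-to-image workflow these papers study, the minimal sketch below generates an image from a prompt with a pretrained latent diffusion model via Hugging Face's diffusers library. The specific model ID and sampling hyperparameters are illustrative assumptions, not settings taken from any of the papers listed here.

```python
# Minimal text-to-image sketch using Hugging Face's diffusers library.
# Assumes diffusers, transformers, and torch are installed; the model ID
# below is one illustrative public checkpoint, chosen for the example.
import torch
from diffusers import StableDiffusionPipeline

# Load a pretrained latent diffusion pipeline (text encoder + U-Net + VAE).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # move to GPU; use "cpu" (with float32) if no GPU

# Generate one image from a textual description. guidance_scale trades off
# prompt adherence (image-text alignment) against sample diversity.
prompt = "a watercolor painting of a lighthouse at sunrise"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("lighthouse.png")
```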
Papers
StyleDrop: Text-to-Image Generation in Any Style
Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan
StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners
Yonglong Tian, Lijie Fan, Phillip Isola, Huiwen Chang, Dilip Krishnan
ReFACT: Updating Text-to-Image Models by Editing the Text Encoder
Dana Arad, Hadas Orgad, Yonatan Belinkov
High-Fidelity Image Compression with Score-based Generative Models
Emiel Hoogeboom, Eirikur Agustsson, Fabian Mentzer, Luca Versari, George Toderici, Lucas Theis
Stereotypes and Smut: The (Mis)representation of Non-cisgender Identities by Text-to-Image Models
Eddie L. Ungless, Björn Ross, Anne Lauscher
Data Redaction from Conditional Generative Models
Zhifeng Kong, Kamalika Chaudhuri
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Yujie Lu, Xianjun Yang, Xiujun Li, Xin Eric Wang, William Yang Wang
Inspecting the Geographical Representativeness of Images from Text-to-Image Models
Abhipsa Basu, R. Venkatesh Babu, Danish Pruthi