Image Generation
Image generation research focuses on creating realistic and diverse images from various inputs, such as text, sketches, or other images, aiming for greater control and efficiency. Current efforts center on refining diffusion and autoregressive models, exploring techniques like dynamic computation, disentangled feature representation, and multimodal integration to improve image quality, controllability, and computational efficiency. These advancements have significant implications for accessible communication, creative content production, and various computer vision tasks, offering powerful tools for both scientific investigation and practical applications. Ongoing work addresses challenges like handling multiple conditions, improving evaluation metrics, and mitigating biases and limitations in existing models.
Papers
LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation
Donald Shenaj, Ondrej Bohdal, Mete Ozay, Pietro Zanuttigh, Umberto Michieli
The Silent Prompt: Initial Noise as Implicit Guidance for Goal-Driven Image Generation
Ruoyu Wang, Huayang Huang, Ye Zhu, Olga Russakovsky, Yu Wu
LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors
Yusuf Dalva, Yijun Li, Qing Liu, Nanxuan Zhao, Jianming Zhang, Zhe Lin, Pinar Yanardag
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Jian Han, Jinlai Liu, Yi Jiang, Bin Yan, Yuqi Zhang, Zehuan Yuan, Bingyue Peng, Xiaobing Liu
Liquid: Language Models are Scalable Multi-modal Generators
Junfeng Wu, Yi Jiang, Chuofan Ma, Yuliang Liu, Hengshuang Zhao, Zehuan Yuan, Song Bai, Xiang Bai
ZipAR: Accelerating Auto-regressive Image Generation through Spatial Locality
Yefei He, Feng Chen, Yuanyu He, Shaoxuan He, Hong Zhou, Kaipeng Zhang, Bohan Zhuang
CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
Hui Zhang, Dexiang Hong, Tingwei Gao, Yitong Wang, Jie Shao, Xinglong Wu, Zuxuan Wu, Yu-Gang Jiang
PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation
Qihan Huang, Long Chan, Jinlong Liu, Wanggui He, Hao Jiang, Mingli Song, Jie Song
Partially Conditioned Patch Parallelism for Accelerated Diffusion Model Inference
XiuYu Zhang, Zening Luo, Michelle E. Lu
Panoptic Diffusion Models: co-generation of images and segmentation maps
Yinghan Long, Kaushik Roy