Image Generation
Image generation research focuses on creating realistic, diverse images from inputs such as text, sketches, or other images, with ever greater control and efficiency. Current efforts center on refining diffusion and autoregressive models, exploring techniques such as dynamic computation, disentangled feature representations, and multimodal integration to improve image quality, controllability, and computational cost. These advances matter for accessible communication, creative content production, and many downstream computer vision tasks, serving both scientific investigation and practical applications. Ongoing work addresses open challenges: conditioning on multiple inputs simultaneously, improving evaluation metrics, and mitigating the biases and limitations of existing models.
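To make the diffusion side of this concrete: diffusion models generate an image by starting from pure noise and iteratively denoising it. The sketch below is a minimal, illustrative DDPM-style reverse process in pure Python; the `eps_model` argument is a hypothetical stand-in for a trained noise-prediction network (here a toy that predicts zero noise), not any specific model from the papers listed.

```python
import math
import random

def ddpm_sample(eps_model, n_pixels, betas, seed=0):
    """Generate a flat "image" by iteratively denoising Gaussian noise
    (a simplified DDPM reverse process; eps_model predicts the noise)."""
    rnd = random.Random(seed)
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:               # cumulative product of the alphas
        running *= a
        alpha_bars.append(running)
    x = [rnd.gauss(0.0, 1.0) for _ in range(n_pixels)]  # start from pure noise x_T
    for t in range(len(betas) - 1, -1, -1):
        eps = eps_model(x, t)      # predicted noise for each pixel at step t
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        # posterior mean of x_{t-1} given x_t and the noise estimate
        x = [(xi - coef * ei) / math.sqrt(alphas[t]) for xi, ei in zip(x, eps)]
        if t > 0:                  # inject fresh noise at every step but the last
            sigma = math.sqrt(betas[t])
            x = [xi + sigma * rnd.gauss(0.0, 1.0) for xi in x]
    return x

# Toy "model" that predicts zero noise everywhere, so sampling just rescales.
betas = [1e-4 + i * (0.02 - 1e-4) / 49 for i in range(50)]
sample = ddpm_sample(lambda x, t: [0.0] * len(x), 16, betas)
```

In a real system, `eps_model` is a large neural network conditioned on text or other inputs, and the loop runs over images rather than a flat list of pixels; autoregressive approaches (as in several papers below) instead generate images token by token or scale by scale.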
Papers
M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation
Sucheng Ren, Yaodong Yu, Nataniel Ruiz, Feng Wang, Alan Yuille, Cihang Xie
CART: Compositional Auto-Regressive Transformer for Image Generation
Siddharth Roheda
Content-Aware Preserving Image Generation
Giang H. Le, Anh Q. Nguyen, Byeongkeun Kang, Yeejin Lee
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
NVIDIA: Yuval Atzmon, Maciej Bala, Yogesh Balaji, Tiffany Cai, Yin Cui, Jiaojiao Fan, Yunhao Ge, Siddharth Gururani, Jacob Huffman, Ronald Isaac, Pooya Jannaty, Tero Karras, Grace Lam, J. P. Lewis, Aaron Licata, Yen-Chen Lin, Ming-Yu Liu, Qianli Ma, Arun Mallya, Ashlee Martino-Tarr, Doug Mendez, Seungjun Nah, Chris Pruett, Fitsum Reda, Jiaming Song, Ting-Chun Wang, Fangyin Wei, Xiaohui Zeng, Yu Zeng, Qinsheng Zhang
Layout Control and Semantic Guidance with Attention Loss Backward for T2I Diffusion Model
Guandong Li
DomainGallery: Few-shot Domain-driven Image Generation by Attribute-centric Finetuning
Yuxuan Duan, Yan Hong, Bo Zhang, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang, Li Niu, Liqing Zhang
Image Understanding Makes for A Good Tokenizer for Image Generation
Luting Wang, Yang Zhao, Zijian Zhang, Jiashi Feng, Si Liu, Bingyi Kang
GazeGen: Gaze-Driven User Interaction for Visual Content Generation
He-Yen Hsieh, Ziyun Li, Sai Qian Zhang, Wei-Te Mark Ting, Kao-Den Chang, Barbara De Salvo, Chiao Liu, H. T. Kung
Enhancing Weakly Supervised Semantic Segmentation for Fibrosis via Controllable Image Generation
Zhiling Yue, Yingying Fang, Liutao Yang, Nikhil Baid, Simon Walsh, Guang Yang
DiT4Edit: Diffusion Transformer for Image Editing
Kunyu Feng, Yue Ma, Bingyuan Wang, Chenyang Qi, Haozhe Chen, Qifeng Chen, Zeyu Wang
GenXD: Generating Any 3D and 4D Scenes
Yuyang Zhao, Chung-Ching Lin, Kevin Lin, Zhiwen Yan, Linjie Li, Zhengyuan Yang, Jianfeng Wang, Gim Hee Lee, Lijuan Wang
Hunyuan3D-1.0: A Unified Framework for Text-to-3D and Image-to-3D Generation
Xianghui Yang, Huiwen Shi, Bowen Zhang, Fan Yang, Jiacheng Wang, Hongxu Zhao, Xinhai Liu, Xinzhou Wang, Qingxiang Lin, Jiaao Yu, Lifu Wang, Zhuo Chen, Sicong Liu, Yuhong Liu, Yong Yang, Di Wang, Jie Jiang, Chunchao Guo