Autoregressive Image Generation
Autoregressive image generation creates images by sequentially predicting pixels or latent tokens, loosely mimicking the way a person might draw or paint. Current research focuses on improving the speed and controllability of these models: it explores architectures such as transformers and state-space models for efficient long-sequence modeling, and incorporates techniques such as wavelet transforms and various quantization schemes to improve image quality and reduce computational cost. These advances matter because they offer a powerful alternative to diffusion models, potentially enabling faster and more controllable image generation for applications ranging from artistic creation to medical imaging.
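To make the core mechanism concrete, the sketch below shows the next-token loop shared by this family of models: an image is represented as a grid of discrete tokens (e.g., indices from a VQ tokenizer), and a causally masked transformer predicts each token conditioned on the ones already generated. This is a minimal illustrative sketch, not the method of either paper listed here; the codebook size, grid size, and tiny untrained model are placeholder assumptions.

```python
# Minimal sketch of next-token autoregressive image generation over a grid of
# discrete image tokens. All sizes and the tiny transformer are illustrative
# placeholders (assumptions), not any specific paper's architecture.
import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed codebook size of the image tokenizer
GRID = 16           # assumed 16x16 latent grid -> sequence of 256 tokens
SEQ_LEN = GRID * GRID

class TinyARImageModel(nn.Module):
    def __init__(self, d_model=256, n_heads=4, n_layers=4):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB_SIZE + 1, d_model)  # +1 for a BOS token
        self.pos_emb = nn.Embedding(SEQ_LEN + 1, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, t) integer token ids, BOS-prefixed
        t = tokens.size(1)
        pos = torch.arange(t, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask enforces the autoregressive factorization p(x_i | x_<i).
        mask = nn.Transformer.generate_square_subsequent_mask(t).to(tokens.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)  # (batch, t, VOCAB_SIZE) next-token logits

@torch.no_grad()
def sample_image_tokens(model, temperature=1.0):
    """Sample one token per grid position, left to right, top to bottom."""
    seq = torch.full((1, 1), VOCAB_SIZE)  # start from the BOS id
    for _ in range(SEQ_LEN):
        logits = model(seq)[:, -1, :] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_tok = torch.multinomial(probs, num_samples=1)
        seq = torch.cat([seq, next_tok], dim=1)
    # Drop BOS; in a full pipeline a VQ decoder would map this grid back to pixels.
    return seq[:, 1:].view(1, GRID, GRID)

model = TinyARImageModel()
tokens = sample_image_tokens(model)
print(tokens.shape)  # torch.Size([1, 16, 16])
```

The papers below vary what is tokenized and in what order it is predicted (e.g., 2-D rather than raster-order prediction, or folded token sequences), but the sequential factorization above is the common starting point.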
Papers
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Liang Chen, Sinan Tan, Zefan Cai, Weichu Xie, Haozhe Zhao, Yichi Zhang, Junyang Lin, Jinze Bai, Tianyu Liu, Baobao Chang
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li, Hao Chen, Kai Qiu, Jason Kuen, Jiuxiang Gu, Bhiksha Raj, Zhe Lin