PixArt-α

PixArt-α is a family of transformer-based diffusion models designed for efficient, high-quality text-to-image synthesis. Research in this line focuses on improving image resolution, generation speed, and controllability through techniques such as weak-to-strong training, latent consistency models, and new attention mechanisms within the diffusion transformer architecture. These advances reduce the training cost and compute needed to generate photorealistic images, which makes high-quality text-to-image generation more accessible to researchers and developers and lowers its environmental impact. The resulting models are positioned as an alternative to existing state-of-the-art methods.

Papers

March 7, 2024

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
Junsong Chen, Chongjian Ge, Enze Xie, Yue Wu, Lewei Yao, Xiaozhe Ren, Zhongdao Wang, Ping Luo, Huchuan Lu, Zhenguo Li
Text to Image Diffusion Model, Text to Image Generation, High Resolution Image, Diffusion Transformer, Weak to Strong, High Resolution Image Generation, PixArt-α

January 10, 2024

PixArt-δ: Fast and Controllable Image Generation with Latent Consistency Models
Junsong Chen, Yue Wu, Simian Luo, Enze Xie, Sayak Paul, Ping Luo, Hang Zhao, Zhenguo Li
Image Generation, Text to Image Diffusion Model, Text to Image Synthesis, Latent Consistency Model, PixArt-α

September 30, 2023

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li
Text to Image, Diffusion Transformer, Image Text Alignment, Faster Training, T2I Diffusion Model, PixArt-α
