Image Decoder
Image decoders are neural network components that reconstruct images or visual features from compressed or encoded representations. Current research focuses on improving decoder architectures, often built on transformers, diffusion models, or autoencoders, to raise reconstruction quality, handle multiple modalities (e.g., vision and language), and cope with limited data or out-of-distribution samples. These advances matter for applications such as brain-computer interfaces, scene text recognition, and image compression, where more accurate and efficient decoding improves both data processing and generation.
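To make "reconstructing from an encoded representation" concrete, here is a minimal sketch of the decoding step. It is a toy, non-learned stand-in and not the method of any paper listed here: the "decoder" is a fixed codebook lookup that turns a grid of discrete token indices back into pixel patches. The function names, the codebook values, and the 2x2 patch size are all assumptions made for illustration; a real decoder would learn this mapping with a neural network.

```python
# Toy illustration only: a fixed codebook lookup standing in for a learned decoder.
# All names and values below are hypothetical.

def decode_tokens(token_grid, codebook, patch_size):
    """Rebuild an image (a list of pixel rows) from a grid of codebook indices."""
    image = []
    for token_row in token_grid:
        # Each row of tokens expands into `patch_size` rows of pixels.
        for r in range(patch_size):
            pixel_row = []
            for token in token_row:
                pixel_row.extend(codebook[token][r])
            image.append(pixel_row)
    return image


def mean_squared_error(a, b):
    """Average squared pixel difference between two equally sized images."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


# Encoded representation: a 2x2 grid of indices into a two-entry codebook.
codebook = [
    [[0, 0], [0, 0]],  # entry 0: dark 2x2 patch
    [[1, 1], [1, 1]],  # entry 1: bright 2x2 patch
]
token_grid = [[0, 1], [1, 0]]

# Decoding expands 4 tokens into a 4x4 image (16 pixels).
reconstruction = decode_tokens(token_grid, codebook, patch_size=2)

# A hypothetical original image that differs from the reconstruction in one pixel.
original = [
    [0, 0, 1, 1],
    [0, 0, 1, 0.5],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
error = mean_squared_error(original, reconstruction)
```

The reconstruction error computed at the end is the kind of quantity a learned decoder would be trained to reduce. In this toy the codebook is fixed, so the error only measures how much detail the compact token grid discards.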
Papers
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu, Baining Guo
Decoupling Visual-Semantic Feature Learning for Robust Scene Text Recognition
Changxu Cheng, Bohan Li, Qi Zheng, Yongpan Wang, Wenyu Liu