Attention-Based Architectures
Attention-based architectures, particularly transformer networks, are transforming a wide range of fields by enabling models to selectively focus on the most relevant information within complex data. Current research emphasizes improving efficiency, mitigating overfitting, and enhancing the interpretability of these models, exploring variations such as hybrid CNN-transformer designs and novel attention mechanisms like focal and full-range attention. This focus is driven by the need for more efficient, robust, and explainable AI systems across diverse applications, including image processing, natural language processing, and time series forecasting.
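The selective-focus mechanism at the core of these architectures can be illustrated with a minimal sketch of standard scaled dot-product attention (the function name and toy shapes below are illustrative, not taken from any of the listed papers):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V for 2-D query/key/value matrices."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # subtract max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row is a distribution over keys
    return weights @ V                              # weighted average of value vectors

# Toy example: 3 tokens with embedding dimension 4
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one attended output vector per token
```

Because the score matrix is quadratic in sequence length, much of the efficiency work surveyed here (e.g. linear-time attention variants) aims to avoid materializing it.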
Papers
Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
Ziyang Wu, Tianjiao Ding, Yifu Lu, Druv Pai, Jingyuan Zhang, Weida Wang, Yaodong Yu, Yi Ma, Benjamin D. Haeffele
MRANet: A Modified Residual Attention Networks for Lung and Colon Cancer Classification
Diponkor Bala, S M Rakib Ul Karim, Rownak Ara Rasul