Transformers
Transformer models are being investigated for a wide range of sequence processing tasks, moving beyond natural language processing to time series forecasting, image recognition, and scientific computing applications such as solving partial differential equations. Current research focuses on improving efficiency (e.g., through mixed-precision quantization and optimized architectures), enhancing generalization (particularly to sequences longer than those seen during training), and understanding the mechanisms behind in-context learning. These advances improve the accuracy and efficiency of downstream applications while deepening the theoretical understanding of the models themselves.
Papers
Are Transformers Effective for Time Series Forecasting?
Ailing Zeng, Muxi Chen, Lei Zhang, Qiang Xu
Towards Learning Universal Hyperparameter Optimizers with Transformers
Yutian Chen, Xingyou Song, Chansoo Lee, Zi Wang, Qiuyi Zhang, David Dohan, Kazuya Kawakami, Greg Kochanski, Arnaud Doucet, Marc'Aurelio Ranzato, Sagi Perel, Nando de Freitas
Multimodal Indoor Localisation for Measuring Mobility in Parkinson's Disease using Transformers
Ferdian Jovan, Ryan McConville, Catherine Morgan, Emma Tonkin, Alan Whone, Ian Craddock
DTW at Qur'an QA 2022: Utilising Transfer Learning with Transformers for Question Answering in a Low-resource Domain
Damith Premasiri, Tharindu Ranasinghe, Wajdi Zaghouani, Ruslan Mitkov