Large-Scale Transformers
Large-scale Transformers are powerful deep learning models that achieve state-of-the-art results across diverse tasks, but their immense computational demands drive research into more efficient training and deployment. Current efforts focus on optimizing architectures (e.g., multi-path designs and sparse matrix multiplication) and on novel training algorithms (e.g., direct feedback alignment and federated learning) that reduce resource consumption while maintaining accuracy. These advances are crucial for broadening access to the capabilities of large Transformers and for enabling their use in resource-constrained environments and in fields such as EEG analysis and question answering.
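To make the training-algorithm side concrete, the sketch below illustrates direct feedback alignment (DFA) on a tiny two-layer network: the output error is routed to the hidden layer through a fixed random feedback matrix instead of being backpropagated through the transposed forward weights. The dimensions, learning rate, and toy regression target are illustrative assumptions, not the setup of any particular paper covered here.

```python
import numpy as np

# Minimal sketch of direct feedback alignment (DFA) on a 2-layer network.
# All sizes and hyperparameters are illustrative assumptions.

rng = np.random.default_rng(0)
d_in, d_hid, d_out = 8, 16, 4
lr = 0.05

W1 = rng.normal(0.0, 0.1, (d_hid, d_in))
W2 = rng.normal(0.0, 0.1, (d_out, d_hid))
B1 = rng.normal(0.0, 0.1, (d_hid, d_out))  # fixed random feedback matrix, never trained

x = rng.normal(size=d_in)   # toy input
y = rng.normal(size=d_out)  # toy regression target

for step in range(200):
    # Forward pass
    h_pre = W1 @ x
    h = np.maximum(h_pre, 0.0)  # ReLU
    y_hat = W2 @ h

    # Output error for a squared-error loss
    e = y_hat - y

    # DFA: project the output error straight to the hidden layer through the
    # fixed random matrix B1 instead of backpropagating through W2.T.
    delta_h = (B1 @ e) * (h_pre > 0.0)

    # Local, layer-wise weight updates
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(delta_h, x)

print(f"final squared error: {0.5 * float(e @ e):.4f}")
```

Because each layer's update depends only on its own activations and a fixed random projection of the output error, the backward pass needs no weight transport and can be parallelized more aggressively than standard backpropagation, which is what makes this family of methods attractive for large-scale training.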