Chip Memory
Chip memory optimization is central to accelerating deep learning inference and training, particularly for large models such as LLMs and CNNs. Its goal is to minimize off-chip memory accesses and make the most of limited on-chip storage. Current research focuses on optimized scheduling algorithms, configurable memory hierarchies, and data compression methods (e.g., block floating point quantization, arithmetic coding) that reduce memory footprint and improve energy efficiency. These advances enable faster, more power-efficient AI applications in settings ranging from mobile devices to high-performance scientific computing.
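To make one of the compression methods named above concrete, here is a minimal sketch of block floating point quantization in plain Python. It is illustrative only and not taken from any of the listed papers: the function names `bfp_quantize` and `bfp_dequantize`, the 8-bit mantissa width, and the rounding and clipping choices are all assumptions. The idea it shows is that every value in a block stores only a small signed integer mantissa, while a single exponent is shared by the whole block.

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to one shared exponent plus integer mantissas.

    Hypothetical helper for illustration; mantissa_bits includes the sign bit.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**exponent with 0.5 <= m < 1,
    # so every value in the block has magnitude below 2**exponent.
    _, exponent = math.frexp(max_abs)
    scale = 2.0 ** (exponent - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round to the nearest step, then clip to the signed mantissa range.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return exponent, mantissas

def bfp_dequantize(exponent, mantissas, mantissa_bits=8):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (exponent - (mantissa_bits - 1))
    return [m * scale for m in mantissas]

if __name__ == "__main__":
    exponent, mantissas = bfp_quantize([0.5, -0.25, 0.125, 0.0])
    print(exponent, mantissas)  # 0 [64, -32, 16, 0]
    print(bfp_dequantize(exponent, mantissas))  # [0.5, -0.25, 0.125, 0.0]
```

Under these assumed sizes, a block of 16 values needs 16 × 8 + 8 = 136 bits (mantissas plus one 8-bit shared exponent), against 16 × 32 = 512 bits in 32-bit floating point. The cost is precision: values much smaller than the largest one in the block lose resolution.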
Papers
December 16, 2024
July 21, 2024
June 16, 2024
April 24, 2024
March 29, 2024
March 27, 2024
March 13, 2024
December 11, 2023
November 30, 2023
November 9, 2023
August 10, 2023
May 4, 2023
November 30, 2022
May 9, 2022
April 15, 2022