Hardware Error
Hardware errors threaten the reliability of increasingly complex computing systems, particularly those that run deep learning models or rely on high-performance computing architectures. Current research focuses on mitigation strategies, including reinforcement learning algorithms for adaptive error correction and hardware designs built for fault resilience, such as specialized accelerators for spiking neural networks. These efforts are crucial for the dependable operation of AI systems in safety-critical applications and for the efficient use of large-scale computing infrastructure.