Paper ID: 2204.01942

Fault-Tolerant Deep Learning: A Hierarchical Perspective

Cheng Liu, Zhen Gao, Siting Liu, Xuefei Ning, Huawei Li, Xiaowei Li

With the rapid advancements of deep learning in the past decade, it can be foreseen that deep learning will be continuously deployed in more and more safety-critical applications such as autonomous driving and robotics. In this context, reliability turns out to be critical to the deployment of deep learning in these applications and gradually becomes a first-class citizen among the major design metrics like performance and energy efficiency. Nevertheless, the back-box deep learning models combined with the diverse underlying hardware faults make resilient deep learning extremely challenging. In this special session, we conduct a comprehensive survey of fault-tolerant deep learning design approaches with a hierarchical perspective and investigate these approaches from model layer, architecture layer, circuit layer, and cross layer respectively.

Submitted: Apr 5, 2022