Deep Learning Workload

Deep learning workloads encompass the computational demands of training and deploying large-scale neural networks, with an emphasis on performance, resource utilization, and energy efficiency. Current research focuses on efficient resource allocation and scheduling across heterogeneous hardware (CPUs, GPUs, NPUs, FPGAs), using techniques such as model parallelism, dataflow-aware scheduling, and careful memory management to handle increasingly complex models such as Vision Transformers and large language models. These advances improve both speed and cost-effectiveness, enabling wider adoption of deep learning in settings ranging from edge AI devices to massive cloud-based training clusters.
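To make the model-parallelism idea mentioned above concrete, here is a minimal sketch that partitions a model's layers across devices and chains the forward pass from one partition to the next. All names (`partition_layers`, `forward`) and the toy lambda "layers" are illustrative assumptions, not any specific framework's API; real systems (e.g. pipeline parallelism in PyTorch or DeepSpeed) additionally overlap computation and communication.

```python
# Minimal sketch of layer-wise model parallelism. Plain Python lists
# stand in for devices and layers; this only illustrates the
# partitioning and activation hand-off, not real device placement.

def partition_layers(layers, num_devices):
    """Assign consecutive layers to devices as evenly as possible."""
    assignment = {d: [] for d in range(num_devices)}
    per_device = -(-len(layers) // num_devices)  # ceiling division
    for i, layer in enumerate(layers):
        assignment[i // per_device].append(layer)
    return assignment

def forward(assignment, x):
    """Run the forward pass device by device, passing activations along."""
    for device in sorted(assignment):
        for layer in assignment[device]:
            x = layer(x)  # in a real system, x would move between devices here
    return x

if __name__ == "__main__":
    # Four toy "layers", each adding a constant to its input.
    layers = [lambda v, k=k: v + k for k in range(4)]
    plan = partition_layers(layers, num_devices=2)
    print(forward(plan, 0))  # applies all four layers in order
```

In practice the interesting scheduling questions are exactly the ones this sketch omits: balancing partitions by compute cost rather than layer count, and hiding the inter-device activation transfers behind computation.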

Papers