Multi-DNN Scheduling

Multi-DNN scheduling addresses the efficient execution of multiple deep neural networks (DNNs) on shared hardware platforms, with the goals of maximizing resource utilization, minimizing latency, and making job completion times more predictable. Current research emphasizes techniques such as asynchronous processing in spiking neural networks (SNNs) and memory-management strategies like block swapping to overcome memory limitations on edge devices. These advances are key to deploying increasingly complex DNNs, including large language models, both on resource-constrained devices and in large-scale datacenters, improving the efficiency and scalability of AI applications.
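
To make the scheduling idea concrete, the sketch below combines two of the concerns mentioned above: deadline-driven ordering of inference jobs and a fixed device-memory budget that forces model weights to be swapped in and out. It is a minimal illustration under stated assumptions, not an implementation from any of the listed papers; the earliest-deadline-first policy, the LRU eviction rule, and all names (MemoryAwareEDFScheduler, Job, the example model sizes) are hypothetical choices made for illustration.

```python
"""Minimal sketch of memory-aware multi-DNN scheduling (illustrative only).

Assumptions (not from the source papers): a single accelerator with a fixed
memory budget, known per-model weight sizes, earliest-deadline-first (EDF)
job ordering, and LRU eviction. "Swapping" is simulated by tracking which
models are currently resident in device memory.
"""

import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    deadline_ms: float                       # EDF priority key
    model: str = field(compare=False)        # which DNN this job runs
    runtime_ms: float = field(compare=False)


class MemoryAwareEDFScheduler:
    def __init__(self, mem_budget_mb: float, model_sizes_mb: dict[str, float]):
        self.mem_budget_mb = mem_budget_mb
        self.model_sizes_mb = model_sizes_mb
        self.resident: list[str] = []        # models in device memory, LRU order
        self.queue: list[Job] = []           # min-heap keyed by deadline

    def submit(self, job: Job) -> None:
        heapq.heappush(self.queue, job)

    def _ensure_resident(self, model: str) -> None:
        """Swap the model's weights in, evicting LRU models if memory is short."""
        if model in self.resident:
            self.resident.remove(model)      # re-appended below as most recent
        else:
            needed = self.model_sizes_mb[model]
            used = sum(self.model_sizes_mb[m] for m in self.resident)
            while used + needed > self.mem_budget_mb and self.resident:
                evicted = self.resident.pop(0)           # evict least recently used
                used -= self.model_sizes_mb[evicted]
                print(f"swap out {evicted}")
            print(f"swap in  {model}")
        self.resident.append(model)

    def run(self) -> None:
        clock_ms = 0.0
        while self.queue:
            job = heapq.heappop(self.queue)              # earliest deadline first
            self._ensure_resident(job.model)
            clock_ms += job.runtime_ms
            status = "met" if clock_ms <= job.deadline_ms else "missed"
            print(f"{job.model}: finished at {clock_ms:.0f} ms, deadline {status}")


if __name__ == "__main__":
    sched = MemoryAwareEDFScheduler(
        mem_budget_mb=600,
        model_sizes_mb={"detector": 400, "classifier": 250, "llm_head": 500},
    )
    sched.submit(Job(deadline_ms=50, model="detector", runtime_ms=20))
    sched.submit(Job(deadline_ms=30, model="classifier", runtime_ms=10))
    sched.submit(Job(deadline_ms=120, model="llm_head", runtime_ms=60))
    sched.run()
```

A production scheduler would additionally overlap weight transfers with computation and dispatch across multiple accelerators, but the same admission-and-eviction reasoning applies.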

Papers