Split Computing

Split computing addresses the challenge of deploying large deep neural networks (DNNs) on resource-constrained devices by partitioning a model between a local device and a remote server: the device executes the early layers and transmits intermediate activations, which the server uses to complete the computation. Current research focuses on model architectures and training strategies that make this split efficient, including predefined sparsity, multi-task learning, and bottleneck or slimmable encoders that shrink the transmitted representation to cut communication overhead and on-device computational load. Because raw inputs never leave the device and only part of the model runs locally, split computing can reduce latency and bandwidth use while mitigating privacy concerns, enabling powerful DNNs on edge, IoT, and mobile platforms.
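
The sketch below illustrates the basic idea of a bottleneck-based split: a head network with a narrow encoder runs on the device, and a tail network with a matching decoder runs on the server. The specific layers, split point, and bottleneck width are illustrative assumptions, not taken from any particular paper.

```python
# Minimal split-computing sketch with a bottleneck (illustrative, not from a
# specific paper). Head runs on the device; tail runs on the server.
import torch
import torch.nn as nn

class HeadWithBottleneck(nn.Module):
    """On-device portion: early layers plus a narrow bottleneck encoder
    that compresses activations before transmission."""
    def __init__(self, bottleneck_channels=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        # 1x1 conv shrinks the channel dimension to reduce transmitted bytes.
        self.encoder = nn.Conv2d(32, bottleneck_channels, kernel_size=1)

    def forward(self, x):
        return self.encoder(self.features(x))

class TailWithDecoder(nn.Module):
    """Server portion: a decoder restores the channel width, then the
    remaining layers finish the computation."""
    def __init__(self, bottleneck_channels=8, num_classes=10):
        super().__init__()
        self.decoder = nn.Conv2d(bottleneck_channels, 32, kernel_size=1)
        self.features = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, z):
        return self.features(self.decoder(z))

if __name__ == "__main__":
    head, tail = HeadWithBottleneck(), TailWithDecoder()
    x = torch.randn(1, 3, 224, 224)   # input captured on the device
    z = head(x)                       # runs locally on the device
    payload = z.detach()              # in practice: quantize and send over the network
    logits = tail(payload)            # runs on the remote server
    # The 8-channel bottleneck tensor is 4x smaller than the raw
    # 32-channel activation it replaces.
    print(z.shape, logits.shape)
```

In a real deployment the bottleneck output would also be quantized and entropy-coded before transmission, and head and tail would typically be trained jointly (or via knowledge distillation from an unsplit teacher) so the narrow representation loses little task accuracy.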

Papers