Secure Inference

Secure inference lets a model be evaluated on sensitive data, such as user inputs or proprietary model parameters, without revealing that data to the other parties involved, which addresses a central privacy concern in machine learning applications. Current research focuses on making secure inference practical for large language models (LLMs) and convolutional neural networks (CNNs) using cryptographic techniques such as secure multi-party computation (MPC) and homomorphic encryption (HE), often combined with quantization to reduce computation and communication costs, and frequently targets specific bottlenecks such as nonlinear activation functions or large linear layers. These advances matter for adopting privacy-preserving machine learning in sectors such as healthcare, finance, and collaborative research, because they reduce the risk of exposing sensitive data during model deployment.
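
As a rough illustration of the MPC idea applied to a linear layer, the sketch below uses two-party additive secret sharing over integers modulo 2^32. It is a toy under stated assumptions, not any particular paper's protocol: the function names (`share`, `linear_layer`, `reconstruct`, `secure_linear`) and the modulus are hypothetical choices, the weights are assumed to be known to both servers, the two servers are assumed not to collude, and inputs are assumed to be already quantized to integers.

```python
import secrets

MODULUS = 2**32  # illustrative ring size (an assumption, not from the source)


def share(values, modulus=MODULUS):
    """Split each integer into two additive shares that sum to it mod modulus."""
    share0 = [secrets.randbelow(modulus) for _ in values]
    share1 = [(v - r) % modulus for v, r in zip(values, share0)]
    return share0, share1


def linear_layer(weights, vector, modulus=MODULUS):
    """Matrix-vector product mod modulus; each server runs this on its own share."""
    return [sum(w * x for w, x in zip(row, vector)) % modulus for row in weights]


def reconstruct(share0, share1, modulus=MODULUS):
    """Add the two output shares and decode the result as signed integers."""
    result = []
    for a, b in zip(share0, share1):
        v = (a + b) % modulus
        result.append(v - modulus if v >= modulus // 2 else v)
    return result


def secure_linear(weights, inputs):
    """Simulate both servers in one process: share, compute locally, recombine."""
    x0, x1 = share(inputs)            # client splits its quantized input
    y0 = linear_layer(weights, x0)    # server 0 sees only x0
    y1 = linear_layer(weights, x1)    # server 1 sees only x1
    return reconstruct(y0, y1)        # client recombines the output shares
```

Because a linear layer distributes over addition, each server can work on its share alone, and each share on its own is uniformly random. Nonlinear activations do not decompose this way, which is one reason they are a commonly targeted bottleneck. Calling `secure_linear([[1, 2], [-3, 4]], [5, -6])` should match the plaintext product as long as the true outputs fit in the signed 32-bit range.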

Papers