Private Inference

Private inference (PI) runs machine learning inference on encrypted data, so that neither the user's inputs nor the model's parameters are exposed during computation. Current research focuses on making PI more efficient, particularly for large language models (LLMs) and vision transformers (ViTs). The main directions are optimizing communication, reducing computational overhead (e.g., by approximating non-linear functions such as ReLU), and developing new algorithms such as adaptive PI and layer-wise approximation techniques. These advances are needed before privacy-preserving machine learning can be widely adopted in cloud-based services and other sensitive applications.
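To make the ReLU approximation idea concrete, here is a minimal sketch in Python. Cryptographic protocols generally handle additions and multiplications far more cheaply than comparisons, so one common approach is to replace ReLU with a low-degree polynomial. The interval [-1, 1], the degree, and the coefficients (a least-squares quadratic fit on that interval) are assumptions chosen for illustration, and the function names are hypothetical. Published PI systems may pick different degrees, ranges, and fitting methods.

```python
# Illustrative quadratic stand-in for ReLU on the interval [-1, 1].
# Assumption: the coefficients are the least-squares quadratic fit of ReLU
# on that interval; they are not taken from any specific PI system.
A0, A1, A2 = 3 / 32, 1 / 2, 15 / 32


def relu(x: float) -> float:
    """Exact ReLU, which needs a comparison."""
    return max(0.0, x)


def poly_relu(x: float) -> float:
    """Polynomial surrogate: only additions and multiplications (Horner form)."""
    return A0 + x * (A1 + x * A2)


def max_abs_error(n_points: int = 2001) -> float:
    """Largest gap between the surrogate and ReLU on an even grid over [-1, 1]."""
    xs = [-1 + 2 * i / (n_points - 1) for i in range(n_points)]
    return max(abs(poly_relu(x) - relu(x)) for x in xs)


if __name__ == "__main__":
    print(f"max |poly - ReLU| on [-1, 1]: {max_abs_error():.4f}")
```

The surrogate is only meaningful inside the fitted interval. Outside it the quadratic term grows quickly, which is one reason such approximations are usually paired with input scaling or chosen per layer.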

Papers