Training Inference

Training-inference discrepancy, the mismatch between how a model behaves during training and how it is used at prediction time, is a significant challenge across many machine learning models, and research in this area aims to align the two phases to improve performance and efficiency. Current work emphasizes techniques such as knowledge distillation (e.g., minimizing the Kullback-Leibler divergence between teacher and student outputs) to create smaller, faster student models from larger teacher models, as well as strategies to mitigate training-inference mismatches in architectures such as diffusion models and transformers, often by adjusting initialization, sampling methods, or loss functions; a minimal distillation sketch is shown below. Addressing this discrepancy is crucial for deploying efficient and accurate models in applications ranging from image and video generation to natural language processing and medical diagnosis.
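
To make the distillation objective mentioned above concrete, the sketch below combines a temperature-scaled KL-divergence term against a teacher's logits with a standard cross-entropy term on the hard labels. This is a minimal illustration in PyTorch, not any particular paper's recipe; the function name, the temperature of 4.0, and the 0.5 weighting are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Blend a soft KL term against the teacher with hard-label cross-entropy.

    temperature and alpha are hypothetical defaults chosen for illustration.
    """
    # Soften both distributions with the temperature before comparing them.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale the KL term by T^2 so its gradient magnitude stays comparable
    # to the cross-entropy term when the temperature is raised.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Example usage with random logits standing in for teacher and student outputs.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

In practice the teacher is run in evaluation mode with gradients disabled, and only the student's parameters are updated with this combined loss.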

Papers