Cross Entropy Loss
Cross-entropy loss is a widely used objective function in machine learning, primarily for training classification models by minimizing the difference between the predicted and true probability distributions. Current research focuses on addressing its limitations, particularly in large-scale applications such as recommender systems and large language models, where variants such as Scalable Cross-Entropy and Reduced Cross-Entropy have been proposed to improve efficiency and memory usage. Research also explores alternative loss functions or combinations with other methods (e.g., contrastive learning, Wasserstein loss) to enhance model performance, calibration, and robustness, especially in scenarios with limited data or imbalanced classes. These advances have significant implications for improving the accuracy, efficiency, and reliability of a wide range of machine learning applications.
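For reference, a minimal sketch of how the loss is typically computed from predicted and true class probabilities (plain NumPy, with illustrative variable names; framework implementations such as torch.nn.CrossEntropyLoss additionally fold in a softmax over raw logits):

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy between true and predicted distributions.

    y_true: target probabilities (one-hot or soft), shape (n_samples, n_classes)
    y_pred: predicted probabilities, shape (n_samples, n_classes)
    """
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# Example: two samples, three classes
y_true = np.array([[1, 0, 0],
                   [0, 1, 0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])
print(cross_entropy(y_true, y_pred))  # approx. 0.290
```

The efficiency work summarized above targets the case where the number of classes (e.g., items in a catalog or tokens in a vocabulary) is very large, so that materializing the full y_pred matrix dominates memory and compute.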
Papers
Scaling Laws for Multilingual Language Models
Yifei He, Alon Benhaim, Barun Patra, Praneetha Vaddamanu, Sanchit Ahuja, Parul Chopra, Vishrav Chaudhary, Han Zhao, Xia Song
Improving Bias in Facial Attribute Classification: A Combined Impact of KL Divergence induced Loss Function and Dual Attention
Shweta Patel, Dakshina Ranjan Kisku