Mixture Component
Mixture component models combine multiple specialized sub-models (experts) to improve performance and efficiency on complex tasks. Current research focuses on novel architectures such as mixtures of experts (MoE) and their application to diverse fields including natural language processing, computer vision, and signal processing, often incorporating techniques like low-rank adaptation (LoRA) for parameter efficiency. These advances matter because they enable larger, more capable models while mitigating computational cost and improving generalization across heterogeneous datasets, as illustrated by the sketch below.
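To make the MoE idea concrete, here is a minimal sketch of a mixture-of-experts layer with top-k gating, written in PyTorch. The class name SimpleMoE and the parameters n_experts and top_k are illustrative assumptions, not taken from any of the papers listed below; real MoE implementations add load balancing, capacity limits, and batched expert dispatch.

```python
# Minimal mixture-of-experts layer with top-k gating (illustrative sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        # The router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_model). Route each token to its top-k experts and
        # combine their outputs with softmax-normalized routing weights.
        scores = self.router(x)                          # (batch, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # (batch, top_k)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = SimpleMoE(d_model=16, d_hidden=32)
    tokens = torch.randn(8, 16)
    print(layer(tokens).shape)  # torch.Size([8, 16])
```

Because only top_k experts run per token, total parameter count can grow with the number of experts while per-token compute stays roughly constant, which is the efficiency argument behind the MoE work collected here.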
Papers
Sample-Efficient Private Learning of Mixtures of Gaussians
Hassan Ashtiani, Mahbod Majid, Shyam Narayanan
Predicting the Temperature-Dependent CMC of Surfactant Mixtures with Graph Neural Networks
Christoforos Brozos, Jan G. Rittig, Sandip Bhattacharya, Elie Akanny, Christina Kohlmann, Alexander Mitsos
MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition
Cheng Yang, Yang Sui, Jinqi Xiao, Lingyi Huang, Yu Gong, Yuanlin Duan, Wenqi Jia, Miao Yin, Yu Cheng, Bo Yuan
Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior
Mingxuan Zhang, Yan Sun, Faming Liang
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models
Nam V. Nguyen, Thong T. Doan, Luong Tran, Van Nguyen, Quang Pham
Efficient and Interpretable Grammatical Error Correction with Mixture of Experts
Muhammad Reza Qorib, Alham Fikri Aji, Hwee Tou Ng
MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts
Jie Zhu, Yixiong Chen, Mingyu Ding, Ping Luo, Leye Wang, Jingdong Wang
Stealing User Prompts from Mixture of Experts
Itay Yona, Ilia Shumailov, Jamie Hayes, Nicholas Carlini
MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning
Xujia Wang, Haiyan Zhao, Shuo Wang, Hanqing Wang, Zhiyuan Liu
Structured Diffusion Models with Mixture of Gaussians as Prior Distribution
Nanshan Jia, Tingyu Zhu, Haoyu Liu, Zeyu Zheng
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener, Alex Gu, Nikhil Vyas, Nikhil Anand, David Alvarez-Melis, Yuanzhi Li, Sham M. Kakade, Eran Malach