Mixture Component
Mixture component models are a class of machine learning techniques that combine multiple specialized models (experts) to improve performance and efficiency on complex tasks. Current research focuses on developing novel architectures, such as mixtures of experts (MoE), and applying them to natural language processing, computer vision, and signal processing, often incorporating techniques like low-rank adaptation (LoRA) for parameter efficiency. These advances matter because they allow larger, more capable models to be built at lower computational cost, and they can improve generalization across heterogeneous datasets.
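As a rough illustration of how a mixture of experts combines specialized models, the sketch below routes each input to its top-k experts and mixes their outputs with renormalized gate weights. It is a minimal NumPy sketch under stated assumptions, not the method of any paper listed here: the experts are plain linear maps, and the names `moe_forward`, `gate_w`, `expert_ws`, and `top_k` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_forward(x, gate_w, expert_ws, top_k=2):
    """Sparse mixture-of-experts forward pass (illustrative only).

    x:         (batch, d_in) inputs
    gate_w:    (d_in, n_experts) gating weights (assumed linear gate)
    expert_ws: (n_experts, d_in, d_out) one linear expert per slice
    Returns the mixed output, the chosen expert indices, and their weights.
    """
    logits = x @ gate_w                                   # (batch, n_experts)
    # Indices of the top_k highest-scoring experts for each input.
    top_idx = np.argsort(logits, axis=-1)[:, -top_k:]     # (batch, top_k)
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    # Renormalize the gate over the selected experts only.
    weights = softmax(top_logits, axis=-1)                # (batch, top_k)

    out = np.zeros((x.shape[0], expert_ws.shape[2]))
    for b in range(x.shape[0]):
        for j in range(top_k):
            e = top_idx[b, j]
            # Only the selected experts are evaluated for this input.
            out[b] += weights[b, j] * (x[b] @ expert_ws[e])
    return out, top_idx, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))
    gate_w = rng.normal(size=(8, 3))
    expert_ws = rng.normal(size=(3, 8, 5))
    out, idx, w = moe_forward(x, gate_w, expert_ws, top_k=2)
    print(out.shape, idx.shape, w.sum(axis=-1))
```

Because each input evaluates only `top_k` of the experts, the per-input compute grows with `top_k` rather than with the total number of experts, which is the efficiency argument made above. Setting `top_k` equal to the number of experts recovers a dense softmax mixture.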
Papers
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Qizhen Zhang, Nikolas Gritsch, Dwaraknath Gnaneshwar, Simon Guo, David Cairuz, Bharat Venkitesh, Jakob Foerster, Phil Blunsom, Sebastian Ruder, Ahmet Ustun, Acyr Locatelli
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
Zhongyu Zhao, Menghang Dong, Rongyu Zhang, Wenzhao Zheng, Yunpeng Zhang, Huanrui Yang, Dalong Du, Kurt Keutzer, Shanghang Zhang
UniFed: A Universal Federation of a Mixture of Highly Heterogeneous Medical Image Classification Tasks
Atefe Hassani, Islem Rekik
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Gagan Jain, Nidhi Hegde, Aditya Kusupati, Arsha Nagrani, Shyamal Buch, Prateek Jain, Anurag Arnab, Sujoy Paul
RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models
Pengkun Jiao, Xinlan Wu, Bin Zhu, Jingjing Chen, Chong-Wah Ngo, Yugang Jiang
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
Leyang Shen, Gongwei Chen, Rui Shao, Weili Guan, Liqiang Nie
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations
Vignesh Prasad, Alap Kshirsagar, Dorothea Koert, Ruth Stock-Homburg, Jan Peters, Georgia Chalvatzaki
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
Wanggui He, Siming Fu, Mushui Liu, Xierui Wang, Wenyi Xiao, Fangxun Shu, Yi Wang, Lei Zhang, Zhelun Yu, Haoyuan Li, Ziwei Huang, LeiLei Gan, Hao Jiang