Mixture of Experts
Mixture-of-Experts (MoE) models aim to improve the efficiency and scalability of large language models and other neural networks by using multiple specialized "expert" networks, each handling a subset of the input data. Current research focuses on improving the routing algorithms that assign inputs to experts, developing heterogeneous MoE architectures with experts of varying sizes and capabilities, and optimizing training methods to address challenges such as load imbalance and gradient conflicts. The approach is promising for building larger, more capable models at reduced computational cost, with applications ranging from natural language processing and computer vision to robotics and scientific discovery.
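To make the routing idea above concrete, the following is a minimal sketch of top-k gating in plain Python. It is an illustration under stated assumptions, not the method of any paper listed here: the function names (`top_k_route`, `moe_forward`, `expert_load`), the linear gate, and the toy experts are all hypothetical, and real systems typically use learned gates, batched tensor operations, and an auxiliary load-balancing loss.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs, best expert first.
    """
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, gate_weights, experts, k=2):
    """Route one input vector through its top-k experts.

    gate_weights: one weight vector per expert (hypothetical linear gate).
    experts: list of callables mapping a vector to a vector of the same length.
    """
    logits = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in gate_weights]
    routes = top_k_route(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routes:
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + weight * y_j for o, y_j in zip(out, y)]
    return out, routes


def expert_load(all_routes, num_experts):
    """Fraction of routing assignments received by each expert."""
    counts = [0] * num_experts
    for routes in all_routes:
        for idx, _ in routes:
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Toy setup (assumed for illustration): expert i scales its input by i + 1.
experts = [lambda v, s=i + 1: [s * t for t in v] for i in range(3)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

inputs = [[1.0, 2.0], [3.0, 0.5], [0.2, 0.1]]
all_routes = []
for x in inputs:
    out, routes = moe_forward(x, gate_weights, experts, k=2)
    all_routes.append(routes)

load = expert_load(all_routes, num_experts=3)
```

With this toy gate, the third expert is never selected for positive inputs, so `load` shows a skewed distribution. That skew is the kind of load imbalance that training methods for MoE models try to counteract.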
Papers
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding
Ukyo Honda, Tatsushi Oka, Peinan Zhang, Masato Mita
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
DeepSeek-AI, Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu, Yukun Li, Huazuo Gao, Shirong Ma, Wangding Zeng, Xiao Bi, Zihui Gu, Hanwei Xu, Damai Dai, Kai Dong, Liyue Zhang, Yishi Piao, Zhibin Gou, Zhenda Xie, Zhewen Hao, Bingxuan Wang, Junxiao Song, Deli Chen, Xin Xie, Kang Guan, Yuxiang You, Aixin Liu, Qiushi Du, Wenjun Gao, Xuan Lu, Qinyu Chen, Yaohui Wang, Chengqi Deng, Jiashi Li, Chenggang Zhao, Chong Ruan, Fuli Luo, Wenfeng Liang
MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts
Guanjie Chen, Xinyu Zhao, Tianlong Chen, Yu Cheng
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
Tong Zhu, Daize Dong, Xiaoye Qu, Jiacheng Ruan, Wenliang Chen, Yu Cheng
Interpretable Cascading Mixture-of-Experts for Urban Traffic Congestion Prediction
Wenzhao Jiang, Jindong Han, Hao Liu, Tao Tao, Naiqiang Tan, Hui Xiong
MoME: Mixture of Multimodal Experts for Cancer Survival Prediction
Conghao Xiong, Hao Chen, Hao Zheng, Dong Wei, Yefeng Zheng, Joseph J. Y. Sung, Irwin King