MoE FFD
Mixture-of-Experts (MoE) models are increasingly used to build larger, more efficient deep learning systems, particularly large language models (LLMs) and multimodal applications such as image fusion and face forgery detection. Current research focuses on improving MoE training stability, balancing the load across experts for efficient resource utilization, and developing architectures such as "Soft MoE" that address limitations of traditional sparse routing. The approach matters because it scales model capacity without a proportional increase in computational cost: only a small subset of experts is activated for each input, so larger models can be trained and served at roughly constant per-token compute while improving performance across tasks.
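To make the idea concrete, below is a minimal sketch of a sparse top-k MoE feed-forward layer with a Switch-Transformer-style load-balancing auxiliary loss, written in PyTorch. The class and parameter names (SparseMoE, num_experts, top_k) are illustrative assumptions, not taken from any specific paper discussed on this page.

import torch
import torch.nn as nn
import torch.nn.functional as F


class Expert(nn.Module):
    """A standard two-layer feed-forward expert."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


class SparseMoE(nn.Module):
    """Routes each token to its top-k experts and mixes their outputs."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([Expert(d_model, d_hidden) for _ in range(num_experts)])

    def forward(self, x: torch.Tensor):
        # x: (batch, seq, d_model) -> flatten tokens for routing.
        batch, seq, d_model = x.shape
        tokens = x.reshape(-1, d_model)                         # (N, d_model)

        logits = self.router(tokens)                            # (N, num_experts)
        probs = F.softmax(logits, dim=-1)
        topk_probs, topk_idx = probs.topk(self.top_k, dim=-1)   # (N, top_k)
        # Renormalize so the selected experts' weights sum to 1 per token.
        topk_probs = topk_probs / topk_probs.sum(dim=-1, keepdim=True)

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            # (token, slot) pairs routed to expert e.
            token_ids, slot_ids = (topk_idx == e).nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue
            expert_out = expert(tokens[token_ids])
            weights = topk_probs[token_ids, slot_ids].unsqueeze(-1)
            out.index_add_(0, token_ids, expert_out * weights)

        # Load-balancing auxiliary loss: push both the fraction of routing
        # assignments per expert and the mean router probability per expert
        # toward the uniform value 1/num_experts (aux_loss ~= 1 when balanced).
        with torch.no_grad():
            assign_frac = F.one_hot(topk_idx.reshape(-1), self.num_experts).float().mean(dim=0)
        prob_frac = probs.mean(dim=0)
        aux_loss = self.num_experts * torch.sum(assign_frac * prob_frac)

        return out.reshape(batch, seq, d_model), aux_loss


if __name__ == "__main__":
    layer = SparseMoE(d_model=64, d_hidden=256)
    x = torch.randn(2, 10, 64)
    y, aux = layer(x)
    print(y.shape, aux.item())  # torch.Size([2, 10, 64]) and a scalar near 1.0

In practice the auxiliary loss is added to the task loss with a small weight so the router keeps expert utilization roughly even; "Soft MoE" variants avoid the discrete top-k routing altogether by mixing experts with dense, differentiable weights.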