State of the Art LLM
Research on state-of-the-art large language models (LLMs) is evolving rapidly, with a focus on improving performance across diverse tasks and domains, including finance, healthcare, and process engineering. Much of this work aims to strengthen reasoning, particularly on multi-step problems, through techniques such as external symbolic working memory and modular architectures built from specialized expert models (e.g., Mixture of Experts). These advances matter because they enable more reliable and efficient LLM applications, from automating complex processes to personalizing user experiences and improving access to information across fields.
Papers
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs
Ankit Singh Rawat, Veeranjaneyulu Sadhanala, Afshin Rostamizadeh, Ayan Chakrabarti, Wittawat Jitkrittum, Vladimir Feinberg, Seungyeon Kim, Hrayr Harutyunyan, Nikunj Saunshi, Zachary Nado, Rakesh Shivanna, Sashank J. Reddi, Aditya Krishna Menon, Rohan Anil, Sanjiv Kumar
Probing Ranking LLMs: Mechanistic Interpretability in Information Retrieval
Tanya Chowdhury, James Allan
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Xun Wang, Si-Qing Chen, Michael Wooldridge, Janet B. Pierrehumbert, Furu Wei
Advancing Academic Knowledge Retrieval via LLM-enhanced Representation Similarity Fusion
Wei Dai, Peng Fu, Chunjing Gan
Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers
Amit Ben-Artzy, Roy Schwartz
Sketch: A Toolkit for Streamlining LLM Operations
Xin Jiang, Xiang Li, Wenjia Ma, Xuezhi Fang, Yiqun Yao, Naitong Yu, Xuying Meng, Peng Han, Jing Li, Aixin Sun, Yequan Wang
Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard
Chanjun Park, Hyeonwoo Kim