Recent Large Language Model Research
Recent research on large language models (LLMs) centers on improving their ability to handle long contexts, multilingual input, and complex reasoning tasks, while addressing limitations in efficiency, bias, and uncertainty quantification. Current efforts focus on novel architectures such as Mamba, enhanced Mixture of Experts models, and improved training methods such as self-contrast learning and fine-grained reward systems. These advances are crucial for expanding the practical applications of LLMs across diverse fields, from biomedical research and public health interventions to improving the reliability of AI-assisted tools and mitigating the risks of misinformation.
Papers
AutoPlan: Automatic Planning of Interactive Decision-Making Tasks With Large Language Models
Siqi Ouyang, Lei Li
Towards Reliable Misinformation Mitigation: Generalization, Uncertainty, and GPT-4
Kellin Pelrine, Anne Imouza, Camille Thibault, Meilina Reksoprodjo, Caleb Gupta, Joel Christoph, Jean-François Godbout, Reihaneh Rabbany
Getting MoRE out of Mixture of Language Model Reasoning Experts
Chenglei Si, Weijia Shi, Chen Zhao, Luke Zettlemoyer, Jordan Boyd-Graber