LLM Representation
Large language model (LLM) representations are the internal data structures that encode the information these models process; studying them is a key research area aimed at understanding how LLMs function and at improving their performance. Current work focuses on enhancing these representations through techniques such as localized fine-tuning, knowledge base integration, and architectural adaptations (e.g., transformers with dynamic compression) that handle diverse input modalities and lengths. Understanding and manipulating LLM representations is crucial for improving model accuracy, trustworthiness, and interpretability, with implications for applications including text generation, question answering, and recommendation systems.
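To make "internal representations" concrete, the sketch below extracts per-token hidden states from a small transformer. It is a minimal illustration, not the method of any paper listed here; the model choice ("gpt2") and the use of the Hugging Face transformers library are assumptions for demonstration purposes.

```python
# Minimal sketch (illustrative only): inspecting an LLM's internal
# representations with the Hugging Face `transformers` library.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # assumed small model for demonstration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)

inputs = tokenizer(
    "Large language models encode information internally.",
    return_tensors="pt",
)
with torch.no_grad():
    outputs = model(**inputs)

# `hidden_states` is a tuple with one tensor per layer (plus the embedding
# layer), each of shape (batch, sequence_length, hidden_size). These
# per-token vectors are the "representations" that interpretability and
# fine-tuning work probes and manipulates.
hidden_states = outputs.hidden_states
print(len(hidden_states), hidden_states[-1].shape)
```

Methods such as localized fine-tuning or representation editing typically operate on exactly these layer-wise activations, updating or steering a subset of them rather than the full set of model weights.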
Papers
Establishing Vocabulary Tests as a Benchmark for Evaluating Large Language Models
Gonzalo Martínez, Javier Conde, Elena Merino-Gómez, Beatriz Bermúdez-Margaretto, José Alberto Hernández, Pedro Reviriego, Marc Brysbaert
Evaluating Spatial Understanding of Large Language Models
Yutaro Yamada, Yihan Bao, Andrew K. Lampinen, Jungo Kasai, Ilker Yildirim