Internal Representation

Internal representations in artificial neural networks, particularly large language models (LLMs) and vision-language models (VLMs), are the focus of intense research aimed at understanding how these models process information and produce their outputs. Current work investigates the structure of these representations, how they encode knowledge (both parametric knowledge stored in the model weights and non-parametric knowledge drawn from the input context), and how they relate to model behaviors such as hallucination and reasoning. This research uses techniques such as probing classifiers (lightweight classifiers trained on intermediate activations, as sketched below), activation analysis, and tensor decomposition to analyze internal states and to improve model performance and reliability. Understanding internal representations is central to improving model interpretability, mitigating biases and errors, and ultimately building more robust and trustworthy AI systems.

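The probing-classifier technique mentioned above can be illustrated with a short sketch: extract hidden states from a pretrained model and train a simple linear classifier to predict a property of the input; if the probe succeeds, that property is (approximately) linearly decodable from the chosen layer. This is a minimal, generic sketch rather than the method of any particular paper listed here; the model name, the toy labels, and the choice of layer and pooling are all illustrative assumptions.

```python
# Minimal probing-classifier sketch: test whether a hidden layer of a pretrained
# LM linearly encodes a simple property of the input (here, a toy sentiment label).
# Model name, labels, layer choice, and pooling are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

MODEL_NAME = "gpt2"   # any model that exposes hidden states will do
LAYER = -1            # which hidden layer to probe (-1 = last layer)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

# Toy labeled data; a real study would use an established probing dataset.
texts = ["I loved this film", "A wonderful experience", "Truly great acting",
         "I hated this film", "A terrible experience", "Truly awful acting"]
labels = [1, 1, 1, 0, 0, 0]

def embed(text: str) -> torch.Tensor:
    """Mean-pool the chosen hidden layer over all tokens of one input."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[LAYER]  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)               # (dim,)

X = torch.stack([embed(t) for t in texts]).numpy()
y = labels

# A linear probe: held-out accuracy indicates how linearly decodable the
# property is from this layer's representations.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```
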
Papers