Internal Representation
Internal representations in artificial neural networks, particularly large language models (LLMs) and vision-language models (VLMs), are the focus of intense research aimed at understanding how these models process information and generate outputs. Current work investigates the structure of these representations, how they encode knowledge (both parametric knowledge stored in model weights and non-parametric knowledge drawn from retrieved context), and how they relate to behaviors such as hallucination and reasoning. This research uses techniques such as probing classifiers, activation analysis, and tensor decomposition to analyze internal states and to improve model performance and reliability. Understanding internal representations is crucial for enhancing model interpretability, mitigating biases and errors, and ultimately building more robust and trustworthy AI systems.
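As a concrete illustration of the probing-classifier technique mentioned above, the sketch below fits a linear probe on a model's hidden activations to test whether a simple property can be read off the representation. It is a minimal sketch under stated assumptions, not a reproduction of any listed paper's method: the model name (gpt2), the layer index, the toy prompts, and the binary labels are placeholders chosen for illustration.

```python
# Minimal probing-classifier sketch (illustrative only): train a linear probe
# on a model's hidden states to predict a binary property of the input.
# The model (gpt2), layer index, toy prompts, and labels are placeholder
# assumptions, not details taken from the papers listed below.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

# Toy inputs with placeholder labels for the property being probed
# (e.g. 1 = statement about a familiar entity, 0 = statement about a made-up one).
texts = [
    "Paris is the capital of France.",
    "Zorblat Qixx founded the city of Vrenmoor.",
    "The Eiffel Tower is in Paris.",
    "Flimzor Trelk invented the yundle engine.",
]
labels = np.array([1, 0, 1, 0])

def last_token_hidden_state(text: str, layer: int = 6) -> np.ndarray:
    """Return the hidden state of the final token at a chosen layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # hidden_states is a tuple of (num_layers + 1) tensors of shape (1, seq_len, dim)
    return outputs.hidden_states[layer][0, -1].numpy()

features = np.stack([last_token_hidden_state(t) for t in texts])

# Linear probe: if a simple classifier can recover the property from the
# activations, the internal representation plausibly encodes it.
probe = LogisticRegression(max_iter=1000).fit(features, labels)
print("probe training accuracy:", probe.score(features, labels))
```

In practice such probes are trained and evaluated on held-out data across many layers, so that probe accuracy per layer indicates where in the network the property becomes linearly decodable.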
Papers
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Javier Ferrando, Oscar Obeso, Senthooran Rajamanoharan, Neel Nanda
Understanding World or Predicting Future? A Comprehensive Survey of World Models
Jingtao Ding, Yunke Zhang, Yu Shang, Yuheng Zhang, Zefang Zong, Jie Feng, Yuan Yuan, Hongyuan Su, Nian Li, Nicholas Sukiennik, Fengli Xu, Yong Li