Hidden Representation
Hidden representations are the activations (vectors or tensors) that a neural network computes at its intermediate layers, and they are a key focus of work on understanding and improving AI models. Current research investigates how these representations encode information: their geometric properties, their role in tasks such as language modeling and image processing, and how they can be manipulated to control model behavior (e.g., through activation steering or contrastive instruction tuning). Understanding hidden representations matters for making models more interpretable, more robust to adversarial attacks and out-of-distribution data, and more efficient and reliable across diverse applications.
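As a rough illustration of what "reading" and "steering" a hidden representation can mean, here is a minimal sketch on a toy two-layer network written in plain Python. The weights, the input, the steering vector, and the helper names (`matvec`, `forward`) are all made up for this example and do not come from any particular paper or library; real activation steering is applied to the intermediate activations of large models, usually through framework-specific hooks.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(x, W1, W2, steer=None):
    """Toy two-layer network: returns (hidden, output).

    `hidden` is the hidden representation: the activations of the
    intermediate layer. If `steer` is given, it is added to the hidden
    activations before the output layer (a simple form of steering).
    """
    hidden = [math.tanh(v) for v in matvec(W1, x)]
    if steer is not None:
        hidden = [h + s for h, s in zip(hidden, steer)]
    output = matvec(W2, hidden)
    return hidden, output

# Hypothetical weights and input, chosen only for illustration.
W1 = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]  # 2 inputs -> 3 hidden units
W2 = [[1.0, -1.0, 0.5]]                      # 3 hidden units -> 1 output
x = [1.0, 2.0]

# Read the hidden representation from an unmodified forward pass.
hidden, baseline = forward(x, W1, W2)

# Steer: push the third hidden unit up by 2.0 and compare outputs.
steer = [0.0, 0.0, 2.0]
_, steered = forward(x, W1, W2, steer=steer)
shift = steered[0] - baseline[0]

print("hidden representation:", hidden)
print("output shift from steering:", shift)
```

Because the output layer in this toy model is linear, the shift in the output equals the steering vector projected through `W2`. In deep networks the later layers are nonlinear, so the effect of a steering vector generally has to be measured empirically.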