Hidden Knowledge
Hidden knowledge research studies the latent information and capabilities embedded in complex systems, particularly machine learning models, with the aim of understanding that knowledge, extracting it, and mitigating the risks it poses. Current research focuses on detecting hidden biases and vulnerabilities in models such as large language models (LLMs) and other neural networks, using techniques such as steganalysis, quiver representation theory, and contrastive learning to analyze hidden activations and emergent behaviors. This work matters for model safety and interpretability, and for addressing fairness and security concerns in applications ranging from medical diagnosis to autonomous systems.