Hidden Knowledge
Hidden knowledge research studies the latent information and capabilities embedded in complex systems, particularly machine learning models, with the aim of understanding and extracting that information and mitigating the risks it poses. Current work focuses on detecting hidden biases and vulnerabilities in large language models (LLMs) and other neural networks, using techniques such as steganalysis, quiver representation theory, and contrastive learning to analyze hidden activations and emergent behaviors. This work matters for model safety, interpretability, fairness, and security in applications ranging from medical diagnosis to autonomous systems.