Knowledge Capacity
Knowledge capacity research investigates how much information machine learning models, particularly large language models (LLMs), can effectively store and utilize. Current efforts focus on developing robust evaluation methods, such as perturbation techniques, that assess true knowledge retention rather than superficial memorization, and on understanding how model architecture (e.g., GPT-2, LLaMA) and training parameters (data size, training duration) determine the resulting knowledge storage capacity. These investigations are crucial for improving model performance and trustworthiness, with implications for applications ranging from manufacturing process monitoring to personalized recommendation systems.
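As a hedged illustration of the perturbation idea, the sketch below scores a model on paraphrased versions of the same factual query: a model that answers correctly only under the exact training phrasing is likely memorizing rather than storing knowledge. The stub model, prompts, and function names are hypothetical, not drawn from any of the papers listed here.

```python
# Toy sketch of a perturbation-based knowledge-retention check.
# The "model" is a stub lookup table standing in for an LLM; a real
# evaluation would query the model under test instead.

def toy_model(prompt: str) -> str:
    # Hypothetical stub: answers correctly only for one memorized phrasing.
    memorized = {"What is the capital of France?": "Paris"}
    return memorized.get(prompt, "unknown")

def retention_score(model, prompts, expected) -> float:
    """Fraction of perturbed prompts the model still answers correctly.

    A score near 1.0 suggests genuine knowledge retention; a low score
    combined with a correct answer on the canonical phrasing suggests
    superficial memorization.
    """
    correct = sum(model(p) == expected for p in prompts)
    return correct / len(prompts)

prompts = [
    "What is the capital of France?",    # canonical phrasing
    "Name the capital city of France.",  # paraphrase
    "France's capital is which city?",   # paraphrase
]
score = retention_score(toy_model, prompts, "Paris")
print(f"retention score: {score:.2f}")
```

Here the stub passes only the memorized phrasing, so its score is 1/3, flagging memorization; a model with genuinely stored knowledge would score near 1.0 across paraphrases.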
Papers
July 1, 2024
May 30, 2024
April 8, 2024
May 27, 2023
June 7, 2022