Language Model Training
Training large language models (LLMs) requires massive web-scale datasets, raising significant concerns about copyright infringement and the inclusion of sensitive personal information. Current research focuses on two complementary directions: detecting whether plagiarized or copyrighted content was present in an LLM's training data, for example via membership-inference tests on the model's token probabilities, and "unlearning" techniques that remove the influence of specific data points after training to address privacy concerns (both sketched below). These efforts are central to responsible LLM development and deployment, shaping both the legal landscape and the ethical norms surrounding AI.
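To make the detection side concrete, here is a minimal sketch of a loss-based membership test in the spirit of Min-K%-Prob-style heuristics: it averages the log-probabilities of a passage's least-likely tokens, on the intuition that memorized training text contains few surprising tokens. The model name, the `k` fraction, and the `min_k_prob_score` helper are illustrative assumptions, not an implementation from any particular paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_prob_score(model, tokenizer, text, k=0.2):
    """Average log-probability of the k fraction of least-likely tokens.

    A higher score means the model finds no part of the passage
    surprising, which Min-K%-Prob-style methods treat as weak evidence
    that the text appeared in the training data.
    """
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc.input_ids
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability the model assigned to each actual next token.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_log_probs = log_probs.gather(
        1, input_ids[0, 1:].unsqueeze(-1)
    ).squeeze(-1)
    n = max(1, int(k * token_log_probs.numel()))
    lowest = torch.topk(token_log_probs, n, largest=False).values
    return lowest.mean().item()

if __name__ == "__main__":
    name = "gpt2"  # stand-in model; swap in the LLM under audit
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name).eval()
    passage = "Call me Ishmael. Some years ago, never mind how long precisely..."
    print(f"Min-K% score: {min_k_prob_score(model, tok, passage):.3f}")
```

In practice the raw score is compared against a threshold calibrated on text known to be outside the training corpus; on its own it is only a heuristic signal, not proof of inclusion.

On the unlearning side, the simplest baseline is gradient ascent on the data to be forgotten, making the model assign it lower probability. The sketch below shows a single ascent step under that assumption; practical methods typically add a retain-set term so that general capability is not damaged along with the forgotten passage.

```python
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

def unlearn_step(model, tokenizer, forget_text, lr=1e-5):
    """One gradient-ascent step on a passage slated for removal.

    Ascending the language-modeling loss on the 'forget' data is the
    simplest form of approximate unlearning; this is an illustrative
    baseline, not the method of any specific paper listed here.
    """
    model.train()
    optimizer = AdamW(model.parameters(), lr=lr)
    enc = tokenizer(forget_text, return_tensors="pt")
    loss = model(enc.input_ids, labels=enc.input_ids).loss
    (-loss).backward()  # ascend: make the passage less likely
    optimizer.step()
    optimizer.zero_grad()
```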