Paraphrasing Attack
Paraphrasing attacks exploit the ability of language models to generate semantically similar text, undermining watermarking and detection methods designed to identify AI-generated content. Current research focuses on developing more robust watermarking techniques, often employing semantic embeddings and clustering algorithms, while simultaneously exploring improved detection methods that analyze text features beyond simple token-level comparisons. The success of these attacks highlights the ongoing challenge of reliably distinguishing human-written from AI-generated text, with significant implications for combating misinformation and plagiarism and for protecting intellectual property.
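To make the failure mode concrete, here is a minimal, self-contained sketch of why paraphrasing defeats token-level watermark detection. It assumes a simplified "green-list" style scheme in which each token's greenness is derived from a hash of the preceding token; all names, the vocabulary, and the hash-based partition are illustrative, not taken from any specific paper.

```python
import hashlib
import random

# Toy vocabulary of interchangeable words, grouped into synonym sets.
# Purely illustrative data, not from any real watermarking system.
VOCAB = ["fast", "quick", "rapid", "big", "large", "huge",
         "small", "tiny", "little"]
SYNONYMS = {
    "fast": ["fast", "quick", "rapid"],
    "quick": ["fast", "quick", "rapid"],
    "rapid": ["fast", "quick", "rapid"],
    "big": ["big", "large", "huge"],
    "large": ["big", "large", "huge"],
    "huge": ["big", "large", "huge"],
    "small": ["small", "tiny", "little"],
    "tiny": ["small", "tiny", "little"],
    "little": ["small", "tiny", "little"],
}


def is_green(prev: str, tok: str, ratio: float = 0.5) -> bool:
    # Pseudo-randomly partition the vocabulary into "green" and "red"
    # tokens, keyed on the preceding token, via a hash.
    h = hashlib.sha256(f"{prev}|{tok}".encode()).digest()
    return h[0] / 255.0 < ratio


def generate(length: int, seed: int = 0) -> list[str]:
    # Watermarked "generation": among a few candidate next tokens,
    # prefer one from the green list, so watermarked text has far
    # more green tokens than chance would predict.
    rng = random.Random(seed)
    out = [rng.choice(VOCAB)]
    for _ in range(length - 1):
        candidates = rng.sample(VOCAB, 5)
        greens = [t for t in candidates if is_green(out[-1], t)]
        out.append((greens or candidates)[0])
    return out


def paraphrase(tokens: list[str], seed: int = 1) -> list[str]:
    # Paraphrase attack: swap each token for a random synonym. The
    # meaning is preserved, but the token-level hash alignment that
    # the detector relies on is destroyed.
    rng = random.Random(seed)
    return [rng.choice(SYNONYMS.get(t, [t])) for t in tokens]


def green_fraction(tokens: list[str]) -> float:
    # Detector statistic: fraction of adjacent pairs whose second
    # token is green. Near 1.0 for watermarked text, near 0.5
    # (chance level) for unwatermarked or paraphrased text.
    pairs = list(zip(tokens, tokens[1:]))
    return sum(is_green(p, t) for p, t in pairs) / len(pairs)


if __name__ == "__main__":
    watermarked = generate(300)
    attacked = paraphrase(watermarked)
    print(f"watermarked: {green_fraction(watermarked):.2f}")
    print(f"paraphrased: {green_fraction(attacked):.2f}")
```

Running the sketch shows the green-token fraction collapsing toward the chance level of 0.5 after paraphrasing, which is exactly why the research summarized above moves toward semantic embeddings and other features that survive synonym substitution.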