Machine Generated
Machine-generated text detection aims to distinguish computer-generated content from human-written text, a task made more pressing by the increasing sophistication of large language models (LLMs). Current research emphasizes robust and generalizable detection methods, often built on transformer-based architectures, and explores techniques such as watermarking, rewriting analysis, and multi-modal approaches (combining text, image, and audio data). The field is important for mitigating misinformation, plagiarism, and other malicious uses of LLMs, with implications for sectors including journalism, education, and online content moderation.
Papers
AIpom at SemEval-2024 Task 8: Detecting AI-produced Outputs in M4
Alexander Shirnin, Nikita Andreev, Vladislav Mikhailov, Ekaterina Artemova
MUGC: Machine Generated versus User Generated Content Detection
Yaqi Xie, Anjali Rawal, Yujing Cen, Dixuan Zhao, Sunil K Narang, Shanu Sushmita
GenAI Detection Tools, Adversarial Techniques and Implications for Inclusivity in Higher Education
Mike Perkins, Jasper Roe, Binh H. Vu, Darius Postma, Don Hickerson, James McGaughran, Huy Q. Khuat
k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection
Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov