Paper ID: 2402.01642
Detection of Machine-Generated Text: Literature Survey
Dmytro Valiaiev
Since language models produce fake text quickly and easily, there is an oversupply of such content in the public domain. The degree of sophistication and writing style has reached a point where differentiating between human authored and machine-generated content is nearly impossible. As a result, works generated by language models rather than human authors have gained significant media attention and stirred controversy.Concerns regarding the possible influence of advanced language models on society have also arisen, needing a fuller knowledge of these processes. Natural language generation (NLG) and generative pre-trained transformer (GPT) models have revolutionized a variety of sectors: the scope not only permeated throughout journalism and customer service but also reached academia. To mitigate the hazardous implications that may arise from the use of these models, preventative measures must be implemented, such as providing human agents with the capacity to distinguish between artificially made and human composed texts utilizing automated systems and possibly reverse-engineered language models. Furthermore, to ensure a balanced and responsible approach, it is critical to have a full grasp of the socio-technological ramifications of these breakthroughs. This literature survey aims to compile and synthesize accomplishments and developments in the aforementioned work, while also identifying future prospects. It also gives an overview of machine-generated text trends and explores the larger societal implications. Ultimately, this survey intends to contribute to the development of robust and effective approaches for resolving the issues connected with the usage and detection of machine-generated text by exploring the interplay between the capabilities of language models and their possible implications.
Submitted: Jan 2, 2024