Human Authored Text

Research on human-authored text currently focuses on distinguishing it from text generated by large language models (LLMs). Techniques such as stylometric analysis and conditional probability curvature are used to identify subtle differences in writing style and word choice. These efforts use machine learning models, including Random Forests and RoBERTa, to classify a text's origin, and explore metrics such as "recoverability" to quantify the divergence between human-written and machine-generated text. This research matters for mitigating the risks of AI-generated misinformation and plagiarism, with implications for fields ranging from cybersecurity to academic integrity.

Papers