Paraphrase Generation
Paraphrase generation, the task of rewriting text while preserving its meaning, is a core area of natural language processing (NLP) research focused on improving both the quality and the diversity of generated text. Current work emphasizes large language models (LLMs) and diffusion models, often combined with techniques such as knowledge distillation, in-context learning, and syntactic control, to strengthen generation quality while curbing hallucination and preserving semantic consistency. The field underpins applications ranging from making complex texts more accessible to mitigating the spread of harmful or misleading AI-generated content, and its advances carry over to downstream NLP tasks.
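To make the in-context learning approach mentioned above concrete, the sketch below prompts an off-the-shelf causal language model with a few input/paraphrase demonstrations and samples a rewrite of a new sentence. It is a minimal illustration, not drawn from any of the papers listed here; the model choice (gpt2 as a placeholder) and the prompt wording are assumptions, and any instruction-tuned LLM could be substituted.

```python
# Minimal sketch: few-shot (in-context) paraphrase generation with an LLM.
# The model name and prompt format are illustrative assumptions, not the
# method of any specific paper below.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # placeholder LLM

FEW_SHOT_PROMPT = (
    "Rewrite each sentence so it keeps the same meaning.\n"
    "Input: The meeting was postponed due to rain.\n"
    "Paraphrase: Rain forced the meeting to be rescheduled.\n"
    "Input: She quickly finished the report.\n"
    "Paraphrase: The report was completed by her in no time.\n"
    "Input: {sentence}\n"
    "Paraphrase:"
)

def paraphrase(sentence: str) -> str:
    prompt = FEW_SHOT_PROMPT.format(sentence=sentence)
    # Sampling (do_sample + top_p) trades determinism for lexical diversity,
    # one of the two axes the field optimizes alongside quality.
    out = generator(prompt, max_new_tokens=30, do_sample=True, top_p=0.9)
    # The pipeline returns the prompt plus the completion; keep only the
    # newly generated text, truncated at the first newline.
    completion = out[0]["generated_text"][len(prompt):]
    return completion.strip().split("\n")[0]

print(paraphrase("The cat sat on the mat."))
```

In practice, checking the output against the input with a semantic-similarity model is a common safeguard for the consistency issues noted above.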
Papers
PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase Generation
Yixin Wan, Kuan-Hao Huang, Kai-Wei Chang
Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing
Jaehun Jung, Peter West, Liwei Jiang, Faeze Brahman, Ximing Lu, Jillian Fisher, Taylor Sorensen, Yejin Choi
ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation
Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang, Aram Galstyan