Document Summary Pair

Document-summary pairs, each consisting of a source document and its corresponding summary, are central to research on automatic summarization. Current work focuses on improving the faithfulness and quality of these pairs, in particular by reducing hallucination (summaries that contain information not supported by the source) and by building multilingual datasets to counter the field's English-language bias. Approaches under study include training methods such as contrastive learning and unlikelihood loss, as well as graph-based representations that capture document relationships more effectively. These advances aim to make summarization models more accurate and more widely applicable across languages and document types.
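As a rough illustration of one of the training objectives named above, the sketch below computes a token-level unlikelihood penalty, which grows as the model assigns more probability to tokens it should avoid (for example, tokens not supported by the source document). This is one common formulation and not necessarily the one used in any particular paper; the function name `unlikelihood_loss`, the per-step probability dictionaries, and the way negative tokens are chosen are all assumptions made for this example.

```python
import math


def unlikelihood_loss(token_probs, negative_token_ids, eps=1e-12):
    """Average unlikelihood penalty over all (step, negative token) pairs.

    token_probs: list with one dict per decoding step, mapping a token id
        to the probability the model assigns it at that step (hypothetical
        input format chosen for this sketch).
    negative_token_ids: list with one set per decoding step, holding the
        token ids that should be discouraged at that step.
    """
    total = 0.0
    count = 0
    for probs, negatives in zip(token_probs, negative_token_ids):
        for token_id in negatives:
            p = probs.get(token_id, 0.0)
            # Penalty is -log(1 - p): zero when p = 0, large as p nears 1.
            total += -math.log(max(1.0 - p, eps))
            count += 1
    return total / count if count else 0.0


# Two decoding steps; token 7 is treated as unsupported at both steps.
steps = [{7: 0.5, 3: 0.5}, {7: 0.1, 3: 0.9}]
negatives = [{7}, {7}]
loss = unlikelihood_loss(steps, negatives)
```

In practice a penalty of this kind is typically added to the usual likelihood objective on the reference summary rather than used alone.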

Papers