Data to Text

Data-to-text generation (D2T) is the task of automatically converting structured data, such as tables, database records, or knowledge graphs, into coherent and accurate natural language text. Current research focuses on improving the accuracy and fluency of the generated text, with particular attention to low-resource languages and to hallucinations, that is, output that the input data does not support. Work in this area explores a range of model architectures, including transformer-based models and pipeline approaches, often built on pre-trained language models and combined with techniques such as re-ranking and constraint optimization to improve controllability and faithfulness. D2T has practical implications for accelerating scientific discovery, automating report generation, and making information more accessible across languages and domains.
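
To make the re-ranking idea concrete, here is a minimal Python sketch under stated assumptions. The record, the candidate sentences, and the helper names (`linearize`, `faithfulness_score`, `rerank`) are all hypothetical and are not taken from any particular paper or library. In practice the candidates would come from a generation model, and the scoring function would usually be a learned or more robust metric than this toy string-matching heuristic.

```python
import re

def linearize(record):
    """Flatten a record into a 'key: value' string, one common way to
    feed structured data to a sequence-to-sequence model."""
    return " | ".join(f"{k}: {v}" for k, v in record.items())

def faithfulness_score(record, text):
    """Toy heuristic: +1 for each record value mentioned in the text,
    -1 for each distinct number in the text that is absent from the record."""
    lowered = text.lower()
    values = [str(v).lower() for v in record.values()]
    # Naive substring matching; a real system would align slots more carefully.
    covered = sum(1 for v in values if v in lowered)
    record_numbers = set(re.findall(r"\d+(?:\.\d+)?", " ".join(values)))
    text_numbers = set(re.findall(r"\d+(?:\.\d+)?", lowered))
    unsupported = len(text_numbers - record_numbers)
    return covered - unsupported

def rerank(record, candidates):
    """Order candidate texts from most to least faithful under the toy score."""
    return sorted(candidates, key=lambda c: faithfulness_score(record, c), reverse=True)

# Hypothetical input record and hand-written stand-ins for model outputs.
record = {"name": "Aromi", "food": "Italian", "rating": 4}
candidates = [
    "Aromi is a restaurant.",                             # omits most of the data
    "Aromi serves Italian food and has a rating of 5.",   # hallucinated rating
    "Aromi serves Italian food and has a rating of 4.",   # covers all values
]

best = rerank(record, candidates)[0]
print(linearize(record))
print(best)
```

The heuristic rewards coverage of the input values and penalizes unsupported numbers, so the candidate with the wrong rating ranks below the one that matches the record. Substring matching is deliberately simplistic (for example, "4" would also match inside "14"), which is why it should be read as an illustration of the re-ranking pattern rather than a usable faithfulness metric.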

Papers