Long Text
Research on long text focuses on enabling large language models (LLMs) to process and generate extended textual content effectively, overcoming the context-length limitations of traditional transformer architectures. Current efforts concentrate on improving efficiency through optimized tokenization, novel attention mechanisms (such as sparse attention and multi-kernel transformers), and semantic compression techniques for handling longer sequences. This work is crucial for advancing numerous NLP applications, including machine translation, relation extraction from lengthy documents, and more accurate and efficient long-form factual text generation.
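To make the sparse-attention idea mentioned above concrete, the sketch below shows one simple variant, sliding-window attention, where each token attends only to a small local window of preceding tokens instead of the full sequence. This is an illustrative example, not the method of any paper listed here; the function name, window size, and toy dimensions are assumptions chosen for readability.

```python
# A minimal sketch (not from any of the listed papers) of sliding-window
# sparse attention: each token attends only to a local window of neighbors,
# reducing attention cost from O(n^2) to roughly O(n * w) for long sequences.
# All names and parameter values here are illustrative assumptions.
import numpy as np

def sliding_window_attention(q, k, v, window=4):
    """q, k, v: arrays of shape (seq_len, d). Returns (seq_len, d) outputs."""
    seq_len, d = q.shape
    out = np.zeros_like(v)
    for i in range(seq_len):
        # Restrict attention to a causal local window of the last `window` tokens.
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)   # scores over the window
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                     # softmax over the window
        out[i] = weights @ v[lo:i + 1]               # weighted sum of window values
    return out

# Toy usage: 16 tokens, 8-dimensional vectors, window of 4.
rng = np.random.default_rng(0)
q = rng.standard_normal((16, 8))
k = rng.standard_normal((16, 8))
v = rng.standard_normal((16, 8))
print(sliding_window_attention(q, k, v).shape)  # (16, 8)
```

Because each position touches at most `window` keys, the per-token cost stays constant as the sequence grows, which is the efficiency property that makes such mechanisms attractive for long-text processing.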
Papers
PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents
Simeng Sun, Yang Liu, Shuohang Wang, Chenguang Zhu, Mohit Iyyer
FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, Hannaneh Hajishirzi