Long Text
Research on long text focuses on enabling large language models (LLMs) to process and generate extended textual content effectively, overcoming the fixed context-window limits of standard transformer architectures. Current efforts concentrate on improving efficiency through optimized tokenization, novel attention mechanisms (such as sparse attention and multi-kernel transformers), and semantic-compression techniques that let models handle longer sequences. This work is crucial for numerous NLP applications, including machine translation, relation extraction from lengthy documents, and more accurate and efficient factual text generation.
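To make the sparse-attention idea mentioned above concrete, the sketch below restricts each token to a sliding window of recent positions, which reduces the quadratic cost of full attention on long sequences. It is a minimal NumPy illustration under assumed names (`sliding_window_attention_mask`, `sparse_attention`, `window`), not the method of any paper listed on this page.

```python
import numpy as np

def sliding_window_attention_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask where position i may attend only to the `window`
    most recent positions (including itself). True = may attend."""
    idx = np.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]          # keys not in the future
    local = idx[:, None] - idx[None, :] < window   # keys within the window
    return causal & local

def sparse_attention(q, k, v, window: int):
    """Scaled dot-product attention restricted to a sliding window.
    q, k, v: arrays of shape (seq_len, d)."""
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    mask = sliding_window_attention_mask(seq_len, window)
    scores = np.where(mask, scores, -np.inf)       # block disallowed positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 8 tokens, each attending to at most 3 past positions
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 16)) for _ in range(3))
out = sparse_attention(q, k, v, window=3)
print(out.shape)  # (8, 16)
```

In practice, production long-context systems combine such locality patterns with global tokens, dilated windows, or learned sparsity; this example shows only the basic masking mechanism.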
Papers
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Yiran Ding, Li Lyna Zhang, Chengruidong Zhang, Yuanyuan Xu, Ning Shang, Jiahang Xu, Fan Yang, Mao Yang
LongWanjuan: Towards Systematic Measurement for Long Text Quality
Kai Lv, Xiaoran Liu, Qipeng Guo, Hang Yan, Conghui He, Xipeng Qiu, Dahua Lin