Long Context Large Language Model
Long-context large language models (LLMs) aim to overcome the fixed, relatively short context windows of traditional LLMs by processing significantly longer input sequences, enabling more comprehensive understanding and generation of text. Current research focuses on improving efficiency through techniques such as sparse attention mechanisms, optimized memory management (e.g., KV cache compression), and efficient training strategies, as well as on developing robust evaluation benchmarks that assess performance on diverse, realistic long-context tasks. This field is crucial for advancing natural language processing in applications that require deep understanding of extensive documents, such as multi-document summarization, question answering, and complex reasoning across various domains.
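To make the efficiency point concrete, the following is a minimal, illustrative sketch (not taken from the papers listed below) of sliding-window sparse attention, one of the simplest sparse attention patterns used for long contexts: each token attends only to the previous `window` tokens rather than the full sequence, reducing the attention cost from O(n^2) to O(n * window). The function name and shapes are assumptions for illustration only.

import numpy as np

def sliding_window_attention(q, k, v, window=4):
    """q, k, v: arrays of shape (seq_len, d). Returns an array of shape (seq_len, d)."""
    seq_len, d = q.shape
    out = np.zeros_like(v)
    for i in range(seq_len):
        start = max(0, i - window + 1)              # attend only to the last `window` positions
        scores = q[i] @ k[start:i + 1].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())      # softmax over the local window
        weights /= weights.sum()
        out[i] = weights @ v[start:i + 1]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q, k, v = (rng.normal(size=(16, 8)) for _ in range(3))
    print(sliding_window_attention(q, k, v, window=4).shape)  # (16, 8)

Production systems typically combine such local patterns with a few global tokens or learned sparse patterns; this toy version only shows the core idea of restricting each query's attention to a bounded neighborhood.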
Papers
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
Jiajie Zhang, Yushi Bai, Xin Lv, Wanjun Gu, Danqing Liu, Minhao Zou, Shulin Cao, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li
DetectiveQA: Evaluating Long-Context Reasoning on Detective Novels
Zhe Xu, Jiasheng Ye, Xiangyang Liu, Tianxiang Sun, Xiaoran Liu, Qipeng Guo, Linlin Li, Qun Liu, Xuanjing Huang, Xipeng Qiu