Long Context
Research on long context in large language models (LLMs) focuses on extending these models' ability to process and reason over input sequences far longer than traditional context windows allow. Current work emphasizes novel attention mechanisms (e.g., sparse attention, differential attention) and efficient memory management techniques (e.g., compression, retrieval augmentation) that reduce the computational and memory bottlenecks of longer contexts. This area is crucial for advancing LLMs' capabilities on complex tasks that require a holistic understanding of extensive information, such as question answering, summarization, and multi-modal reasoning, and it shapes both the scientific understanding of LLMs and their practical applications.
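For intuition, the sketch below illustrates one common flavor of sparse attention, sliding-window (local causal) attention, in PyTorch. It is a minimal illustration under assumed tensor shapes, not an implementation from any of the papers listed here: it still materializes the full score matrix rather than using a block-sparse kernel, and the function name and window size are hypothetical.

```python
import torch
import torch.nn.functional as F

def sliding_window_attention(q, k, v, window: int):
    """Causal attention where each query attends only to the `window` most
    recent keys (a simple sparse-attention pattern).
    q, k, v: tensors of shape (batch, heads, seq_len, head_dim)."""
    seq_len = q.size(-2)
    scale = q.size(-1) ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale  # (B, H, T, T)

    # Local causal mask: query position i may see keys in [i - window + 1, i].
    idx = torch.arange(seq_len, device=q.device)
    rel = idx[None, :] - idx[:, None]        # key index minus query index
    allowed = (rel <= 0) & (rel > -window)   # causal and within the window
    scores = scores.masked_fill(~allowed, float("-inf"))

    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)

# Toy usage: batch=1, heads=2, seq_len=8, head_dim=4, window of 3 tokens.
q, k, v = (torch.randn(1, 2, 8, 4) for _ in range(3))
out = sliding_window_attention(q, k, v, window=3)
print(out.shape)  # torch.Size([1, 2, 8, 4])
```

The point of such patterns is that attention cost grows with the window size rather than with the full sequence length once a dedicated sparse kernel is used; efficient long-context methods combine ideas like this with memory-side techniques such as KV cache compression or retrieval augmentation.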
Papers
Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation
Aadharsh Aadhithya A, Sachin Kumar S, Soman K. P
The Impact of Quantization on Retrieval-Augmented Generation: An Analysis of Small LLMs
Mert Yazan, Suzan Verberne, Frederik Situmeang
RepoQA: Evaluating Long Context Code Understanding
Jiawei Liu, Jia Le Tian, Vijay Daita, Yuxiang Wei, Yifeng Ding, Yuhan Katherine Wang, Jun Yang, Lingming Zhang
Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, Sercan Ö. Arik
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
Alex Jinpeng Wang, Linjie Li, Yiqi Lin, Min Li, Lijuan Wang, Mike Zheng Shou
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Zefan Cai, Yichi Zhang, Bofei Gao, Yuliang Liu, Tianyu Liu, Keming Lu, Wayne Xiong, Yue Dong, Baobao Chang, Junjie Hu, Wen Xiao