Long Input
Processing long input sequences remains a major challenge for large language models (LLMs) and limits their use in domains that require extensive context, such as long-document summarization and question answering. Current research focuses on efficient compression techniques, novel model architectures (e.g., graph-based agents, state-space models), and training strategies that mitigate the computational and memory costs of long inputs. These advances improve the performance and applicability of LLMs on tasks involving extensive textual or multimodal data, with impact on fields ranging from information retrieval to medical diagnosis.
Papers
December 31, 2024
December 24, 2024
December 18, 2024
November 14, 2024
October 18, 2024
October 9, 2024
October 8, 2024
July 31, 2024
July 28, 2024
July 18, 2024
July 2, 2024
June 20, 2024
June 7, 2024
May 28, 2024
April 22, 2024
March 15, 2024
February 19, 2024
January 31, 2024
November 15, 2023