Long Input
Processing long input sequences remains a major challenge for large language models (LLMs), limiting their use in domains that require extensive contextual information, such as long-document summarization and question answering. Current research focuses on efficient compression techniques, novel model architectures (e.g., graph-based agents and state-space models), and training strategies that mitigate the computational and memory costs of long inputs. These advances are crucial for improving the performance and applicability of LLMs on tasks involving extensive textual or multimodal data, with impact on fields ranging from information retrieval to medical diagnosis.