Prompt Compression
Prompt compression shortens the input prompts given to large language models (LLMs), cutting computational cost while aiming to preserve task performance. Research focuses on methods that keep only the information a model needs, using techniques such as extractive compression, summarization, and reinforcement learning to raise the compression ratio while preserving semantic meaning. This work matters because LLM resource consumption keeps growing, and shorter prompts make these models faster and cheaper to deploy across applications.
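As a rough illustration of the extractive approach mentioned above, the sketch below keeps only the context sentences that share words with the query, subject to a word budget. It is a toy example under stated assumptions, not a method from any of the listed papers: the function names (`compress_prompt`, `compression_ratio`), the word-overlap relevance score, and the use of word counts as a stand-in for model tokens are all hypothetical choices made for illustration.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., !, or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercased alphanumeric words; a crude stand-in for a real LLM tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def compress_prompt(context, query, max_tokens):
    """Keep the sentences most relevant to the query within a word budget."""
    query_terms = set(tokenize(query))
    sentences = split_sentences(context)

    # Score each sentence by how many distinct query words it contains.
    scored = []
    for idx, sentence in enumerate(sentences):
        words = tokenize(sentence)
        overlap = len(query_terms & set(words))
        scored.append((overlap, idx, len(words)))

    # Highest overlap first; ties go to the earlier sentence.
    scored.sort(key=lambda item: (-item[0], item[1]))

    kept, used = [], 0
    for overlap, idx, length in scored:
        if overlap == 0:
            break  # remaining sentences share nothing with the query
        if used + length <= max_tokens:
            kept.append(idx)
            used += length

    # Emit the kept sentences in their original order.
    return " ".join(sentences[i] for i in sorted(kept))


def compression_ratio(original, compressed):
    # Original length divided by compressed length, measured in words.
    return len(tokenize(original)) / max(1, len(tokenize(compressed)))
```

For instance, given a four-sentence context about the Eiffel Tower, bananas, and cats, and the query "When was the Eiffel Tower completed?", a budget of 6 words keeps only the sentence stating the completion year. Practical systems generally replace the overlap score with a learned relevance or perplexity-based signal, but the select-under-budget structure is similar.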
Papers
February 28, 2024
February 25, 2024
October 10, 2023
October 9, 2023
August 17, 2023
May 4, 2023
April 17, 2023
October 6, 2022