Truncation Sampling

Truncation sampling improves the quality and diversity of text generated by large language models (LLMs) by discarding the low-probability tail of the next-token distribution at each decoding step; related truncation ideas also appear in diffusion-based generative models. Current research focuses on adaptive truncation methods, such as min-p sampling and η-sampling, that dynamically adjust the probability threshold to balance coherence and creativity, addressing limitations of simpler fixed-cutoff methods like top-k and top-p (nucleus) sampling. These methods aim to mitigate issues like exposure bias and neural text degeneration, yielding more fluent and diverse outputs in applications ranging from open-ended text generation to spam classification and image generation.
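To make the truncation rules mentioned above concrete, here is a minimal sketch of two of them over a toy next-token distribution. The function names and the example probabilities are illustrative, not from any specific library; top-p keeps the smallest set of tokens whose cumulative mass reaches a fixed budget, while min-p sets a cutoff relative to the most likely token, so the kept set shrinks when the model is confident and grows when it is uncertain.

```python
def top_p_truncate(probs, p=0.9):
    """Nucleus (top-p) truncation: keep the smallest set of tokens
    whose cumulative probability reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = set(), 0.0
    for i in order:
        kept.add(i)
        mass += probs[i]
        if mass >= p:                       # stop once the nucleus covers mass p
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

def min_p_truncate(probs, min_p=0.1):
    """Min-p truncation: the threshold scales with the top token's
    probability, so the cutoff adapts to the model's confidence."""
    threshold = min_p * max(probs)          # relative, not absolute, cutoff
    kept = [q if q >= threshold else 0.0 for q in probs]
    total = sum(kept)
    return [q / total for q in kept]

# Toy distribution over five tokens, sorted here for readability.
probs = [0.5, 0.25, 0.15, 0.07, 0.03]
print(top_p_truncate(probs, p=0.9))       # keeps the top three tokens
print(min_p_truncate(probs, min_p=0.2))   # cutoff is 0.2 * 0.5 = 0.1
```

In this toy case both rules happen to keep the same three tokens, but they diverge in general: top-p keeps a fixed probability mass regardless of the distribution's shape, whereas min-p keeps more candidates when the distribution is flat.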

Papers