Greedy Decoding

Greedy decoding is a fast, deterministic approach to generating text from large language models (LLMs): at each step the model emits the single most probable next token. Its limitations in output quality have spurred research into improved methods. Current efforts focus on improving on greedy decoding through techniques such as minimum Bayes risk (MBR) decoding, which uses an external utility or evaluator to select the best output from a set of candidates, and self-ensembling methods that aggregate results from diverse prompts or multiple model runs. These advances aim to improve the accuracy and efficiency of LLM text generation, with applications ranging from machine translation to speech recognition.
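The core loop is simple to sketch. Below is a minimal illustration, where `toy_logits` is a hypothetical stand-in for a model forward pass (a real LLM would return one logit per vocabulary token); the vocabulary and score table are made up for demonstration.

```python
VOCAB = ["<eos>", "hello", "world", "!"]

def toy_logits(prefix):
    """Hypothetical stand-in for a model forward pass: returns one
    score per vocabulary entry given the tokens generated so far."""
    table = {
        (): [0.0, 3.0, 1.0, 0.5],                   # start -> "hello"
        ("hello",): [0.1, 0.0, 2.5, 1.0],           # -> "world"
        ("hello", "world"): [0.2, 0.0, 0.0, 2.0],   # -> "!"
        ("hello", "world", "!"): [5.0, 0.0, 0.0, 0.0],  # -> <eos>
    }
    return table[tuple(prefix)]

def greedy_decode(max_steps=10):
    """At each step, pick the single highest-scoring token (argmax).
    Deterministic: the same prefix always yields the same output."""
    out = []
    for _ in range(max_steps):
        logits = toy_logits(out)
        best = max(range(len(logits)), key=lambda i: logits[i])
        token = VOCAB[best]
        if token == "<eos>":
            break
        out.append(token)
    return out

print(" ".join(greedy_decode()))  # hello world !
```

Because only the single argmax token is kept at each step, greedy decoding can miss globally better sequences; this is the weakness that candidate-reranking methods such as MBR decoding address by scoring whole outputs.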

Papers