N-Gram
N-grams, contiguous sequences of words or characters, are fundamental units in language modeling, used primarily to predict the probability of a word given its preceding context. Current research focuses on improving the efficiency and effectiveness of n-gram-based methods within large language models (LLMs), particularly by incorporating them into novel training objectives (such as next-distribution prediction) or leveraging them for faster inference (e.g., through speculative decoding and token recycling). This work is significant because it addresses limitations of LLMs, such as over-reliance on local dependencies and high computational costs, leading to improvements in model performance, training efficiency, and practical applications like speech-to-text and machine translation.
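To make the core idea concrete, the sketch below shows a minimal count-based n-gram model that estimates the probability of a word given its preceding context with a simple maximum-likelihood estimate. This is an illustrative baseline only, not the method of any particular paper summarized above; the function names and the `<s>`/`</s>` padding tokens are assumptions for the example, and real systems typically add smoothing for unseen n-grams.

```python
from collections import Counter

def train_ngram_model(tokens, n=3):
    """Count n-grams and their (n-1)-gram contexts from a token list."""
    ngram_counts = Counter()
    context_counts = Counter()
    # Pad with start/end markers so early and final words have full contexts.
    padded = ["<s>"] * (n - 1) + tokens + ["</s>"]
    for i in range(len(padded) - n + 1):
        context = tuple(padded[i:i + n - 1])
        word = padded[i + n - 1]
        ngram_counts[(context, word)] += 1
        context_counts[context] += 1
    return ngram_counts, context_counts

def ngram_probability(word, context, ngram_counts, context_counts):
    """Maximum-likelihood estimate of P(word | context); 0.0 if context unseen."""
    context = tuple(context)
    if context_counts[context] == 0:
        return 0.0
    return ngram_counts[(context, word)] / context_counts[context]

if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ate the fish".split()
    ngrams, contexts = train_ngram_model(corpus, n=2)
    # Bigram estimate: P("cat" | "the") = count("the cat") / count("the")
    print(ngram_probability("cat", ["the"], ngrams, contexts))  # -> 0.5
```

In practice, this count-and-divide estimate is the baseline that neural approaches extend: the research directions noted above keep the n-gram as a unit of prediction while replacing raw counts with learned distributions or using n-gram matches to draft and verify tokens during inference.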