Context Bigram
Context bigrams, pairs of consecutive words in text, are central to understanding how language models process and generate text. Current research leverages bigram statistics to improve model performance, notably through contrastive learning and the study of "statistical induction heads" within transformer architectures. This line of work aims to improve the quality of training data, raise accuracy on semantic textual relatedness tasks across multiple languages, and ultimately yield more reliable and efficient large language models.
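As a concrete illustration of the bigram statistics mentioned above, the sketch below counts consecutive word pairs in a token sequence and estimates the conditional probability of the next word given the current one. The function name and corpus are illustrative, not drawn from any of the cited work.

```python
from collections import Counter

def bigram_stats(tokens):
    """Count consecutive word pairs and estimate P(next word | current word).

    Returns (counts, conditional), where counts maps each (w1, w2) pair to
    its frequency and conditional maps (w1, w2) to count(w1, w2) divided by
    the number of bigrams starting with w1.
    """
    # All adjacent pairs in order of appearance.
    counts = Counter(zip(tokens, tokens[1:]))
    # How often each word occurs as the first element of a bigram.
    first_counts = Counter(w1 for (w1, _), c in counts.items() for _ in range(c))
    conditional = {
        (w1, w2): c / first_counts[w1] for (w1, w2), c in counts.items()
    }
    return counts, conditional

tokens = "the cat sat on the mat the cat ran".split()
counts, cond = bigram_stats(tokens)
# "the" is followed by "cat" twice and by "mat" once,
# so P(cat | the) = 2/3.
```

A statistical induction head, informally, learns an in-context version of exactly this table: given the current token, it attends to earlier occurrences of that token and predicts the word that followed it.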