Autoregressive Language Model

Autoregressive language models (ALMs) are neural networks that generate sequential data, primarily text, by predicting each element of a sequence from the elements that precede it. Current research aims to make ALMs more efficient through techniques such as speculative decoding and blockwise parallel decoding, and to extend their capabilities by incorporating visual information and addressing limitations in long-sequence modeling and knowledge distillation. These advances improve the speed and quality of generation in applications ranging from machine translation and text-to-speech synthesis to scene reconstruction and e-commerce.
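
The next-element prediction described above can be illustrated with a minimal sketch. It assumes a toy bigram count table in place of a neural network, and the names `train_bigram`, `next_token_probs`, `generate`, and `sequence_log_prob` are hypothetical helpers chosen for this example rather than part of any library or of the papers listed here. The generation loop and the chain-rule scoring follow the same pattern a real ALM would use, although a real model conditions on the whole preceding context rather than only the last token.

```python
import math
import random


def train_bigram(tokens):
    """Count how often each token follows another (toy stand-in for a trained model)."""
    counts = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        followers = counts.setdefault(prev, {})
        followers[nxt] = followers.get(nxt, 0) + 1
    return counts


def next_token_probs(counts, context):
    """Return p(next token | context). Assumption: only the last token is used."""
    followers = counts.get(context[-1], {})
    total = sum(followers.values())
    if total == 0:
        return {}
    return {tok: c / total for tok, c in followers.items()}


def generate(counts, prompt, max_new_tokens, rng=None):
    """Autoregressive loop: predict one token, append it, and repeat."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        probs = next_token_probs(counts, out)
        if not probs:
            break  # no known continuation for the current context
        candidates = sorted(probs)
        if rng is None:
            # Greedy decoding; ties go to the alphabetically first token.
            tok = max(candidates, key=probs.get)
        else:
            # Sampling from the predicted distribution.
            weights = [probs[t] for t in candidates]
            tok = rng.choices(candidates, weights=weights, k=1)[0]
        out.append(tok)
    return out


def sequence_log_prob(counts, tokens):
    """Chain rule: log p(x_2..x_T | x_1) = sum over t of log p(x_t | x_<t)."""
    log_p = 0.0
    for t in range(1, len(tokens)):
        p = next_token_probs(counts, tokens[:t]).get(tokens[t], 0.0)
        if p == 0.0:
            return float("-inf")
        log_p += math.log(p)
    return log_p


corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
greedy = generate(model, ["the"], max_new_tokens=5)  # ['the', 'cat', 'ran']
sampled = generate(model, ["the"], max_new_tokens=5, rng=random.Random(0))
```

Because each token is produced only after the previous one has been appended, generation is inherently sequential; that sequential dependency is what the efficiency techniques mentioned above try to reduce.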

Papers

June 17, 2024

LiLiuM: eBay's Large Language Models for e-commerce
Christian Herold, Michael Kozielski, Leonid Ekimov, Pavel Petrushkov, Pierre-Yves Vandenbussche, Shahram Khadivi
Large Language Model Machine Translation E Commerce Autoregressive Language Model Multilingual Text Word Model

June 9, 2024

Hidden Holes: topological aspects of language models
Stephen Fitz, Peter Romero, Jiyan Jonas Schneider
Large Language Model Language Model Topological Feature Recurrent Network Autoregressive Language Model Product Manifold

June 6, 2024

What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
Liyi Zhang, Michael Y. Li, Thomas L. Griffiths
Large Language Model Jina Embeddings Autoregressive Language Model Autoregressive Generative Model Latent State Model

June 4, 2024

Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis
Kun Zhou, Shengkui Zhao, Yukun Ma, Chong Zhang, Hao Wang, Dianwen Ng, Chongjia Ni, Nguyen Trung Hieu, Jia Qi Yip, Bin Ma
Autoregressive Language Model Text to Speech Model Text to Speech Synthesis

May 31, 2024

Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Jiatao Gu, Ying Shen, Shuangfei Zhai, Yizhe Zhang, Navdeep Jaitly, Joshua M. Susskind
Diffusion Model Latent Variable Conditional Diffusion Model Autoregressive Language Model Fair Representation Autoregressive Generative Model Discrete Latent Representation

May 18, 2024

LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
Victor Agostinelli, Sanghyun Hong, Lizhong Chen
Autoregressive Model Autoregressive Language Model Efficient Transformer Dynamic Reweighting Linear Transformer Speech to Text Translation Logical Proportion

May 15, 2024

A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
Mahsa Khoshnoodi, Vinija Jain, Mingye Gao, Malavika Srikanth, Aman Chadha
Large Language Model Natural Language Processing Text Generation Comprehensive Survey Autoregressive Language Model Fast Generation

May 6, 2024

Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training
Zexuan Zhong, Mengzhou Xia, Danqi Chen, Mike Lewis
Language Model Domain Specific Mixture of Expert Autoregressive Language Model

April 22, 2024

SpaceByte: Towards Deleting Tokenization from Large Language Modeling
Kevin Slagle
Large Language Model Autoregressive Language Model Sub Byte Byte Level

April 17, 2024

LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory
Zicheng Liu, Li Wang, Siyuan Li, Zedong Wang, Haitao Lin, Stan Z. Li
Transformer Model Long Sequence Vector Quantization Autoregressive Language Model Sequence Model Sequence Modeling Task

April 14, 2024

Exploring and Improving Drafts in Blockwise Parallel Decoding
Taehyeon Kim, Ananda Theertha Suresh, Kishore Papineni, Michael Riley, Sanjiv Kumar, Adrian Benton
Language Model Neural Language Model Autoregressive Language Model Inference Speed Token Generation Parallel Decoding k Draft

April 1, 2024

Do language models plan ahead for future tokens?
Wilson Wu, John X. Morris, Lionel Levine
Language Model Scientific Inference Autoregressive Language Model Inference Task

March 19, 2024

SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model
Armen Avetisyan, Christopher Xie, Henry Howard-Jenkins, Tsun-Yi Yang, Samir Aroudj, Suvam Patra, Fuyang Zhang, Duncan Frost, Luke Holland, Campbell Orme, Jakob Engel, Edward Miller, Richard Newcombe, Vasileios Balntas
Scene Representation Autoregressive Language Model Theatre Scene Description Large Scale Synthetic Dataset Dynamic Scene Representation

February 28, 2024

Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension
Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
Large Language Model Autoregressive Language Model Intrinsic Dimension Truthful Space Language Model Generation Local Intrinsic

February 22, 2024

CEV-LM: Controlled Edit Vector Language Model for Shaping Natural Language Generations
Samraj Moorjani, Adit Krishnan, Hari Sundaram
Language Model Text Generation Large Scale Language Model Autoregressive Language Model Language Generation Model CTC Based Semantic Content

February 19, 2024

February 12, 2024

November 12, 2023

Learning Globally Optimized Language Structure via Adversarial Training
Xuwang Yin
Learning Abstract Adversarial Training Autoregressive Language Model Text Generation Capability Language Structure

Self-AMPLIFY: Improving Small Language Models with Self Post Hoc Explanations

Revisiting Knowledge Distillation for Autoregressive Language Models

Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning

Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models
