Transformer-Based Language Models
Transformer-based language models are deep learning architectures designed to process and generate human language. Current research focuses on improving model interpretability, addressing contextualization errors, and probing the internal mechanisms behind tasks such as reasoning and factual recall, often using models like BERT and GPT variants. These advances matter both scientifically, by deepening our understanding of neural networks and language processing, and practically, by improving machine translation, question answering, and other NLP tasks.
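To make the generation and masked-prediction tasks mentioned above concrete, here is a minimal Python sketch assuming the Hugging Face transformers library is installed (it is not referenced by any of the papers listed below); the model names (gpt2, bert-base-uncased) and prompts are illustrative placeholders.

    # Minimal sketch; assumes `pip install transformers torch`.
    from transformers import pipeline

    # GPT-style decoder: autoregressive text generation (next-token prediction).
    generator = pipeline("text-generation", model="gpt2")
    generated = generator(
        "Transformer-based language models are",
        max_new_tokens=20,       # cap the number of newly generated tokens
        num_return_sequences=1,
    )
    print(generated[0]["generated_text"])

    # BERT-style encoder: masked-token prediction, a simple probe of factual recall.
    unmasker = pipeline("fill-mask", model="bert-base-uncased")
    for prediction in unmasker("Paris is the capital of [MASK]."):
        print(f"{prediction['token_str']}: {prediction['score']:.3f}")

The two pipelines contrast the decoder-only (generation) and encoder-only (masked prediction) uses of the transformer architecture discussed in this section; either can be swapped for other checkpoints without changing the calling code.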
Papers
Interpreting Key Mechanisms of Factual Recall in Transformer-Based Language Models
Ang Lv, Yuhan Chen, Kaiyi Zhang, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan
A Benchmark Evaluation of Clinical Named Entity Recognition in French
Nesrine Bannour, Christophe Servan, Aurélie Névéol, Xavier Tannier
Mechanics of Next Token Prediction with Self-Attention
Yingcong Li, Yixiao Huang, M. Emrullah Ildiz, Ankit Singh Rawat, Samet Oymak
Chronos: Learning the Language of Time Series
Abdul Fatir Ansari, Lorenzo Stella, Caner Turkmen, Xiyuan Zhang, Pedro Mercado, Huibin Shen, Oleksandr Shchur, Syama Sundar Rangapuram, Sebastian Pineda Arango, Shubham Kapoor, Jasper Zschiegner, Danielle C. Maddix, Hao Wang, Michael W. Mahoney, Kari Torkkola, Andrew Gordon Wilson, Michael Bohlke-Schneider, Yuyang Wang