Transformer Based Language Model

Transformer-based language models are deep learning architectures designed to process and generate human language, aiming to understand and replicate the nuances of natural language understanding and generation. Current research focuses on improving model interpretability, addressing contextualization errors, and exploring the internal mechanisms responsible for tasks like reasoning and factual recall, often using models like BERT and GPT variants. These advancements are significant for both the scientific community, furthering our understanding of neural networks and language processing, and for practical applications, enabling improvements in machine translation, question answering, and other NLP tasks.

Papers

December 29, 2022

Maximizing Use-Case Specificity through Precision Model Tuning
Pranjali Awasthi, David Recio-Mitter, Yosuke Kyle Sugi
Language Model Large Corpus Information Retrieval Transformer Based Language Model Larger Language Model Balancing Efficiency Parameter Tuning

December 28, 2022

Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping, Tom Goldstein
Language Model Transformer Based Language Model Single GPU

December 23, 2022

Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?
Byung-Doh Oh, William Schuler
Pre Trained Language Model Transformer Based Model Transformer Based Language Model Language Processing Reading Time Surprisal Driven

December 21, 2022

Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering
Akshay Chaturvedi, Swarnadeep Bhar, Soumadeep Saha, Utpal Garain, Nicholas Asher
Language Model Question Answering Transformer Based Language Model Negation Detection Health Intervention Semantic Fidelity Ethical NaTural Language

December 20, 2022

Measure More, Question More: Experimental Studies on Transformer-based Language Models and Complement Coercion
Yuling Gu
Yes No Question Transformer Based Language Model Experimental Study Implicit Knowledge Language Understanding Task Complement Historical Analysis

December 6, 2022

LUNA: Language Understanding with Number Augmentations on Transformers via Number Plugins and Pre-training
Hongwei Han, Jialiang Xu, Mengyu Zhou, Yijia Shao, Shi Han, Dongmei Zhang
Language Model Transformer Megatron Decepticons Language Understanding Transformer Based Language Model Numerical Data Numerical Reasoning Generic Plugin Large Scale Transformer Model Weak Augmentation

December 1, 2022

November 25, 2022

TRAC: A Textual Benchmark for Reasoning about Actions and Change
Weinan He, Canming Huang, Zhanhao Xiao, Yongmei Liu
Complex Reasoning Transformer Based Language Model Past Action Pre Change Information Text Benchmark Structural Generalization High Performance Transformer

November 10, 2022

Can Transformers Reason in Fragments of Natural Language?
Viktor Schlegel, Kamen V. Pavlov, Ian Pratt-Hartmann
Natural Language Processing Natural Language Transformer Based Language Model Natural Language Text Non Verbal Fragment

November 1, 2022

VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding
Dou Hu, Xiaolong Hou, Xiyang Du, Mengyuan Zhou, Lianxin Jiang, Yang Mo, Xiaofeng Shi
Language Model Transformer Based Language Model Contextual Representation Domain Corpus Masked Autoencoder

October 31, 2022

Tables to LaTeX: structure and content extraction from scientific tables
Pratik Kayal, Mrinal Anand, Harsh Desai, Mayank Singh
Transformer Based Language Model Inner Structure Table Semantics Table Structure PDF Document

October 27, 2022

Fast DistilBERT on CPUs
Haihao Shen, Ofir Zafrir, Bo Dong, Hengyu Meng, Xinyu Ye, Zhe Wang, Yi Ding, Hanwen Chang, Guy Boudoukh, Moshe Wasserblat
Transformer Model Transformer Based Language Model Transformer Inference Fast Transformer

October 26, 2022

End-to-End Multimodal Representation Learning for Video Dialog
Huda Alamri, Anthony Bilic, Michael Hu, Apoorva Beedu, Irfan Essa
Transformer Based Language Model Visual Encoder Multimodal Representation Learning State of the Art Encoders Multimodal Problem Video Dialog Visual Dialog Task

October 22, 2022

October 19, 2022

Language Detoxification with Attribute-Discriminative Latent Space
Jin Myung Kwak, Minseon Kim, Sung Ju Hwang
Language Model Text Generation Transformer Based Language Model Language Model Detoxification

October 16, 2022

Model Criticism for Long-Form Text Generation
Yuntian Deng, Volodymyr Kuleshov, Alexander M. Rush
Language Model Text Generation Transformer Based Language Model Coreference Resolution Model Criticism

October 13, 2022

Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers
Chia-Chien Hung, Anne Lauscher, Dirk Hovy, Simone Paolo Ponzetto, Goran Glavaš
Transformer Megatron Decepticons Text Classification Transformer Based Language Model Natural Language Processing Model Speech Based Age Language Representation Language Modelling Demographic Attribute Societal Adaptation

October 6, 2022

To Softmax, or not to Softmax: that is the question when applying Active Learning for Transformer Models
Julius Gonsior, Christian Falkenberg, Silvio Magino, Anja Reusch, Maik Thiele, Wolfgang Lehner
Active Learning Transformer Model Yes No Question Transformer Based Language Model Softmax Function Natural Language Processing Application

Transformer Based Language Model

Papers

Maximizing Use-Case Specificity through Precision Model Tuning

Cramming: Training a Language Model on a Single GPU in One Day

Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?

Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering

Measure More, Question More: Experimental Studies on Transformer-based Language Models and Complement Coercion

LUNA: Language Understanding with Number Augmentations on Transformers via Number Plugins and Pre-training

Embedding generation for text classification of Brazilian Portuguese user reviews: from bag-of-words to transformers

CultureBERT: Measuring Corporate Culture With Transformer-Based Language Models

TRAC: A Textual Benchmark for Reasoning about Actions and Change

Can Transformers Reason in Fragments of Natural Language?

VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding

Tables to LaTeX: structure and content extraction from scientific tables

Fast DistilBERT on CPUs

End-to-End Multimodal Representation Learning for Video Dialog

Understanding Domain Learning in Language Models Through Subpopulation Analysis

Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks

Language Detoxification with Attribute-Discriminative Latent Space

Model Criticism for Long-Form Text Generation

Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers

To Softmax, or not to Softmax: that is the question when applying Active Learning for Transformer Models