Transformer-Based Language Models
Transformer-based language models are deep learning architectures designed to process and generate human language, with the goal of capturing the nuances of natural language understanding and generation. Current research focuses on improving model interpretability, addressing contextualization errors, and exploring the internal mechanisms responsible for tasks such as reasoning and factual recall, often using models like BERT and GPT variants. These advances matter both to the scientific community, by deepening our understanding of neural networks and language processing, and to practical applications, where they enable improvements in machine translation, question answering, and other NLP tasks.
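To make the generation capability described above concrete, the sketch below loads a small GPT variant and produces a continuation for a prompt. It assumes the Hugging Face transformers library and the public gpt2 checkpoint; the model name and decoding parameters are illustrative choices, not a prescribed setup.

```python
# Minimal sketch: text generation with a transformer-based causal language model.
# Assumes the Hugging Face `transformers` library; "gpt2" is an illustrative checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM variant could be swapped in here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Transformer-based language models are"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a short continuation; the sampling settings are arbitrary defaults.
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same loading pattern applies to encoder models such as BERT (via AutoModel), which are typically fine-tuned for classification or question answering rather than open-ended generation.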
Papers
VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers
Shahar Katz, Yonatan Belinkov
Teaching Probabilistic Logical Reasoning to Transformers
Aliakbar Nafar, Kristen Brent Venable, Parisa Kordjamshidi
Explaining Emergent In-Context Learning as Kernel Regression
Chi Han, Ziqi Wang, Han Zhao, Heng Ji
Generic Dependency Modeling for Multi-Party Conversation
Weizhou Shen, Xiaojun Quan, Ke Yang
Mask-guided BERT for Few Shot Text Classification
Wenxiong Liao, Zhengliang Liu, Haixing Dai, Zihao Wu, Yiyang Zhang, Xiaoke Huang, Yuzhong Chen, Xi Jiang, Wei Liu, Dajiang Zhu, Tianming Liu, Sheng Li, Xiang Li, Hongmin Cai