Encoded Concept
Research on encoded concepts seeks to understand how human-interpretable concepts are represented within the latent spaces of complex machine learning models, primarily to improve interpretability and trustworthiness. Current work develops methods to identify and manipulate these encoded concepts, using techniques such as probabilistic encoding with energy-based models, binarized regularization in generative models, and subspace analysis in transformer networks. This line of research is important for building more reliable and explainable AI systems: it deepens understanding of model behavior and can inform better model design and more effective human-computer interaction.
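One common way to test whether a concept is linearly encoded in a model's latent space is a linear probe: a classifier trained on hidden activations to predict the concept, whose weight vector then gives a direction along which the concept can be manipulated. The sketch below is a minimal, self-contained illustration on synthetic "activations" (the dimensionality, concept direction, and signal strength are all hypothetical, not drawn from any particular paper); it trains a logistic-regression probe with gradient descent and then "erases" the concept by projecting activations off the probe direction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 64-d "activations" with a binary concept
# linearly encoded along one hidden direction.
d, n = 64, 2000
concept_dir = rng.normal(size=d)
concept_dir /= np.linalg.norm(concept_dir)

labels = rng.integers(0, 2, size=n)            # concept present / absent
noise = rng.normal(size=(n, d))
acts = noise + np.outer(2.0 * labels - 1.0, concept_dir) * 1.5

# Linear probe: logistic regression trained by plain gradient descent.
w, b = np.zeros(d), 0.0
lr = 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(acts @ w + b)))  # predicted probabilities
    w -= lr * (acts.T @ (p - labels)) / n
    b -= lr * np.mean(p - labels)

acc = np.mean(((acts @ w + b) > 0) == labels)
print(f"probe accuracy: {acc:.3f}")

# Concept "erasure": remove the component along the probe direction and
# check the concept is no longer decodable by this probe (chance level).
u = w / np.linalg.norm(w)
erased = acts - np.outer(erased_proj := acts @ u, u)
acc_erased = np.mean(((erased @ w + b) > 0) == labels)
print(f"accuracy after erasure: {acc_erased:.3f}")
```

After projection the probe's logit collapses to the bias term alone, so accuracy falls to roughly chance; this is the simplest instance of the identify-then-manipulate pattern the methods above elaborate on.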