Sparse Autoencoders
Sparse autoencoders (SAEs) are neural networks that learn sparse representations of data: the network reconstructs its input from an encoding in which only a small fraction of units are active, typically enforced by a sparsity penalty on the hidden activations rather than by a low-dimensional bottleneck. Current research focuses on improving SAE robustness to noisy inputs, enhancing their interpretability by analyzing the learned features (e.g., in transformer models), and developing efficient training methods, including novel optimization algorithms and architectural modifications such as stacked ensembles. These advances are improving SAE performance in applications such as image compression, information retrieval, and data classification by enabling more efficient feature extraction and dimensionality reduction.
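To make the core recipe concrete, here is a minimal PyTorch-style sketch, not drawn from any specific paper: the class name SparseAutoencoder, the layer sizes, and the l1_coeff penalty weight are all illustrative assumptions. Note that the hidden layer is wider than the input; in an SAE the bottleneck comes from the sparsity penalty, not from reducing dimensionality.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal sparse autoencoder: one hidden layer whose
    activations are pushed toward sparsity by an L1 penalty."""

    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        # hidden_dim > input_dim: an overcomplete dictionary is typical.
        self.encoder = nn.Linear(input_dim, hidden_dim)
        self.decoder = nn.Linear(hidden_dim, input_dim)

    def forward(self, x: torch.Tensor):
        # ReLU keeps activations non-negative, so the L1 penalty
        # drives most units to exactly zero.
        z = torch.relu(self.encoder(x))
        x_hat = self.decoder(z)
        return x_hat, z

def sae_loss(x, x_hat, z, l1_coeff=1e-3):
    # Reconstruction error plus a sparsity penalty on the code;
    # l1_coeff is an illustrative value, tuned in practice.
    recon = torch.mean((x_hat - x) ** 2)
    sparsity = l1_coeff * z.abs().mean()
    return recon + sparsity

# Example training step on random data (shapes are illustrative).
model = SparseAutoencoder(input_dim=784, hidden_dim=2048)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 784)
x_hat, z = model(x)
loss = sae_loss(x, x_hat, z)
opt.zero_grad()
loss.backward()
opt.step()
```

The ReLU encoder with an L1 penalty is the simplest way to obtain mostly-zero codes; some variants instead impose a hard top-k constraint on the activations, which fixes the number of active units per input directly.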