Paper ID: 2303.06721

Knowledge-integrated AutoEncoder Model

Teddy Lazebnik, Liron Simon-Keren

Data encoding is a common and central operation in most data analysis tasks. The performance of other models, downstream in the computational process, highly depends on the quality of data encoding. One of the most powerful ways to encode data is using the neural network AutoEncoder (AE) architecture. However, the developers of AE are not able to easily influence the produced embedding space, as it is usually treated as a \textit{black box} technique, which makes it uncontrollable and not necessarily has desired properties for downstream tasks. In this paper, we introduce a novel approach for developing AE models that can integrate external knowledge sources into the learning process, possibly leading to more accurate results. The proposed \methodNamefull{} (\methodName{}) model is able to leverage domain-specific information to make sure the desired distance and neighborhood properties between samples are preservative in the embedding space. The proposed model is evaluated on three large-scale datasets from three different scientific fields and is compared to nine existing encoding models. The results demonstrate that the \methodName{} model effectively captures the underlying structures and relationships between the input data and external knowledge, meaning it generates a more useful representation. This leads to outperforming the rest of the models in terms of reconstruction accuracy.

Submitted: Mar 12, 2023