Paper ID: 2401.16497

A Bayesian Gaussian Process-Based Latent Discriminative Generative Decoder (LDGD) Model for High-Dimensional Data

Navid Ziaei, Behzad Nazari, Uri T. Eden, Alik Widge, Ali Yousefi

Extracting meaningful information from high-dimensional data poses a formidable modeling challenge, particularly when the data is obscured by noise or represented through different modalities. This research proposes a novel non-parametric modeling approach that leverages the Gaussian process (GP) to characterize high-dimensional data by mapping it to a latent low-dimensional manifold. This model, named the latent discriminative generative decoder (LDGD), employs both the data and the associated labels in the manifold discovery process. We derive a Bayesian solution to infer the latent variables, allowing LDGD to effectively capture inherent stochasticity in the data. We demonstrate applications of LDGD on both synthetic and benchmark datasets. Not only does LDGD infer the manifold accurately, but its accuracy in predicting data points' labels also surpasses that of state-of-the-art approaches. In developing LDGD, we incorporated inducing points to reduce the computational complexity of Gaussian processes for large datasets, enabling batch training for more efficient processing and improved scalability. Additionally, we show that LDGD can robustly infer the manifold and precisely predict labels in scenarios in which the data size is limited, demonstrating its capability to efficiently characterize high-dimensional data with limited samples. These collective attributes highlight the importance of developing non-parametric modeling approaches to analyze high-dimensional data.
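
To illustrate the inducing-point idea mentioned in the abstract, the following is a minimal NumPy sketch of a generic sparse GP regression mean, using the projected-process (DTC) approximation. It is not the LDGD model or the authors' implementation: the function names (`rbf_kernel`, `inducing_point_mean`), the RBF kernel, the hyperparameter values, and the toy sine data are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def inducing_point_mean(x_train, y_train, x_test, z, noise_var=0.01, jitter=1e-6):
    """Predictive mean of a sparse GP with inducing inputs z (DTC approximation).

    Only an m x m system is solved, where m = len(z), instead of the
    n x n system an exact GP would require.
    """
    k_mm = rbf_kernel(z, z) + jitter * np.eye(len(z))  # (m, m)
    k_mn = rbf_kernel(z, x_train)                      # (m, n)
    k_sm = rbf_kernel(x_test, z)                       # (n_test, m)
    # mean = sigma^-2 * K_sm (K_mm + sigma^-2 K_mn K_nm)^-1 K_mn y
    a = k_mm + (k_mn @ k_mn.T) / noise_var
    return k_sm @ np.linalg.solve(a, k_mn @ y_train) / noise_var


# Hypothetical toy problem: 200 noise-free samples of sin(x), 10 inducing points.
x_train = np.linspace(-3.0, 3.0, 200)[:, None]
y_train = np.sin(x_train[:, 0])
z = np.linspace(-3.0, 3.0, 10)[:, None]
x_test = np.linspace(-2.5, 2.5, 50)[:, None]
mean = inducing_point_mean(x_train, y_train, x_test, z)
```

With n training points and m inducing points, this approximation costs O(n m^2) rather than the O(n^3) of an exact GP, which is the kind of saving that makes batch training on large datasets practical.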

Submitted: Jan 29, 2024