Dynamical Variational

Dynamical variational autoencoders (DVAEs) are a class of deep generative models designed to learn the underlying dynamics of sequential data, such as human motion, music, or sensor readings from autonomous vehicles. Current research focuses on improving DVAEs' ability to handle complex, multi-source data through architectures like mixtures of DVAEs and incorporating advanced components such as transformers and recurrent neural networks for better temporal modeling. These models find applications in diverse fields, including robotics, music generation, computer vision (e.g., object tracking), and speech processing, offering improved performance in tasks like motion denoising, multi-object tracking, and speech enhancement compared to traditional methods.

Papers