Multi-Modal Object Tracking
Multi-modal object tracking (MMOT) aims to locate objects robustly across video sequences by combining information from multiple modalities, such as visual, depth, and thermal data, and even natural language. Current research emphasizes efficient, generalizable models, often built on transformer-based architectures or adapted from pre-trained models through techniques such as prompt tuning and self-distillation, so that they can handle diverse modalities and perform better in challenging conditions. The field matters for applications such as autonomous driving and surveillance, where tracking must remain reliable despite the limitations of any individual sensor.
Papers
May 23, 2024
March 24, 2024
March 23, 2024