Modal Object Tracking

Modal object tracking aims to make visual object tracking more robust by combining information from multiple sensing modalities, such as RGB, depth, thermal, and near-infrared images. Current research focuses on fusing these data sources efficiently, often with transformer-based architectures or prototype learning approaches that handle modality switches and appearance variations. The field matters because it addresses the limitations of single-modality tracking in challenging environments, supporting more reliable tracking in applications such as autonomous driving, robotics, and surveillance.

Papers