Paper ID: 2404.14780

ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions

Shounak Sural, Nishad Sahu, Ragunathan Rajkumar

The fusion of multimodal sensor data streams such as camera images and lidar point clouds plays an important role in the operation of autonomous vehicles (AVs). Robust perception across a range of adverse weather and lighting conditions is specifically required for AVs to be deployed widely. While multi-sensor fusion networks have been previously developed for perception in sunny and clear weather conditions, these methods show a significant degradation in performance under night-time and poor weather conditions. In this paper, we propose a simple yet effective technique called ContextualFusion to incorporate the domain knowledge about cameras and lidars behaving differently across lighting and weather variations into 3D object detection models. Specifically, we design a Gated Convolutional Fusion (GatedConv) approach for the fusion of sensor streams based on the operational context. To aid in our evaluation, we use the open-source simulator CARLA to create a multimodal adverse-condition dataset called AdverseOp3D to address the shortcomings of existing datasets being biased towards daytime and good-weather conditions. Our ContextualFusion approach yields an mAP improvement of 6.2% over state-of-the-art methods on our context-balanced synthetic dataset. Finally, our method enhances state-of-the-art 3D objection performance at night on the real-world NuScenes dataset with a significant mAP improvement of 11.7%.

Submitted: Apr 23, 2024