3D Dense Captioning
3D dense captioning aims to generate a descriptive sentence for each object in a 3D scene, which requires both precise object localization and rich contextual understanding. Recent work relies heavily on transformer-based encoder-decoder architectures, often aggregating contextual and instance-specific features late in the pipeline or decoupling localization and caption generation into parallel processes to improve accuracy. The task is important for advancing 3D scene understanding and has implications for applications such as robotics, augmented reality, and accessibility technologies that need detailed scene descriptions.
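The two strategies mentioned above can be illustrated with a toy sketch. It is not taken from any particular paper: the function name `localize_and_caption`, the random weight matrices, the 6-value box encoding, and the use of mean pooling for scene context are all assumptions chosen for illustration. The box head and the caption head read the same object queries independently (decoupled, parallel prediction), and scene context is concatenated with each instance feature only just before the caption head (one possible reading of late aggregation).

```python
import numpy as np

rng = np.random.default_rng(0)

def localize_and_caption(queries, context, w_box, w_cap):
    """Toy decoupled heads over shared object queries (illustrative only)."""
    # Localization branch: each query regresses a box (assumed: center xyz + size xyz).
    boxes = queries @ w_box                                  # (num_queries, 6)

    # Captioning branch: scene context is pooled and fused with the
    # instance features only at this late stage.
    pooled = context.mean(axis=0)                            # (dim,)
    tiled = np.broadcast_to(pooled, queries.shape)           # (num_queries, dim)
    fused = np.concatenate([queries, tiled], axis=1)         # (num_queries, 2 * dim)
    cap_logits = fused @ w_cap                               # (num_queries, vocab_size)
    first_tokens = cap_logits.argmax(axis=1)                 # first word id per object
    return boxes, first_tokens

# Hypothetical sizes; real models use learned weights and far larger dimensions.
num_queries, num_context, dim, vocab_size = 4, 16, 8, 10
queries = rng.normal(size=(num_queries, dim))    # stand-in for decoder object queries
context = rng.normal(size=(num_context, dim))    # stand-in for encoded scene tokens
w_box = rng.normal(size=(dim, 6))
w_cap = rng.normal(size=(2 * dim, vocab_size))

boxes, first_tokens = localize_and_caption(queries, context, w_box, w_cap)
print(boxes.shape, first_tokens.shape)
```

Neither branch consumes the other's output, so in this sketch the two predictions could be computed in parallel. An actual captioner would decode a full sentence autoregressively rather than pick a single token.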