Referring Expression Grounding
Referring Expression Grounding (REG) aims to enable machines to locate the object in an image or 3D scene that a natural language description refers to. Current research emphasizes improving the robustness and accuracy of REG across diverse domains and under limited data, exploring techniques such as multi-modal domain adaptation, reinforcement learning with human feedback, and transformer architectures whose attention mechanisms better integrate visual and linguistic information. These advances are important for improving human-computer interaction and for enabling more sophisticated applications in areas like robotics, augmented reality, and image retrieval.
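As a rough illustration of how visual and linguistic information can be integrated, the sketch below scores candidate image regions against an expression embedding using scaled dot-product attention and picks the highest-weighted region. It is a toy under stated assumptions, not the method of any particular paper: the function name `ground_expression` is hypothetical, and the hand-written vectors stand in for embeddings that a real system would obtain from trained text and vision encoders.

```python
import math


def ground_expression(text_vec, region_vecs):
    """Return (best_index, attention_weights) over candidate regions.

    Hypothetical helper: each region is scored by a scaled dot product
    with the expression embedding, and the scores are turned into
    attention weights with a softmax.
    """
    if not region_vecs:
        raise ValueError("at least one candidate region is required")

    scale = math.sqrt(len(text_vec))
    scores = [
        sum(t * r for t, r in zip(text_vec, region)) / scale
        for region in region_vecs
    ]

    # Numerically stable softmax over the region scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights


# Toy embeddings (assumed values, for illustration only).
expression = [1.0, 0.0, 1.0]      # stands in for an encoded referring expression
regions = [
    [0.9, 0.1, 0.8],              # candidate region 0
    [0.1, 0.9, 0.0],              # candidate region 1
    [0.2, 0.2, 0.1],              # candidate region 2
]

best, weights = ground_expression(expression, regions)
print("selected region:", best)
print("attention weights:", [round(w, 3) for w in weights])
```

In practice the candidate regions would come from an object detector or a grid of image patches, and the selected index would map back to a bounding box or a 3D object in the scene.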
Papers
October 26, 2023
September 23, 2023
May 22, 2023
April 4, 2023
July 18, 2022
July 5, 2022
January 18, 2022