Multimodal Data Augmentation
Multimodal data augmentation aims to improve machine learning models that process multiple data types (e.g., images and text) by artificially expanding their training datasets. Current research focuses on methods that generate realistic, semantically consistent augmented data, often through sample mixing, attribute manipulation guided by knowledge bases, and feature-space transformations. These methods matter because they compensate for the limitations of existing multimodal datasets, improving model accuracy and generalization in applications such as 3D object recognition, image captioning, and visual question answering.
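As a rough illustration of the mixing idea, the sketch below interpolates two image-text pairs in feature space, using the same mixing weight for both modalities so that the augmented image and text features stay aligned. It is a minimal sketch under stated assumptions, not the method of any particular paper: each sample is assumed to be already encoded as fixed-length feature vectors, and the names `interpolate`, `mix_pairs`, and the `alpha` parameter are hypothetical choices made for this example.

```python
import random
from typing import List, Optional, Sequence, Tuple

Vector = Sequence[float]
Pair = Tuple[Vector, Vector]  # (image_features, text_features); assumed layout


def interpolate(a: Vector, b: Vector, lam: float) -> List[float]:
    """Return lam * a + (1 - lam) * b, computed element-wise."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def mix_pairs(
    pair_a: Pair,
    pair_b: Pair,
    alpha: float = 0.4,  # hypothetical default for the Beta distribution
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float], float]:
    """Mix two image-text pairs with one shared weight.

    The weight is drawn from Beta(alpha, alpha) and applied to both
    modalities, so the mixed image and text features describe the same
    blend of the two source samples.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    mixed_image = interpolate(pair_a[0], pair_b[0], lam)
    mixed_text = interpolate(pair_a[1], pair_b[1], lam)
    return mixed_image, mixed_text, lam


if __name__ == "__main__":
    sample_a = ([1.0, 0.0], [0.0, 2.0])  # toy feature vectors
    sample_b = ([0.0, 1.0], [2.0, 0.0])
    image, text, lam = mix_pairs(sample_a, sample_b, rng=random.Random(0))
    print(f"lambda={lam:.3f} image={image} text={text}")
```

Sharing one weight across modalities is only one possible design. Drawing a separate weight per modality would add variety, but it could leave the mixed image and text describing different blends of the source samples.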