Multi-Modal Knowledge
Multi-modal knowledge focuses on integrating information from diverse sources like text, images, and audio to build richer, more comprehensive representations than any single modality provides. Current research emphasizes efficient methods for fusing these modalities, often employing transformer-based architectures and adapter-style transfer learning to leverage frozen pre-trained models while minimizing computational cost and mitigating issues like catastrophic forgetting and missing modalities. This field is crucial for advancing applications such as affective computing, visual question answering, and knowledge graph alignment, enabling more robust and human-like interactions with technology.
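To make the adapter-style fusion idea concrete, below is a minimal sketch in PyTorch. All names (`Adapter`, `MultiModalFusion`) and hyperparameters are illustrative assumptions, not taken from any specific paper in this area: it shows small trainable bottleneck adapters applied to the outputs of (notionally frozen) text and image encoders, followed by a cross-attention fusion layer.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: a small trainable module applied to features
    from a frozen pre-trained backbone (down-project, nonlinearity, up-project)."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the frozen backbone's features,
        # which helps mitigate catastrophic forgetting.
        return x + self.up(self.act(self.down(x)))

class MultiModalFusion(nn.Module):
    """Fuses per-modality embeddings with lightweight adapters and
    cross-attention (text queries attend over image tokens)."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.text_adapter = Adapter(dim)
        self.image_adapter = Adapter(dim)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_emb: torch.Tensor, image_emb: torch.Tensor) -> torch.Tensor:
        t = self.text_adapter(text_emb)    # (batch, text_len, dim)
        v = self.image_adapter(image_emb)  # (batch, img_tokens, dim)
        fused, _ = self.cross_attn(query=t, key=v, value=v)
        return self.norm(t + fused)

# Usage: in practice these embeddings would come from frozen pre-trained
# encoders (e.g. a text and a vision transformer); only the adapters and
# the fusion layer are trained, keeping compute and memory costs low.
text_emb = torch.randn(2, 16, 512)   # placeholder text-encoder output
image_emb = torch.randn(2, 49, 512)  # placeholder image-encoder output
fused = MultiModalFusion()(text_emb, image_emb)
print(fused.shape)  # torch.Size([2, 16, 512])
```

Because only the adapter and fusion parameters receive gradients, a scheme like this can also tolerate a missing modality at inference time by skipping the corresponding branch, which is one reason adapter-based fusion is popular in this literature.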