Multimodal Recommendation

Multimodal recommendation systems aim to improve recommendation accuracy by integrating diverse data modalities, such as text, images, and audio, alongside traditional user-item interaction data. Current research focuses on developing effective fusion techniques within a range of model architectures, including those leveraging large language and multimodal models, to address challenges like cold-start problems and modality imbalance. These advances can make personalized recommendations richer and more contextually relevant in applications spanning e-commerce, entertainment, news, and information retrieval.
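
To make the fusion idea concrete, below is a minimal sketch (not drawn from any specific paper in this collection) of a late-fusion recommender in PyTorch: pre-extracted text and image features for each item are projected into a shared space, combined with a learned gate, and scored against a user embedding. All module names, dimensions, and the gating scheme here are illustrative assumptions.

```python
# Minimal sketch of gated late fusion for multimodal recommendation.
# Assumptions: PyTorch; item text/image features are pre-extracted by
# off-the-shelf encoders; dimensions and module names are hypothetical.
import torch
import torch.nn as nn


class FusionRecommender(nn.Module):
    def __init__(self, num_users, text_dim=768, image_dim=512, embed_dim=64):
        super().__init__()
        self.user_embedding = nn.Embedding(num_users, embed_dim)
        # Project each modality into a shared recommendation space.
        self.text_proj = nn.Linear(text_dim, embed_dim)
        self.image_proj = nn.Linear(image_dim, embed_dim)
        # Learned gate deciding how much to weight each modality per item;
        # this kind of weighting is one way to cope with modality imbalance
        # or missing modalities (e.g., cold-start items with text only).
        self.gate = nn.Linear(2 * embed_dim, 1)

    def fuse(self, text_feat, image_feat):
        t = self.text_proj(text_feat)
        v = self.image_proj(image_feat)
        alpha = torch.sigmoid(self.gate(torch.cat([t, v], dim=-1)))
        return alpha * t + (1 - alpha) * v

    def forward(self, user_ids, text_feat, image_feat):
        item_repr = self.fuse(text_feat, image_feat)
        user_repr = self.user_embedding(user_ids)
        # Preference score = dot product between user and fused item vectors.
        return (user_repr * item_repr).sum(dim=-1)


# Usage: score a batch of (user, item) pairs from pre-extracted item features.
model = FusionRecommender(num_users=1000)
users = torch.tensor([3, 17, 256])
text_features = torch.randn(3, 768)   # e.g., from a text encoder
image_features = torch.randn(3, 512)  # e.g., from an image encoder
scores = model(users, text_features, image_features)
print(scores.shape)  # torch.Size([3])
```

Many of the papers below replace this simple gate with attention, graph-based propagation, or large multimodal models, but the underlying pattern of mapping heterogeneous item features into a space shared with user representations is common across approaches.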

Papers