Multimodal Recommender System

Multimodal recommender systems aim to improve personalized recommendations by integrating diverse data types, such as text, images, and audio, into richer user and item representations. Current research emphasizes large language models and multimodal models, often combined with techniques such as prompt engineering, early and late modality fusion, and graph neural networks, to address challenges such as data sparsity and the cold-start problem. The field is significant because it promises more accurate and relevant recommendations across applications ranging from e-commerce and entertainment to healthcare, where it can aid in predicting disease comorbidity.
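
The difference between early and late modality fusion can be illustrated with a minimal sketch. It is not drawn from any particular paper: the function names, embedding sizes, random data, and the `w_text` weight are all hypothetical, and a plain dot product stands in for whatever scoring model a real system would use.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so no modality dominates by magnitude."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, 1e-12)


def early_fusion_scores(user_vec, text_emb, image_emb):
    """Early fusion: concatenate modality features into one item vector, then score once."""
    item_vecs = np.concatenate(
        [l2_normalize(text_emb), l2_normalize(image_emb)], axis=1
    )
    return item_vecs @ user_vec


def late_fusion_scores(user_text_vec, user_image_vec, text_emb, image_emb, w_text=0.5):
    """Late fusion: score each modality separately, then combine the scores."""
    text_scores = l2_normalize(text_emb) @ user_text_vec
    image_scores = l2_normalize(image_emb) @ user_image_vec
    return w_text * text_scores + (1.0 - w_text) * image_scores


# Toy data (assumed): 4 items with 8-dim text and 6-dim image embeddings.
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(4, 8))
image_emb = rng.normal(size=(4, 6))
user_text_vec = rng.normal(size=8)
user_image_vec = rng.normal(size=6)
user_vec = np.concatenate([user_text_vec, user_image_vec])

early = early_fusion_scores(user_vec, text_emb, image_emb)
late = late_fusion_scores(user_text_vec, user_image_vec, text_emb, image_emb)

# Rank items from highest to lowest score under each scheme.
print("early fusion ranking:", np.argsort(-early))
print("late fusion ranking: ", np.argsort(-late))
```

With a purely linear scorer and equal weights, as here, the two schemes produce the same ranking (the early-fusion scores are just twice the late-fusion ones). They diverge once a nonlinear model sits on top of the concatenated features or the per-modality weights are learned.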

Papers