Arabic Multimodal

Arabic multimodal research develops and applies artificial intelligence models that understand and generate content combining text and images, with a focus on the Arabic language and its diverse dialects. Current efforts concentrate on large language models (LLMs) that can handle multimodal tasks such as image captioning and sentiment analysis, often using transformer architectures and techniques such as word alignment. This work matters because Arabic multimodal datasets are scarce, which hinders progress in areas such as propaganda detection and limits the accessibility of AI technologies across languages and cultures.

Papers

July 25, 2024

Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic
Fakhraddin Alwajih, Gagan Bhatia, Muhammad Abdul-Mageed
Language Model, Multimodal Large Language Model, Arabic Speaker, Advanced Language Model, Arabic Multimodal

June 6, 2024

ArMeme: Propagandistic Content in Arabic Memes
Firoj Alam, Abul Hasnat, Fatema Ahmed, Md Arid Hasan, Maram Hasanain
Arabic Multimodal

June 10, 2023

Towards Arabic Multimodal Dataset for Sentiment Analysis
Abdelhamid Haouhat, Slimane Bellaouar, Attia Nehar, Hadda Cherroun
Deep Learning, Sentiment Analysis, Multimodal Sentiment Analysis, Arabic Multimodal
