MAESTRO Dataset
The MAESTRO dataset is not explicitly defined in the provided abstracts. Based on those abstracts, it appears to be a collection of datasets used to benchmark and evaluate machine learning models, mainly on multimodal tasks, with a focus on data quality, label accuracy, and model generalization. Current research applies large language models (LLMs), transformer architectures, and deep learning techniques such as nnUNet and diffusion models to improve performance in applications including medical image analysis, content moderation, and natural language processing. These datasets and the associated research provide standardized benchmarks for evaluating model performance and support the development of more robust and reliable AI systems.
Papers
IDTrust: Deep Identity Document Quality Detection with Bandpass Filtering
Musab Al-Ghadi, Joris Voerman, Souhail Bakkali, Mickaël Coustaty, Nicolas Sidere, Xavier St-Georges
Cross-Lingual Learning vs. Low-Resource Fine-Tuning: A Case Study with Fact-Checking in Turkish
Recep Firat Cekinel, Pinar Karagoz, Cagri Coltekin
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages
Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine De Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Winata, Seid Muhie Yimam, Saif M. Mohammad
BdSLW60: A Word-Level Bangla Sign Language Dataset
Husne Ara Rubaiyeat, Hasan Mahmud, Ahsan Habib, Md. Kamrul Hasan