MAESTRO Dataset
The MAESTRO dataset is not explicitly defined in the provided abstracts. The papers grouped under this topic instead appear to describe a range of datasets used to benchmark and evaluate machine learning models, mostly on multimodal tasks, with recurring attention to data quality, label accuracy, and model generalization. The research draws on large language models (LLMs), transformer architectures, and deep learning techniques such as nnUNet and diffusion models, with applications including medical image analysis, content moderation, and natural language processing. These datasets give the field standardized benchmarks for comparing model performance and support the development of more robust and reliable AI systems.
Papers
Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition
Ke Bao, Chonghuan Yang
Enhancing Environmental Monitoring through Multispectral Imaging: The WasteMS Dataset for Semantic Segmentation of Lakeside Waste
Qinfeng Zhu, Ningxin Weng, Lei Fan, Yuanzhi Cai
DurLAR: A High-fidelity 128-channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-modal Autonomous Driving Applications
Li Li, Khalid N. Ismail, Hubert P. H. Shum, Toby P. Breckon
LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data
Grigor Bezirganyan, Sana Sellami, Laure Berti-Équille, Sébastien Fournier