Novel Dataset
Recent research on novel datasets aims to address the limitations of existing resources and support progress across a range of applications. The datasets span several modalities, including text, images, video, and sensor data, and target tasks such as code generation, object detection, natural language processing, and multi-agent reinforcement learning. Models evaluated on them commonly include transformers, convolutional neural networks, and ensemble methods, often benchmarked against established baselines. High-quality datasets of this kind are crucial for improving the accuracy and reliability of machine learning models in both scientific and practical settings.
Papers
EUFCC-340K: A Faceted Hierarchical Dataset for Metadata Annotation in GLAM Collections
Francesc Net, Marc Folia, Pep Casals, Andrew D. Bagdanov, Lluis Gomez
A multilingual dataset for offensive language and hate speech detection for Hausa, Yoruba and Igbo languages
Saminu Mohammad Aliyu, Gregory Maksha Wajiga, Muhammad Murtala