Quality Issue
Research on quality issues spans many fields and focuses on methods for objectively assessing and improving the quality of data, models, and processes. Current efforts concentrate on refining evaluation metrics, using machine learning models (such as transformers and diffusion models) to predict and enhance quality, and designing algorithms that optimize for quality under computational constraints. These advances matter for the reliability and trustworthiness of AI systems in applications ranging from medical diagnosis and financial reporting to language processing and image analysis.
Papers
SpeechQE: Estimating the Quality of Direct Speech Translation
HyoJung Han, Kevin Duh, Marine Carpuat
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training
Michael Pieler, Marco Bellagente, Hannah Teufel, Duy Phung, Nathan Cooper, Jonathan Tow, Paulo Rocha, Reshinth Adithyan, Zaid Alyafeai, Nikhil Pinnaparaju, Maksym Zhuravinskyi, Carlos Riquelme
Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts
German Gritsai, Anastasia Voznyuk, Andrey Grabovoy, Yury Chekhovich
MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps
Xiongtao Zhou, Jie He, Lanyu Chen, Jingyu Li, Haojing Chen, Víctor Gutiérrez-Basulto, Jeff Z. Pan, Hanjie Chen
Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models
Iaroslav Chelombitko, Egor Safronov, Aleksey Komissarov
The Best of Both Worlds: Bridging Quality and Diversity in Data Selection with Bipartite Graph
Minghao Wu, Thuy-Trang Vu, Lizhen Qu, Gholamreza Haffari