Active Learning
Active learning is a machine learning paradigm that improves labeling efficiency by selecting the most informative samples from a larger unlabeled pool for annotation. Current research focuses on three directions: developing acquisition functions and data pruning strategies that reduce the computational cost of working with large datasets; integrating active learning with a range of model architectures, including deep neural networks, Gaussian processes, and language models; and addressing challenges such as privacy preservation and open-set noise. The approach can substantially reduce the cost and effort of data labeling in fields ranging from image classification and natural language processing to materials science and healthcare.
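As a rough illustration of what "selecting the most informative samples" can mean in practice, the sketch below scores each unlabeled sample by the entropy of a model's predicted class probabilities and picks the highest-scoring ones for annotation. Entropy-based uncertainty sampling is only one common acquisition function and is not drawn from any of the papers listed here. The function names, the batch size, and the probability values are all hypothetical.

```python
import numpy as np

def predictive_entropy(probs):
    """Shannon entropy of each row of class probabilities (higher = less certain)."""
    probs = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -np.sum(probs * np.log(probs), axis=1)

def select_for_labeling(probs, batch_size):
    """Return indices of the batch_size pool samples with the highest entropy."""
    scores = predictive_entropy(probs)
    # argsort is ascending, so take the tail and reverse it for descending order
    return np.argsort(scores)[-batch_size:][::-1]

# Hypothetical model outputs for a pool of 5 unlabeled samples and 3 classes.
pool_probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # close to uniform, very uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
    [0.90, 0.05, 0.05],
])

# Indices of the samples that would be sent to an annotator in this round.
query_indices = select_for_labeling(pool_probs, batch_size=2)
print(query_indices)
```

In a full loop, the selected samples would be labeled, moved from the pool into the training set, and the model retrained before the next round of selection.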
Papers
Opinion Spam Detection: A New Approach Using Machine Learning and Network-Based Algorithms
Kiril Danilchenko, Michael Segal, Dan Vilenchik
Deep Active Learning with Noise Stability
Xingjian Li, Pengkun Yang, Yangcheng Gu, Xueying Zhan, Tianyang Wang, Min Xu, Chengzhong Xu
Active Labeling: Streaming Stochastic Gradients
Vivien Cabannes, Francis Bach, Vianney Perchet, Alessandro Rudi
Active Learning Through a Covering Lens
Ofer Yehuda, Avihu Dekel, Guy Hacohen, Daphna Weinshall
PyRelationAL: a python library for active learning research and development
Paul Scherer, Alison Pouplin, Alice Del Vecchio, Suraj M S, Oliver Bolton, Jyothish Soman, Jake P. Taylor-King, Lindsay Edwards, Thomas Gaudelet