Data Recycling
Data recycling, the reuse of existing data to improve model training or efficiency, is an emerging research area that aims to enhance machine learning performance and reduce computational costs. Current research reuses past data for several purposes, including improving large language model (LLM) controllability and instruction following, improving synthetic data generation for training classifiers, and optimizing algorithms such as Word2Vec for faster execution. These techniques show promise for improving model accuracy, privacy, and efficiency in applications ranging from software development and medical image analysis to electronic waste recycling and online advertising.
Papers
On quantum backpropagation, information reuse, and cheating measurement collapse
Amira Abbas, Robbie King, Hsin-Yuan Huang, William J. Huggins, Ramis Movassagh, Dar Gilboa, Jarrod R. McClean
Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach
Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, Xiang-Rong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu, Bo Zheng