Based Sampling
Based sampling techniques aim to use data more efficiently and effectively in machine learning by strategically selecting subsets of data for model training or analysis. Current research develops sampling strategies that often incorporate clustering algorithms to account for data heterogeneity and imbalance, which improves model performance and reduces computational cost in applications such as large language model training, federated learning, and sequential recommendation. These advances address limitations of simple random sampling, which can under-represent rare groups in the data, and they lead to more accurate and robust models across diverse domains.
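As a rough illustration of how clustering can guide subset selection under imbalance, the sketch below draws the same number of items from each cluster instead of sampling uniformly from the whole dataset. It assumes cluster labels have already been produced by some clustering algorithm; the function name `cluster_balanced_sample`, the `per_cluster` parameter, and the toy labels are hypothetical and not taken from any specific paper.

```python
import random
from collections import defaultdict


def cluster_balanced_sample(labels, per_cluster, seed=0):
    """Return sorted indices with at most `per_cluster` items drawn from each cluster.

    `labels[i]` is the (assumed precomputed) cluster id of item i.
    """
    rng = random.Random(seed)

    # Group item indices by their cluster id.
    by_cluster = defaultdict(list)
    for idx, label in enumerate(labels):
        by_cluster[label].append(idx)

    chosen = []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        # Small clusters contribute everything they have.
        k = min(per_cluster, len(members))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


# Hypothetical imbalanced data: cluster 0 has 90 items, cluster 1 has 10.
labels = [0] * 90 + [1] * 10
subset = cluster_balanced_sample(labels, per_cluster=5)
```

Uniform random sampling of 10 items from this toy data would contain one item from the small cluster on average, whereas the balanced draw always contains five. Real methods typically weight clusters in more refined ways; equal allocation is only the simplest choice.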