Data Preparation
Data preparation is the initial step in most machine learning pipelines: it transforms raw data into a format suitable for model training and analysis. Current research emphasizes automated, scalable solutions, including toolkits for large language model (LLM) applications and frameworks that use LLMs for unified data manipulation. These often incorporate imputation to handle missing values and data augmentation to counter class imbalance. The aim is to improve model accuracy, reproducibility, and efficiency across applications ranging from medical diagnosis to recommendation systems and natural language processing.
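As a rough illustration of the two techniques named above, the sketch below shows mean imputation for missing values and random oversampling for class imbalance, using only the Python standard library. The function names (impute_mean, oversample_minority) and the toy data are hypothetical and chosen for illustration; they are not taken from any of the toolkits or frameworks mentioned, and random duplication is only one of many augmentation strategies.

```python
import random
from collections import Counter
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def oversample_minority(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Candidate rows to duplicate for this class.
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (assumed for illustration): one numeric column with a gap,
# and a four-row dataset where class "b" is under-represented.
ages = impute_mean([20.0, None, 40.0])
balanced_rows, balanced_labels = oversample_minority(
    [[1.0], [2.0], [3.0], [4.0]], ["a", "a", "a", "b"]
)
```

In practice, imputation statistics and oversampling would typically be fitted on the training split only, so that information from held-out data does not leak into the model.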