Feature Engineering
Feature engineering is the process of selecting, transforming, and creating features from raw data to improve machine learning model performance, and it remains an active area of research. Current work focuses on automating this process with techniques such as reinforcement learning, large language models (LLMs), and information-theoretic methods that generate and select features, often drawing on knowledge graphs or other sources of domain expertise. The aim is to reduce the time and expertise that manual feature engineering requires, yielding more efficient machine learning pipelines in applications ranging from financial forecasting to anomaly detection. The resulting gains in model accuracy and interpretability matter both for scientific work and for practical applications that depend on these models.
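As a rough illustration of what creating features and then selecting them with an information-theoretic score can look like, the sketch below builds a toy dataset, derives two candidate features, and ranks all candidates by empirical mutual information with the target. Everything here is an assumption made for illustration: the toy data, the names `mutual_information`, `candidates`, `a_xor_b`, and `a_and_b`, and the choice of mutual information as the score. It is not the method of any paper listed on this page.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: every combination of two binary raw features, repeated.
rows = list(product([0, 1], repeat=2)) * 25
a = [r[0] for r in rows]
b = [r[1] for r in rows]
y = [u ^ v for u, v in zip(a, b)]  # target depends on a and b only jointly

# Candidate features: the two raw columns plus two constructed ones.
candidates = {
    "a": a,
    "b": b,
    "a_xor_b": [u ^ v for u, v in zip(a, b)],
    "a_and_b": [u & v for u, v in zip(a, b)],
}

# Score each candidate against the target and rank from most to least informative.
scores = {name: mutual_information(col, y) for name, col in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f} bits")
```

In this toy setup neither raw column carries any information about the target on its own, while a constructed feature recovers it fully, which is the kind of gain that automated feature generation tries to find. Real pipelines would also need to handle continuous features, estimation noise, and redundancy between candidates, none of which this sketch addresses.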
Papers
ELF-Gym: Evaluating Large Language Models Generated Features for Tabular Prediction
Yanlin Zhang, Ning Li, Quan Gan, Weinan Zhang, David Wipf, Minjie Wang
Statistical Test for Auto Feature Engineering by Selective Inference
Tatsuya Matsukawa, Tomohiro Shiraishi, Shuichi Nishino, Teruyuki Katsuoka, Ichiro Takeuchi