Data Analysis Pipeline

A data analysis pipeline is a structured sequence of computational steps that transforms raw data into actionable insights. Current research emphasizes automating pipeline construction and execution, often using large language models (LLMs) to generate code, produce visualizations, and interpret results, and improving statistical rigor through techniques such as selective inference. This research aims to make data analysis more reproducible, efficient, and accessible across diverse domains, from business intelligence and scientific workflows to sensitive applications such as medical image analysis and child safety.
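
To make the definition above concrete, here is a minimal sketch of a pipeline as an ordered list of steps, each consuming the previous step's output. The step names (parse, drop_missing, summarize), the run_pipeline helper, and the sample data are all hypothetical and chosen only for illustration; they do not come from any particular paper or library.

```python
from functools import reduce
from statistics import mean
from typing import Any, Callable, Dict, Iterable, List, Optional

# A step is any function that takes the previous step's output
# and returns the input for the next step (illustrative convention).
Step = Callable[[Any], Any]


def run_pipeline(data: Any, steps: Iterable[Step]) -> Any:
    """Apply each step in order, threading the result through."""
    return reduce(lambda acc, step: step(acc), steps, data)


def parse(rows: List[str]) -> List[Optional[float]]:
    """Convert raw strings to floats; unparseable entries become None."""
    parsed: List[Optional[float]] = []
    for row in rows:
        try:
            parsed.append(float(row))
        except ValueError:
            parsed.append(None)
    return parsed


def drop_missing(values: List[Optional[float]]) -> List[float]:
    """Remove entries that failed to parse."""
    return [v for v in values if v is not None]


def summarize(values: List[float]) -> Dict[str, float]:
    """Reduce the cleaned values to a small summary."""
    return {"n": len(values), "mean": mean(values)}


# Hypothetical raw input with one unusable record.
raw = ["3.0", "4.5", "n/a", "6.0"]
summary = run_pipeline(raw, [parse, drop_missing, summarize])
print(summary)  # {'n': 3, 'mean': 4.5}
```

Keeping each step a plain function with a single input and output is one possible design choice: steps can then be tested in isolation, reordered, or replaced without touching the rest of the pipeline, which is the property that automation and reproducibility efforts tend to rely on.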

Papers