Data Engineering

Data engineering focuses on building and maintaining the infrastructure needed to collect, process, and manage large datasets for applications such as machine learning and business intelligence. Current research emphasizes efficient pipeline tools for data ingestion, transformation, and orchestration, particularly for scaling models to handle extremely large contexts and for addressing data quality issues in recurring pipelines. The field is crucial to the practical application of advanced AI systems, affecting both the development of trustworthy AI and the productivity of data scientists and analysts.

Papers