Paper ID: 2410.05860

MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning

Sofya Dymchenko (DATAMOVE), Abhishek Purandare (DATAMOVE), Bruno Raffin (DATAMOVE)

Artificial intelligence is transforming scientific computing with deep neural network surrogates that approximate solutions to partial differential equations (PDEs). Traditional off-line training methods face issues with storage and I/O efficiency, as the training dataset has to be computed with numerical solvers up-front. Our previous work, the Melissa framework, addresses these problems by enabling data to be created "on-the-fly" and streamed directly into the training process. In this paper we introduce a new active learning method to enhance data-efficiency for on-line surrogate training. The surrogate is direct and multi-parametric, i.e., it is trained to predict a given timestep directly with different initial and boundary conditions parameters. Our approach uses Adaptive Multiple Importance Sampling guided by training loss statistics, in order to focus NN training on the difficult areas of the parameter space. Preliminary results for 2D heat PDE demonstrate the potential of this method, called Breed, to improve the generalization capabilities of surrogates while reducing computational overhead.

Submitted: Oct 8, 2024