Paper ID: 2205.01677

ASTROMER: A transformer-based embedding for the representation of light curves

C. Donoso-Oliva, I. Becker, P. Protopapas, G. Cabrera-Vives, Vishnu M., Harsh Vardhan

Taking inspiration from natural language embeddings, we present ASTROMER, a transformer-based model to create representations of light curves. ASTROMER was pre-trained in a self-supervised manner, requiring no human-labeled data. We used millions of R-band light sequences to adjust the ASTROMER weights. The learned representation can be easily adapted to other surveys by re-training ASTROMER on new sources. The power of ASTROMER consists of using the representation to extract light curve embeddings that can enhance the training of other models, such as classifiers or regressors. As an example, we used ASTROMER embeddings to train two neural-based classifiers that use labeled variable stars from MACHO, OGLE-III, and ATLAS. In all experiments, ASTROMER-based classifiers outperformed a baseline recurrent neural network trained on light curves directly when limited labeled data was available. Furthermore, using ASTROMER embeddings decreases computational resources needed while achieving state-of-the-art results. Finally, we provide a Python library that includes all the functionalities employed in this work. The library, main code, and pre-trained weights are available at https://github.com/astromer-science

Submitted: May 2, 2022