Data Scarce

Data scarcity is a pervasive challenge in many scientific domains and hinders the effective application of machine learning. Current research focuses on data-efficient methods, including contrastive learning, transfer learning, and the integration of physics-based models or existing scientific knowledge with deep learning architectures such as large language models (LLMs) and capsule networks. These approaches aim to improve model performance and generalizability in low-data regimes, with applications ranging from materials science and medical image analysis to natural language processing and argument mining. The ultimate goal is to enable reliable and accurate machine learning even when labeled data is limited.

Papers