Multiple Source

Multi-source data integration draws on information from diverse, often heterogeneous sources to improve the accuracy and robustness of machine learning models. Current research emphasizes ensemble methods that combine conventional deep learning models (such as RNNs) with large language models (LLMs), and explores techniques such as retrieval-augmented generation and prompt engineering to integrate knowledge from different sources. The approach is proving valuable in applications including biomedical natural language processing, urban travel modeling, and carbon accounting, where it improves model performance and addresses the limitations of single-source data.

Papers