Large Scale
Large-scale data processing and analysis are central to addressing numerous scientific and engineering challenges, focusing on efficient handling of massive datasets and complex systems. Current research emphasizes developing novel algorithms and model architectures, such as graph neural networks, deep learning models, and physics-guided machine learning, to improve efficiency, accuracy, and scalability in diverse applications. These advancements are crucial for tackling problems ranging from traffic optimization and robot navigation to astronomical surveys and the development of more energy-efficient AI systems. The resulting insights and tools have significant implications across various fields, enabling more effective data-driven decision-making and scientific discovery.
Papers
Scalable Training of Trustworthy and Energy-Efficient Predictive Graph Foundation Models for Atomistic Materials Modeling: A Case Study with HydraGNN
Massimiliano Lupo Pasini, Jong Youl Choi, Kshitij Mehta, Pei Zhang, David Rogers, Jonghyun Bae, Khaled Z. Ibrahim, Ashwin M. Aji, Karl W. Schulz, Jorda Polo, Prasanna Balaprakash
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu, Tong Lu, Yali Wang, Limin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai