Human Understanding
Research on human understanding spans both human cognitive processes and the comprehension abilities of AI models, asking how humans and machines make sense of information. Current work focuses on improving AI's grasp of nuanced language, visual information, and complex relationships within data, using techniques such as multimodal large language models, hypergraph attention networks, and retrieval-augmented generation. These advances have implications for applications such as medical diagnosis, human-computer interaction, and scientific knowledge extraction, but robust, generalizable understanding in AI remains an open challenge.
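As a rough illustration of the retrieval-augmented generation idea mentioned above, the sketch below is a minimal, self-contained Python example, not drawn from any of the listed papers: documents are ranked with a toy bag-of-words cosine similarity standing in for a real embedding model, and the top match is spliced into a prompt that would then be handed to a language model. The corpus, query, and the absence of an actual generator call are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy corpus standing in for a document store (hypothetical example data).
CORPUS = [
    "Pathology benchmarks test whether models can reason over tissue images.",
    "Retrieval-augmented generation grounds model answers in retrieved documents.",
    "Hypergraph attention networks model relations that involve more than two nodes.",
]

def embed(text: str) -> Counter:
    """Bag-of-words 'embedding'; a stand-in for a real text encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k corpus documents most similar to the query."""
    q = embed(query)
    ranked = sorted(CORPUS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Splice retrieved context into a prompt for a downstream language model."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    # In a real system this prompt would be sent to a generator model;
    # here we only print it, since the generator is outside this sketch.
    print(build_prompt("What does retrieval-augmented generation do?"))
```

In practice the bag-of-words retriever would be replaced by a dense encoder and a vector index, and the prompt would be passed to a language model whose answer is then grounded in the retrieved context.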
Papers
PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
Yuxuan Sun, Hao Wu, Chenglu Zhu, Sunyi Zheng, Qizi Chen, Kai Zhang, Yunlong Zhang, Dan Wan, Xiaoxiao Lan, Mengyue Zheng, Jingxiong Li, Xinheng Lyu, Tao Lin, Lin Yang
M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining
Qingpei Guo, Furong Xu, Hanxiao Zhang, Wang Ren, Ziping Ma, Lin Ju, Jian Wang, Jingdong Chen, Ming Yang
Understanding and Estimating Domain Complexity Across Domains
Katarina Doctor, Mayank Kejriwal, Lawrence Holder, Eric Kildebeck, Emma Resmini, Christopher Pereyda, Robert J. Steininger, Daniel V. Olivença
BSL: Understanding and Improving Softmax Loss for Recommendation
Junkang Wu, Jiawei Chen, Jiancan Wu, Wentao Shi, Jizhi Zhang, Xiang Wang
BloomVQA: Assessing Hierarchical Multi-modal Comprehension
Yunye Gong, Robik Shrestha, Jared Claypoole, Michael Cogswell, Arijit Ray, Christopher Kanan, Ajay Divakaran