Human Understanding
Human understanding is a multifaceted research area that spans human cognitive processes and the capabilities of AI models, and it asks how humans and machines comprehend information. Current research focuses on improving AI's ability to understand nuanced language, visual information, and complex relationships within data, using techniques such as multimodal large language models, hypergraph attention networks, and retrieval-augmented generation. These advances have implications for applications including medical diagnosis, human-computer interaction, and scientific knowledge extraction, but achieving robust and generalizable understanding in AI remains an open challenge.
Papers
ULTra: Unveiling Latent Token Interpretability in Transformer Based Understanding
Hesam Hosseini, Ghazal Hosseini Mighan, Amirabbas Afzali, Sajjad Amini, Amir Houmansadr
Towards Utilising a Range of Neural Activations for Comprehending Representational Associations
Laura O'Mahony, Nikola S. Nikolov, David JP O'Sullivan
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng, Tongjia Chen, Shoubin Yu, Taojiannan Yang, Lincoln Spencer, Yapeng Tian, Ajmal Saeed Mian, Mohit Bansal, Chen Chen
Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective
Shenghao Xie, Wenqiang Zu, Mingyang Zhao, Duo Su, Shilong Liu, Ruohua Shi, Guoqi Li, Shanghang Zhang, Lei Ma
ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding
Kimihiro Hasegawa, Wiradee Imrattanatrai, Zhi-Qi Cheng, Masaki Asada, Susan Holm, Yuran Wang, Ken Fukuda, Teruko Mitamura
MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding
Yuan Wang, Di Huang, Yaqi Zhang, Wanli Ouyang, Jile Jiao, Xuetao Feng, Yan Zhou, Pengfei Wan, Shixiang Tang, Dan Xu