Cognitively Inspired Benchmark
Cognitively inspired benchmarks evaluate whether artificial intelligence models can perform tasks that mirror milestones of human cognitive development, such as conservation of quantity, perspective-taking, and mechanical reasoning. Current research uses these benchmarks to probe large vision-language models (LVLMs) and large language models (LLMs), often revealing gaps between model performance and human-like intelligence, particularly in higher-order cognitive functions. Such benchmarks help identify limitations in current AI architectures and guide the development of models that better approximate human cognitive abilities, advancing both AI research and our understanding of human cognition.
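As a concrete illustration of how such a benchmark might be administered, the sketch below scores a model's answers to Piagetian-style probes and reports accuracy per cognitive concept. The item format, the `ask_model` stub, and the exact-match scoring rule are illustrative assumptions, not details drawn from the papers listed here.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkItem:
    """One cognitively inspired probe: an image, a question, and the expected answer."""
    image_path: str
    question: str
    expected: str  # e.g. "same", "more", or "less"
    concept: str   # e.g. "conservation", "perspective-taking", "mechanical reasoning"


def ask_model(image_path: str, question: str) -> str:
    """Hypothetical stand-in for a vision-language model call; replace with real inference.

    Returns a fixed dummy answer so the harness runs end to end without a model.
    """
    return "same"


def evaluate(items: list[BenchmarkItem]) -> dict[str, float]:
    """Score each item by exact match and return accuracy per cognitive concept."""
    correct: dict[str, int] = {}
    total: dict[str, int] = {}
    for item in items:
        answer = ask_model(item.image_path, item.question).strip().lower()
        total[item.concept] = total.get(item.concept, 0) + 1
        if answer == item.expected:
            correct[item.concept] = correct.get(item.concept, 0) + 1
    return {concept: correct.get(concept, 0) / n for concept, n in total.items()}


if __name__ == "__main__":
    items = [
        BenchmarkItem("liquid_poured.png",
                      "Is the amount of water the same, more, or less than before?",
                      "same", "conservation"),
        BenchmarkItem("table_scene.png",
                      "Can the doll on the left see the red block?",
                      "no", "perspective-taking"),
    ]
    print(evaluate(items))  # e.g. {'conservation': 1.0, 'perspective-taking': 0.0}
```

Reporting accuracy per concept rather than a single aggregate score is what lets this kind of benchmark localize where a model departs from human-like cognition.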
Papers
Vision Language Models Know Law of Conservation without Understanding More-or-Less
Dezhi Luo, Haiyun Lyu, Qingying Gao, Haoran Sun, Yijiang Li, Hokin Deng
Vision Language Models See What You Want but not What You See
Qingying Gao, Yijiang Li, Haiyun Lyu, Haoran Sun, Dezhi Luo, Hokin Deng
Probing Mechanical Reasoning in Large Vision Language Models
Haoran Sun, Qingying Gao, Haiyun Lyu, Dezhi Luo, Hokin Deng, Yijiang Li