Paper ID: 2503.22697 • Published Mar 15, 2025
From Eye to Mind: brain2text Decoding Reveals the Neural Mechanisms of Visual Semantic Processing
Deciphering the neural mechanisms that transform sensory experiences into
meaningful semantic representations is a fundamental challenge in cognitive
neuroscience. While neuroimaging has mapped a distributed semantic network, the
format and neural code of semantic content remain elusive, particularly for
complex, naturalistic stimuli. Traditional brain decoding, focused on visual
reconstruction, primarily captures low-level perceptual features, missing the
deeper semantic essence guiding human cognition. Here, we introduce a paradigm
shift by directly decoding fMRI signals into textual descriptions of viewed
natural images. Our novel deep learning model, trained without visual input,
achieves state-of-the-art semantic decoding performance, generating meaningful
captions that capture the core semantic content of complex scenes.
Neuroanatomical analysis reveals the critical role of higher-level visual
regions, including MT+, ventral stream visual cortex, and inferior parietal
cortex, in this semantic transformation. Category-specific decoding further
demonstrates nuanced neural representations for semantic dimensions like
animacy and motion. This text-based decoding approach provides a more direct
and interpretable window into the brain's semantic encoding than visual
reconstruction, offering a powerful new methodology for probing the neural
basis of complex semantic processing, refining our understanding of the
distributed semantic network, and potentially informing brain-inspired language
models.
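
The abstract gives no implementation details, but a minimal sketch of the kind of brain-to-text decoder it describes could look like the following. Everything here is an illustrative assumption rather than the authors' architecture: the module names (FMRIEncoder, CaptionDecoder), dimensions, and toy vocabulary are hypothetical. An MLP maps an fMRI voxel pattern to a semantic embedding, which conditions a small autoregressive caption decoder, with no visual features anywhere in the pipeline.

```python
# Hypothetical sketch of an fMRI-to-text decoder; dimensions, module names,
# and the toy vocabulary are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class FMRIEncoder(nn.Module):
    """Map a flattened fMRI voxel pattern to a fixed-size semantic embedding."""
    def __init__(self, n_voxels: int, d_model: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_voxels, 1024), nn.GELU(),
            nn.Linear(1024, d_model),
        )

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        return self.net(voxels)                      # (batch, d_model)

class CaptionDecoder(nn.Module):
    """Autoregressive GRU decoder conditioned on the fMRI embedding."""
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.gru = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, brain_emb: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # Use the brain embedding as the decoder's initial hidden state.
        h0 = brain_emb.unsqueeze(0)                  # (1, batch, d_model)
        x = self.embed(tokens)                       # (batch, seq, d_model)
        h, _ = self.gru(x, h0)
        return self.out(h)                           # (batch, seq, vocab)

# Toy forward pass on random data, just to show the tensor flow.
n_voxels, vocab_size = 8000, 1000
encoder, decoder = FMRIEncoder(n_voxels), CaptionDecoder(vocab_size)
voxels = torch.randn(4, n_voxels)                   # 4 fMRI trials
tokens = torch.randint(0, vocab_size, (4, 12))      # teacher-forced captions
logits = decoder(encoder(voxels), tokens)
print(logits.shape)                                 # torch.Size([4, 12, 1000])
```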
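The neuroanatomical claim, that MT+, ventral stream visual cortex, and inferior parietal cortex drive the semantic transformation, could in principle be probed with an ROI-occlusion analysis on top of any trained decoder. The sketch below is an assumption about how such an analysis might be set up, not the authors' procedure: voxels belonging to one region at a time are zeroed out, and the resulting drop in a caption-similarity score indexes that region's contribution. The `decode_and_score` stub and the ROI voxel assignments are placeholders.

```python
# Hypothetical ROI-occlusion analysis; the ROI voxel assignments, the scoring
# function, and the decoder stub are assumptions, not the paper's method.
import numpy as np

def decode_and_score(voxels: np.ndarray) -> float:
    """Stand-in for: run the trained brain2text decoder on `voxels` and
    return a semantic similarity score between the generated caption and
    the ground-truth description. Returns a deterministic placeholder."""
    rng = np.random.default_rng(int(np.abs(voxels).sum()) % 2**32)
    return float(rng.uniform(0.0, 1.0))             # placeholder score

n_voxels = 8000
voxels = np.random.randn(n_voxels)                  # one fMRI trial (flattened)

# ROI masks: boolean vectors marking which voxels belong to each region.
roi_masks = {
    "MT+": np.zeros(n_voxels, dtype=bool),
    "ventral_visual": np.zeros(n_voxels, dtype=bool),
    "inferior_parietal": np.zeros(n_voxels, dtype=bool),
}
roi_masks["MT+"][:500] = True                       # placeholder assignments
roi_masks["ventral_visual"][500:3000] = True
roi_masks["inferior_parietal"][3000:4000] = True

baseline = decode_and_score(voxels)
for roi, mask in roi_masks.items():
    occluded = voxels.copy()
    occluded[mask] = 0.0                            # ablate this region
    drop = baseline - decode_and_score(occluded)
    print(f"{roi}: semantic-score drop when ablated = {drop:+.3f}")
```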