Neural Execution Tree

Neural Execution Trees (NExT) are an approach to enhancing large language models (LLMs) by giving them an understanding of program execution or, more broadly, of causal reasoning across modalities. Current research focuses on three areas: improving LLMs' ability to reason about code execution traces, understanding causal relationships in videos, and handling multimodal interactions (text, images, video, audio). The architectures involved leverage execution traces or location embeddings. This work addresses limitations in current LLMs and improves performance on tasks requiring complex reasoning and multimodal understanding, with applications ranging from program debugging to video question answering and more general-purpose AI agents.

Papers