Surgical Action

Surgical action recognition aims to automatically identify and classify the actions performed during surgical procedures from video data, primarily to support surgical training, workflow optimization, and the development of computer-assisted systems. Current research focuses on robust computer vision models, including vision transformers and temporal convolutional networks, often arranged hierarchically to capture both fine-grained actions and the broader procedural context. This work draws on diverse data sources, such as RGB-D video and fused sensor streams, to improve accuracy and reliability, with the aim of making surgical practice safer and more efficient.

Papers