Action Quality Assessment

Action quality assessment (AQA) uses computer vision to automatically evaluate how well a human action is performed in a video, with the goal of objective and consistent scoring. Current research focuses on improving the accuracy and interpretability of AQA models. Approaches include deep learning architectures such as transformers, along with techniques such as probabilistic modeling, multi-modal fusion (combining visual and audio data), and continual learning to handle diverse actions and limited data. The field matters for applications ranging from sports judging and athletic training to the evaluation of medical procedures, where it could make performance assessment more efficient and reliable.

Papers