Spatial-Temporal Alignment Network for Action Recognition [2308.09897]