Paper ID: 2203.13052

Coarse-to-Fine Cascaded Networks with Smooth Predicting for Video Facial Expression Recognition

Fanglei Xue, Zichang Tan, Yu Zhu, Zhongsong Ma, Guodong Guo

Facial expression recognition plays an important role in human-computer interaction. In this paper, we propose the Coarse-to-Fine Cascaded network with Smooth Predicting (CFC-SP) to improve the performance of facial expression recognition. CFC-SP contains two core components, namely Coarse-to-Fine Cascaded networks (CFC) and Smooth Predicting (SP). For CFC, it first groups several similar emotions to form a rough category, and then employs a network to conduct a coarse but accurate classification. Later, an additional network for these grouped emotions is further used to obtain fine-grained predictions. For SP, it improves the recognition capability of the model by capturing both universal and unique expression features. To be specific, the universal features denote the general characteristic of facial emotions within a period and the unique features denote the specific characteristic at this moment. Experiments on Aff-Wild2 show the effectiveness of the proposed CFSP. We achieved 3rd place in the Expression Classification Challenge of the 3rd Competition on Affective Behavior Analysis in-the-wild. The code will be released at https://github.com/BR-IDL/PaddleViT.

Submitted: Mar 24, 2022