A novel 2D and 3D multimodal approach for in-the-wild facial expression recognition
Affiliation:
1. Beijing University of Posts and Telecommunications, Key Laboratory of Trustworthy Distributed Computing and Service (BUPT), Information and Communication Engineering, No. 10, Xi Tu Cheng Road, Beijing 100876, China;
2. University of Science and Technology Beijing, School of Automation and Electrical Engineering, No. 30, Xue Yuan Road, Beijing 100083, China;
1. School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China;
2. Ecole Centrale de Lyon, LIRIS UMR5205, Lyon, France;
3. State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, China;
4. School of Management, Xi'an Jiaotong University, Xi'an, China;
5. Université Lyon 1, Institut Camille Jordan, Lyon, France;
6. GMSV Research Center, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
Abstract:This study proposes a novel deep learning approach for fusing 2D and 3D modalities in in-the-wild facial expression recognition (FER). Unlike other studies, we exploit 3D facial information for in-the-wild FER. Because in-the-wild 3D FER datasets are not widely available, 3D facial data are constructed from existing 2D datasets using recent advances in 3D face reconstruction. 3D facial geometry features are then extracted with a deep learning technique to exploit mid-level details, which provide meaningful cues for expression recognition. In addition, to demonstrate the potential of 3D data for FER, 2D projected images of the 3D faces are taken as an additional input. These features are jointly fused with the 2D features obtained from the original input, and the fused features are classified by support vector machines (SVMs). The results show that the proposed approach achieves state-of-the-art recognition performance on the Real-World Affective Faces (RAF), Static Facial Expressions in the Wild (SFEW 2.0), and AffectNet datasets. The approach is also applied to a 3D FER dataset, BU-3DFE, to compare the effectiveness of reconstructed and native 3D face data for FER. This is the first time such a deep learning combination of 3D and 2D facial modalities has been presented in the context of in-the-wild FER.
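To make the fusion-and-classification step concrete, the sketch below shows one plausible reading of the pipeline: a CNN extracts features from both the original 2D face image and the 2D projection of its reconstructed 3D face, the two feature vectors are concatenated, and an SVM classifies the fused vector. The ResNet-18 backbone, the 512-dimensional features, and the use of scikit-learn's SVC are assumptions for illustration; the abstract does not specify the architectures used.

```python
# Hypothetical sketch of the 2D/3D feature-fusion FER pipeline.
# Assumptions (not from the paper): ResNet-18 backbone, 512-d features,
# linear-kernel SVC, and placeholder data loading.
import torch
import torchvision.models as models
import torchvision.transforms as T
from sklearn.svm import SVC
from PIL import Image

# Pretrained CNN used as a generic feature extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()  # expose the 512-d pooled features
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(image: Image.Image) -> torch.Tensor:
    """Return a 512-d feature vector for one face image."""
    x = preprocess(image.convert("RGB")).unsqueeze(0)
    return backbone(x).squeeze(0)

def fuse(image_2d: Image.Image, projected_3d: Image.Image) -> torch.Tensor:
    """Concatenate features of the original 2D face and the 2D projection
    of its reconstructed 3D face (the joint-fusion step in the abstract)."""
    return torch.cat([extract_features(image_2d),
                      extract_features(projected_3d)])

# Training: fused 1024-d vectors -> expression labels, classified by an SVM.
# `train_pairs` and `labels` stand in for a dataset such as RAF or AffectNet.
# clf = SVC(kernel="linear")
# X = torch.stack([fuse(img, proj) for img, proj in train_pairs]).numpy()
# clf.fit(X, labels)
```

The separate feature extractor and SVM mirror the abstract's two-stage design, where deep features are learned first and a conventional classifier operates on the fused representation.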
Keywords:
This article is indexed in ScienceDirect and other databases.