A dual-channel ensembled deep convolutional neural network for facial expression recognition in the wild |
| |
Authors: | Sumeet Saurav Ravi Saini Sanjay Singh |
| |
Affiliation: | 1. Faculty of Engineering Sciences, Academy of Scientific & Innovative Research (AcSIR), Ghaziabad, Uttar 2. Pradesh, India;3. Pradesh, India Intelligent Systems Group, CSIR-Central Electronics Engineering Research Institute, Pilani, Rajasthan, India |
| |
Abstract: | Facial expression recognition (FER) in the wild is an active and challenging field of research. A system for automatic FER finds use in a wide range of applications related to advanced human–computer interaction (HCI), human–robot interaction (HRI), human behavioral analysis, gaming and entertainment, etc. Since their inception, convolutional neural networks (CNNs) have attained state-of-the-art accuracy in the facial analysis task. However, recognizing facial expressions in the wild with high confidence running on a low-cost embedded device remains challenging. To this end, this study presents an efficient dual-channel ensembled deep CNN (DCE-DCNN) for FER in the wild. Initially, two DCNNs, namely the and , are trained separately on the grayscale and Scharr-convolved vertical gradient facial images, respectively. The proposed network later integrates the two pre-trained DCNNs to obtain the dual-channel integrated DCNN (DCI-DCNN). Finally, all three neural networks, namely the , , and DCI-DCNN, are jointly fine-tuned to get a single dual-channel-multi-output model. The multi-output model produces three prediction scores for the given input facial image. The prediction scores are thus fused using the max-voting ensemble scheme to obtain the DCE-DCNN with the final classification label. On the FER2013, RAF-DB, NCAER-S, AffectNet, and CKPlus benchmark FER datasets, the proposed DCE-DCNN consistently outperforms the two individual DCNNs and numerous state-of-the-art CNNs. Moreover, the network achieves competitive recognition accuracy on all four FER in the wild datasets with reduced memory storage size and parameters. The proposed DCE-DCNN model with high throughput on resource-limited embedded devices is suitable for applications that seek real-time classification of facial expressions in the wild with high confidence. |
| |
Keywords: | deep convolutional neural network dual-channel ensemble DCNN facial expression recognition joint fine-tuning |
|
|