基于对比学习的细粒度遮挡人脸表情识别 Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

基于对比学习的细粒度遮挡人脸表情识别

引用本文：	奚琰.基于对比学习的细粒度遮挡人脸表情识别[J].计算机系统应用,2022,31(11):175-183.

作者姓名：	奚琰

作者单位：	江苏大学计算机科学与通信工程学院, 镇江 212013

摘要：	和实验室环境不同, 现实生活中的人脸表情图像场景复杂, 其中最常见的局部遮挡问题会造成面部外观的显著改变, 使得模型提取到的全局特征包含与情感无关的冗余信息从而降低了判别力. 针对此问题, 本文提出了一种结合对比学习和通道-空间注意力机制的人脸表情识别方法, 学习各局部显著情感特征并关注局部特征与全局特征之间的关系. 首先引入对比学习, 通过特定的数据增强方法设计新的正负样本选取策略, 对大量易获得的无标签情感数据进行预训练, 学习具有感知遮挡能力的表征, 再将此表征迁移到下游人脸表情识别任务以提高识别性能. 在下游任务中, 将每张人脸图像的表情分析问题转化为多个局部区域的情感检测问题, 使用通道-空间注意力机制学习人脸不同局部区域的细粒度注意力图, 并对加权特征进行融合, 削弱遮挡内容带来的噪声影响, 最后提出约束损失联合训练, 优化最终用于分类的融合特征. 实验结果表明, 无论是在公开的非遮挡人脸表情数据集(RAF-DB和FER2013)还是人工合成的遮挡人脸表情数据集上, 所提方法都取得了与现有先进方法可媲美的结果.
关键词：	人脸表情识别对比学习局部遮挡注意力机制深度学习
收稿时间：	2022/3/5 0:00:00
修稿时间：	2022/4/12 0:00:00
Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning

XI Yan.Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning[J].Computer Systems& Applications,2022,31(11):175-183.

Authors:	XI Yan

Affiliation:	School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China

Abstract:	Different from the laboratory environment, the scenes of facial expression images in real life are complex, and local occlusion, the most common problem, will cause a significant change in the facial appearance. As a result, the global feature extracted by a model contains redundant information unrelated to emotions, which reduces the discrimination of the model. Considering this problem, a facial expression recognition method combining contrastive learning and the channel-spatial attention mechanism is proposed in this study, which learns local salient emotion features and pays attention to the relationship between local features and global features. Firstly, contrastive learning is introduced. A new positive and negative sample selection strategy is designed through a specific data augmentation method, and a large amount of easily accessible unlabeled emotion data is pre-trained to learn the representation with occlusion-aware ability. Then, the representation is transferred to the downstream facial expression recognition task to improve recognition performance. In the downstream task, the expression analysis of each face image is transformed into the emotion detection of multiple local regions. The fine-grained attention maps of different local regions of a face are learned using the channel-spatial attention mechanism, and the weighted features are fused to weaken the noise effect caused by the occlusion content. Finally, the constraint loss for joint training is proposed to optimize the final fusion feature for classification. The experimental results indicate that the proposed method achieves comparable results to existing state-of-the-art methods on both public non-occluded facial expression datasets (RAF-DB and FER2013) and synthetic occluded facial expression datasets.

Keywords:	facial expression recognition contrastive learning local occlusion attention mechanism deep learning

	点击此处可从《计算机系统应用》浏览原始摘要信息
	点击此处可从《计算机系统应用》下载全文

设为首页 | 免责声明 | 关于勤云 | 加入收藏