Video Emotion Recognition Based on Hierarchical Attention Model


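Cite this article:	Wang Xiaohua, Pan Lijuan, Peng Muzi, Hu Min, Jin Chunhua, Ren Fuji. Video Emotion Recognition Based on Hierarchical Attention Model[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(1): 27-35.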

Author names:	Wang Xiaohua, Pan Lijuan, Peng Muzi, Hu Min, Jin Chunhua, Ren Fuji

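Author affiliations:	School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; The Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huai’an 223001; School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; Graduate School of Advanced Technology & Science, University of Tokushima, Tokushima 7708502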

Funding:	Laboratory open fund; National Natural Science Foundation of China

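Abstract:	Long short-term memory (LSTM) networks are widely used for facial expression recognition in video sequences. A single-layer LSTM has limited representational capacity, and its generalization ability is easily constrained on complex problems. To address this shortcoming, a hierarchical attention model is proposed: a stacked LSTM learns a layered representation of the time-series data, a self-attention mechanism builds differentiated relationships among the layers, and a penalty term is constructed and combined with the loss function to further optimize the network structure and improve performance. Experimental results on the CK+ and MMI datasets show that, because well-formed layer-level features are built, every time step selects information from the feature levels of greater interest. Compared with an ordinary single-layer LSTM, the hierarchical attention model expresses the emotional information of video sequences more effectively.
Keywords:	video sequences; facial expression recognition; stacked long short-term memory network; self-attention mechanism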
Video Emotion Recognition Based on Hierarchical Attention Model

Wang Xiaohua, Pan Lijuan, Peng Muzi, Hu Min, Jin Chunhua, Ren Fuji. Video Emotion Recognition Based on Hierarchical Attention Model[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(1): 27-35.

Authors:	Wang Xiaohua, Pan Lijuan, Peng Muzi, Hu Min, Jin Chunhua, Ren Fuji

Affiliation:	School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; The Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huai’an 223001; Graduate School of Advanced Technology & Science, University of Tokushima, Tokushima 7708502

Abstract:	LSTM networks are widely used for facial expression recognition in video sequences. Because a single-layer LSTM has limited representation ability, and its generalization ability is limited when solving complex problems, a hierarchical attention model is proposed. A hierarchical representation of time-series data is learned by stacking LSTMs, a self-attention mechanism is used to construct differentiated hierarchical relationships, and a penalty term is constructed and combined with the loss function to optimize network performance. Experiments on the CK+ and MMI datasets demonstrate that, owing to the well-constructed hierarchical features, each time step can select information from the feature levels of greater interest. Compared with an ordinary single-layer LSTM, the hierarchical attention model expresses the emotional information of video sequences more effectively.

Keywords:	video sequences; facial expression recognition; stacked long short-term memory network; self-attention mechanism
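
The abstract describes attention over the layers of a stacked LSTM plus a penalty term added to the loss, but gives no formulas. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the per-layer hidden states at each time step are already available, scores each layer with a dot product against a hypothetical `query` vector, and uses the mean entropy of the layer weights as a stand-in penalty. The names `attend_over_layers`, `entropy_penalty`, `total_loss` and the coefficient `lam` are all illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_layers(layer_states, query):
    """Fuse the hidden states of all stacked layers at one time step.

    layer_states: list of L vectors, one per LSTM layer (assumed given).
    query: hypothetical scoring vector; a real model would learn it.
    Returns the L attention weights and the weighted sum of the states.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in layer_states]
    weights = softmax(scores)
    dim = len(layer_states[0])
    fused = [
        sum(w * state[d] for w, state in zip(weights, layer_states))
        for d in range(dim)
    ]
    return weights, fused


def entropy_penalty(weights_per_step):
    """Stand-in penalty (an assumption): mean entropy of the layer weights.

    Lower entropy means each time step commits to fewer layers.
    """
    total = 0.0
    for weights in weights_per_step:
        total -= sum(w * math.log(w) for w in weights if w > 0.0)
    return total / len(weights_per_step)


def total_loss(task_loss, weights_per_step, lam=0.1):
    """Task loss plus the weighted penalty; lam is a hypothetical coefficient."""
    return task_loss + lam * entropy_penalty(weights_per_step)


# Toy input: two time steps, three stacked layers, hidden size 2.
sequence = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.5, 0.5], [2.0, 0.0], [0.0, 0.0]],
]
query = [1.0, 0.0]
weights_per_step = [attend_over_layers(step, query)[0] for step in sequence]
loss = total_loss(task_loss=1.0, weights_per_step=weights_per_step)
```

Each time step gets its own weight vector over the layers, which is one way to read the abstract's statement that every step selects information from the feature levels of greater interest. The penalty actually used in the paper may differ from the entropy term chosen here.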
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
