
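Facial Expression Recognition in Video Sequences Based on a Hierarchical Attention Model
Cite this article: Wang Xiaohua, Pan Lijuan, Peng Muzi, Hu Min, Jin Chunhua, Ren Fuji. Facial Expression Recognition in Video Sequences Based on a Hierarchical Attention Model[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(1): 27-35.
Authors: Wang Xiaohua  Pan Lijuan  Peng Muzi  Hu Min  Jin Chunhua  Ren Fuji
Affiliations: School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; Jiangsu Provincial Engineering Laboratory for Internet of Things and Mobile Internet Technology, Huaiyin Institute of Technology, Huai'an 223001; School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; Graduate School of Advanced Technology and Science, University of Tokushima, Tokushima 7708502
Funding: Laboratory open fund; National Natural Science Foundation of China
Abstract: Long short-term memory (LSTM) networks are widely used for facial expression recognition in video sequences. A single-layer LSTM has limited representational capacity, and its generalization ability is easily constrained on complex problems. To address this, a hierarchical attention model is proposed: a stacked LSTM learns a hierarchical representation of the time-series data, a self-attention mechanism builds differentiated relationships among the layers, and a penalty term is constructed and combined with the loss function to further optimize the network structure and improve network performance. Experimental results on the CK+ and MMI datasets show that, because well-formed hierarchical features are built, each time step selects information from the feature levels of greater interest. Compared with an ordinary single-layer LSTM, the hierarchical attention model expresses the emotional information of video sequences more effectively.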

Keywords: video sequences  facial expression recognition  stacked long short-term memory network  self-attention mechanism

Video Emotion Recognition Based on Hierarchical Attention Model
Wang Xiaohua, Pan Lijuan, Peng Muzi, Hu Min, Jin Chunhua, Ren Fuji. Video Emotion Recognition Based on Hierarchical Attention Model[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(1): 27-35.
Authors:Wang Xiaohua  Pan Lijuan  Peng Muzi  Hu Min  Jin Chunhua  Ren Fuji
Affiliation: School of Computer Science and Information Engineering, School of Artificial Intelligence, Hefei University of Technology, Hefei 230601; The Laboratory for Internet of Things and Mobile Internet Technology of Jiangsu Province, Huaiyin Institute of Technology, Huai'an 223001; Graduate School of Advanced Technology & Science, University of Tokushima, Tokushima 7708502
Abstract: LSTM networks are widely used for facial expression recognition in video sequences. Because a single-layer LSTM has limited representational capacity and its generalization ability is constrained on complex problems, a hierarchical attention model is proposed. A stacked LSTM learns a hierarchical representation of the time-series data, a self-attention mechanism constructs differentiated hierarchical relationships, and a penalty term is constructed and combined with the loss function to optimize network performance. Experiments on the CK+ and MMI datasets demonstrate that, owing to the well-constructed hierarchical features, each time step can select information from the feature levels of greater interest. Compared with an ordinary single-layer LSTM, the hierarchical attention model expresses the emotional information of video sequences more effectively.
Keywords:video sequences  facial expression recognition  stacked long short-term memory network  self-attention mechanism
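
The abstract describes each time step selecting information from the layers of a stacked LSTM through attention. The sketch below illustrates only that fusion step, in plain Python, under stated assumptions: the per-layer hidden states are taken as already computed, and the names `attend_over_layers` and `query` are hypothetical. Scoring each layer by a dot product with a single vector is an illustrative stand-in, not the scoring function used by the authors. The penalty term is left out because the abstract does not give its form.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_over_layers(layer_states, query):
    """Fuse the per-layer hidden states of one time step.

    layer_states: one hidden vector per stacked-LSTM layer (same length each).
    query: hypothetical scoring vector (it would be learned in a real model).
    Returns (weights over layers, attention-weighted sum of the states).
    """
    # Score each layer by the dot product of its state with the query.
    scores = [sum(h * q for h, q in zip(state, query)) for state in layer_states]
    weights = softmax(scores)
    dim = len(layer_states[0])
    # Weighted sum across layers, computed one hidden dimension at a time.
    fused = [sum(w * state[k] for w, state in zip(weights, layer_states))
             for k in range(dim)]
    return weights, fused

# Toy input: 3 layers, hidden size 2, made-up values for a single time step.
layer_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, fused = attend_over_layers(layer_states, query)
print(weights)  # the third layer scores highest, so it gets the largest weight
print(fused)
```

In a full model this step would run at every time step of the sequence, with the scoring parameters trained jointly with the LSTM layers.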
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.