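Action Recognition under Grenander Temporal Structure Learning and Inference Optimization
Cite this article:WU Ke-Wei, GAO Tao, XIE Zhao, GUO Wen-Bin. Action Recognition under Grenander Temporal Structure Learning and Inference Optimization[J]. Journal of Software, 2022, 33(5): 1865-1879.
Authors:WU Ke-Wei  GAO Tao  XIE Zhao  GUO Wen-Bin
Affiliation:Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601, Anhui, China; Anhui Province Key Laboratory of Affective Computing & Advanced Intelligent Machine (Hefei University of Technology), Hefei 230601, Anhui, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, Anhui, China
Funding:National Key Research and Development Program of China (2017YFB1002203); National Natural Science Foundation of China (61503111); Natural Science Foundation of Anhui Province (1808085MF168); Fundamental Research Funds for the Central Universities (PA2020GDSK0059)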
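Abstract:Existing action recognition methods that model the temporal structure of the whole video suffer from interference by temporal noise and ambiguous information, which leads to misclassified actions. To address this problem, a novel temporal graph model with Grenander inference (TGM-GI) is proposed. First, a 3D CNN-LSTM module is constructed, in which the 3D CNN extracts the dynamic features of an action and the LSTM module optimizes the temporal dependencies of those features. Second, on top of this deep module, a temporal graph model for action recognition is built using Grenander theory, and two modules are designed to handle the temporal redundancy of slow actions and the interference of abnormal actions, respectively, yielding temporal structure proposals with temporal noise suppressed. Next, a Grenander measure that fuses feature constraints and semantic constraints is designed, and a temporally incremental form of the Viterbi algorithm is proposed to correct the ambiguous information in the temporal pattern of an action. Finally, a pattern matching method based on dynamic time warping completes the recognition task on the resulting temporal patterns. On two widely recognized datasets, UCF101 and Olympic Sports, the method achieves the best recognition accuracy among a range of existing deep-learning-based action recognition methods. The method outperforms the baseline 3D CNN-LSTM method; on the UCF101 dataset, the recognition...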

Keywords:action recognition  temporal pattern  Grenander temporal graph model  deep model  dynamic time warping
Received:2020/5/8
Revised:2020/6/27

Temporal Structure Learning with Grenander Inference for Action Recognition
WU Ke-Wei,GAO Tao,XIE Zhao,GUO Wen-Bin.Temporal Structure Learning with Grenander Inference for Action Recognition[J].Journal of Software,2022,33(5):1865-1879.
Authors:WU Ke-Wei  GAO Tao  XIE Zhao  GUO Wen-Bin
Affiliation:Key Laboratory of Knowledge Engineering with Big Data (Hefei University of Technology), Ministry of Education, Hefei 230601, China;Anhui Province Key Laboratory of Affective Computing & Advanced Intelligent Machine (Hefei University of Technology), Hefei 230601, China;School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
Abstract:Action recognition is a crucial and challenging task in computer vision. Most existing methods model the temporal structure of the whole video and ignore its temporal noise and ambiguity, which leads to recognition failures. To address this problem, a novel temporal graph model with Grenander inference, TGM-GI, is proposed. First, a 3D CNN-LSTM module is constructed to learn deep features: the 3D CNN extracts the dynamic features of video clips, and the LSTM optimizes the temporal dependence between clip features. Second, a temporal graph model is built on these deep features using the generator space of Grenander theory. The original temporal pattern is modified by two operations: a combination operation removes redundant clips such as slow motion, and a denoising operation removes low-frequency clips such as abnormal motion. Third, an incremental Viterbi algorithm is proposed for temporal pattern learning with Grenander inference, using a Grenander measure designed with both feature bonds and semantic bonds. Finally, dynamic time warping matches the Grenander temporal pattern of a test video against the Grenander temporal patterns of the training set to predict the label of the test video. Experimental results show that TGM-GI outperforms state-of-the-art methods on two widely used datasets. Compared with the 3D CNN-LSTM baseline, TGM-GI improves accuracy by 6.41% on the UCF101 dataset and by 5.67% on the Olympic Sports dataset.
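The pattern-cleaning and matching steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each clip has already been reduced to a discrete label, stands in for the combination operation by merging runs of identical labels, stands in for the denoising operation by dropping labels that occur fewer than a hypothetical `min_count` times, and uses textbook dynamic time warping with a 0/1 mismatch cost instead of the Grenander measure. All function names and the toy labels are illustrative.

```python
from collections import Counter
from itertools import groupby


def combine(labels):
    """Merge runs of identical consecutive clip labels (redundant clips)."""
    return [key for key, _ in groupby(labels)]


def denoise(labels, min_count=2):
    """Drop clip labels occurring fewer than min_count times (assumed threshold)."""
    counts = Counter(labels)
    return [lab for lab in labels if counts[lab] >= min_count]


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with a 0/1 mismatch cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] = cheapest alignment cost of a[:i] with b[:j]
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else 1.0
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def classify(test_pattern, training_patterns):
    """Return the action label of the training pattern nearest under DTW."""
    nearest = min(training_patterns,
                  key=lambda item: dtw_distance(test_pattern, item[0]))
    return nearest[1]


# Toy usage: hypothetical per-clip labels for one test video.
raw = ["run", "run", "run", "fall", "run", "jump", "jump", "land", "land"]
pattern = combine(denoise(raw))
training = [
    (["run", "jump", "land"], "long_jump"),
    (["run", "throw"], "javelin"),
]
predicted = classify(pattern, training)
```

In this toy run the single `fall` clip is treated as a low-frequency outlier and removed before repeated clips are merged, so the test pattern aligns with the `long_jump` training pattern at zero cost. A real system would compare feature vectors or generator configurations rather than string labels.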