Video-based person re-identification method combining evenly sampling-random erasing and global temporal feature pooling
Citation: CHEN Li, WANG Hongyuan, ZHANG Yunpeng, CAO Liang, YIN Yuchang. Video-based person re-identification method combining evenly sampling-random erasing and global temporal feature pooling [J]. Journal of Computer Applications, 2021, 41(1): 164-169.
Authors: CHEN Li  WANG Hongyuan  ZHANG Yunpeng  CAO Liang  YIN Yuchang
Affiliation: School of Computer Science and Artificial Intelligence (Aliyun School of Big Data), Changzhou University, Changzhou, Jiangsu 213164, China
Funding: National Natural Science Foundation of China
Abstract: To address the low accuracy of video-based person re-identification caused by occlusion, background interference, and the similarity of pedestrian appearance and posture in video surveillance, a video-based person re-identification method combining evenly sampling-random erasing and global temporal feature pooling was proposed. First, for cases where the target pedestrian is disturbed or partially occluded, an evenly sampling-random erasing (ESE) data augmentation method was adopted to effectively alleviate occlusion, improve the generalization ability of the model, and match pedestrians more accurately. Second, to further improve the accuracy of video-based person re-identification and learn more discriminative feature representations, a three-dimensional convolutional neural network (3DCNN) was used to extract spatio-temporal features, and a global temporal feature pooling (GTFP) layer was added before the network outputs the person feature representation, which captures contextual spatial information while refining the temporal information between frames. Extensive experiments on three public video datasets, MARS, DukeMTMC-VideoReID and PRID-2011, show that the proposed method combining evenly sampling-random erasing and global temporal feature pooling is competitive with some state-of-the-art video-based person re-identification methods.
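The evenly sampling-random erasing (ESE) augmentation described above can be sketched roughly as follows. This is a minimal reconstruction of the idea, not the authors' released code: the tracklet is split into equal-length segments, one frame is drawn from each segment, and a random rectangle in each chosen frame is erased to mimic occlusion. The function names (evenly_sample, random_erase) and the erasing hyper-parameters are illustrative assumptions.

import random
import torch

def evenly_sample(num_frames: int, num_chunks: int) -> list[int]:
    """Pick one random frame index from each of `num_chunks` equal segments."""
    bounds = torch.linspace(0, num_frames, num_chunks + 1).long()
    return [int(torch.randint(int(lo), max(int(hi), int(lo) + 1), (1,)))
            for lo, hi in zip(bounds[:-1], bounds[1:])]

def random_erase(frame: torch.Tensor, p: float = 0.5,
                 area_range=(0.02, 0.2), aspect_range=(0.3, 3.3)) -> torch.Tensor:
    """Erase one random rectangle of a (C, H, W) frame with probability `p`."""
    if random.random() > p:
        return frame
    c, h, w = frame.shape
    for _ in range(10):                        # retry until a valid box is found
        area = random.uniform(*area_range) * h * w
        aspect = random.uniform(*aspect_range)
        eh = int(round((area * aspect) ** 0.5))
        ew = int(round((area / aspect) ** 0.5))
        if eh < h and ew < w:
            top, left = random.randint(0, h - eh), random.randint(0, w - ew)
            frame[:, top:top + eh, left:left + ew] = torch.rand(c, eh, ew)
            return frame
    return frame

# Usage: sample 8 frames from a 120-frame tracklet and erase each one independently.
tracklet = torch.rand(120, 3, 256, 128)        # (frames, C, H, W), dummy data
clip = torch.stack([random_erase(tracklet[i].clone())
                    for i in evenly_sample(120, 8)])
print(clip.shape)                              # torch.Size([8, 3, 256, 128])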

Keywords: video-based person re-identification  three-dimensional convolutional neural network  global temporal feature representation  evenly sampling-random erasing  data augmentation
Received: 2020-05-31
Revised: 2020-07-16

Video-based person re-identification method combining evenly sampling-random erasing and global temporal feature pooling
CHEN Li, WANG Hongyuan, ZHANG Yunpeng, CAO Liang, YIN Yuchang. Video-based person re-identification method combining evenly sampling-random erasing and global temporal feature pooling [J]. Journal of Computer Applications, 2021, 41(1): 164-169.
Authors: CHEN Li  WANG Hongyuan  ZHANG Yunpeng  CAO Liang  YIN Yuchang
Affiliation: School of Computer Science and Artificial Intelligence (Aliyun School of Big Data), Changzhou University, Changzhou, Jiangsu 213164, China
Abstract: In order to solve the problem of low accuracy of video-based person re-identification caused by factors such as occlusion, background interference, and similarity of person appearance and posture in video surveillance, a video-based person re-identification method combining Evenly Sampling-random Erasing (ESE) and global temporal feature pooling was proposed. Firstly, for situations where the target person is disturbed or partially occluded, the data augmentation method of evenly sampling-random erasing was adopted to effectively alleviate the occlusion problem and improve the generalization ability of the model, so as to match persons more accurately. Secondly, to further improve the accuracy of video-based person re-identification and learn more discriminative feature representations, a 3D Convolutional Neural Network (3DCNN) was used to extract temporal and spatial features, and a Global Temporal Feature Pooling (GTFP) layer was added to the network before the output of person feature representations, so as to capture contextual spatial information and refine the temporal information between frames. Extensive experiments conducted on three public video datasets, MARS, DukeMTMC-VideoReID and PRID-2011, show that the method combining evenly sampling-random erasing and global temporal feature pooling is competitive compared with some state-of-the-art video-based person re-identification methods.
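A rough sketch of how a global temporal feature pooling (GTFP) head could sit on top of a 3D-CNN backbone, under stated assumptions rather than as the paper's definitive implementation: the backbone returns an (N, C, T, H, W) spatio-temporal feature map, spatial pooling yields one vector per frame, and the frame vectors are pooled over time into a single clip-level descriptor used for matching. The class names, the BNNeck-style normalisation, and the choice of mean pooling over time are illustrative; the paper's actual GTFP layer may aggregate frames differently.

import torch
import torch.nn as nn

class GlobalTemporalFeaturePooling(nn.Module):
    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, T, H, W) from a 3D convolutional backbone
        frame_feats = feat.mean(dim=(3, 4))     # spatial pooling -> (N, C, T)
        return frame_feats.mean(dim=2)          # temporal pooling -> (N, C)

class VideoReIDHead(nn.Module):
    """Clip-level embedding plus identity logits on top of pooled 3D-CNN features."""
    def __init__(self, in_channels: int, num_identities: int):
        super().__init__()
        self.pool = GlobalTemporalFeaturePooling()
        self.bn = nn.BatchNorm1d(in_channels)   # BNNeck-style normalisation (assumed)
        self.classifier = nn.Linear(in_channels, num_identities)

    def forward(self, feat: torch.Tensor):
        embedding = self.bn(self.pool(feat))    # (N, C) clip descriptor for matching
        return embedding, self.classifier(embedding)

# Usage with a dummy backbone output: batch of 4 clips, 512 channels, 8 frames.
head = VideoReIDHead(in_channels=512, num_identities=625)
embedding, logits = head(torch.rand(4, 512, 8, 16, 8))
print(embedding.shape, logits.shape)            # torch.Size([4, 512]) torch.Size([4, 625])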
Keywords: video-based person re-identification  3D Convolutional Neural Network (3DCNN)  global temporal feature representation  Evenly Sampling-random Erasing (ESE)  data augmentation