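Violence Recognition Based on View Confidence and Attention
Cite this article:XIA Liang-Wei, ZHU Ming. Violence Recognition Based on View Confidence and Attention[J]. Computer Systems & Applications, 2023, 32(9): 211-220.
Authors:XIA Liang-Wei  ZHU Ming
Affiliation:School of Information Science and Technology, University of Science and Technology of China, Hefei 230026
Funding:Science and Technology Innovation Special Zone Program (20-163-14-LZ-001-004-01)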
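Abstract:Violent behavior is prone to occlusion, so its recognition accuracy is low. Some existing algorithms add multi-view video input to address occlusion and fuse the data from all views with equal weights, but videos from different views inherently differ in how much they help recognition because of differences in shooting distance and occlusion. To address this problem, this paper proposes a violence recognition method based on view confidence and attention to improve the accuracy of violence recognition. The input of the temporal difference module (TDM) is extended to multiple views, and a channel attention mechanism is applied along the segment dimension to strengthen cross-segment feature extraction in TDM. A background suppression method highlights the texture features of moving targets and is used to compute a confidence for each view's images. Bilinear pooling is introduced to fuse the multi-view video features, and the weights of each view's local features are assigned according to the view confidence. The method is validated on the public dataset CASIA-Action and on a self-built dataset. Experiments show that the proposed view confidence method outperforms the unmodified bilinear pooling method, and that its violence recognition accuracy is higher than that of existing action recognition methods.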

Keywords:violence recognition  attention  bilinear pooling  view confidence
Received:2023/2/20
Revised:2023/3/20

Violence Recognition Based on View Confidence and Attention
XIA Liang-Wei, ZHU Ming. Violence Recognition Based on View Confidence and Attention[J]. Computer Systems & Applications, 2023, 32(9): 211-220.
Authors:XIA Liang-Wei  ZHU Ming
Affiliation:School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China
Abstract:Violent behavior is easily occluded, which lowers recognition accuracy. Some existing algorithms add multi-view video input to address occlusion and fuse the data from all views with equal weights. However, videos from different views contribute unequally to recognition because of differences in shooting distance and occlusion. To solve this problem, this study proposes a violence recognition method based on view confidence and attention to improve the accuracy of violence recognition. The input of the temporal difference module (TDM) is extended to multiple views. A channel attention mechanism is applied along the segment dimension to strengthen cross-segment feature extraction in TDM. A background suppression method is used to highlight the texture features of moving objects and to compute a confidence for each view's images. Bilinear pooling is introduced to fuse the multi-view video features, and the weights of each view's local features are assigned according to the view confidence. The method is validated on the public dataset CASIA-Action and on a self-built dataset. Experiments show that the proposed view confidence method outperforms the unmodified bilinear pooling method, and that its violence recognition accuracy is higher than that of existing action recognition methods.
Keywords:violence recognition  attention  bilinear pooling  view confidence
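
The abstract does not give the fusion formula, so the following is only a minimal sketch of one way view confidences could weight local features before bilinear pooling. Everything in it is an assumption for illustration rather than the authors' formulation: the function names `normalize_confidences` and `confidence_weighted_bilinear_pool`, the sum-to-one normalization of confidences, the weighted sum of per-location features across views, and the signed square root plus L2 normalization applied to the pooled descriptor.

```python
import math


def normalize_confidences(confidences):
    """Scale raw view confidences so they sum to 1 (assumed normalization)."""
    total = sum(confidences)
    if total <= 0:
        raise ValueError("at least one view confidence must be positive")
    return [c / total for c in confidences]


def confidence_weighted_bilinear_pool(views, confidences):
    """Fuse multi-view local features with confidence weights, then pool.

    views: one entry per view; each view is a list of local feature
           vectors (same number of locations and same dimension per view).
    confidences: one non-negative confidence score per view.
    Returns a flattened dim*dim descriptor.
    """
    weights = normalize_confidences(confidences)
    num_locations = len(views[0])
    dim = len(views[0][0])
    pooled = [[0.0] * dim for _ in range(dim)]
    for loc in range(num_locations):
        # Assumed weighting: confidence-weighted sum of the views' features
        # at this location, so low-confidence views contribute less.
        fused = [
            sum(w * view[loc][d] for w, view in zip(weights, views))
            for d in range(dim)
        ]
        # Bilinear pooling: accumulate the outer product over locations.
        for i in range(dim):
            for j in range(dim):
                pooled[i][j] += fused[i] * fused[j]
    flat = [v for row in pooled for v in row]
    # Signed square root followed by L2 normalization.
    flat = [math.copysign(math.sqrt(abs(v)), v) for v in flat]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

With equal confidences this reduces to equal-weight fusion of the views, the baseline the abstract argues against. Raising one view's confidence shifts the pooled descriptor toward that view's features.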