期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Predictively encoded graph convolutional network for noise-robust skeleton-based action recognition

Yoon Yongsang Yu Jongmin Jeon Moongu 《Applied Intelligence》2022,52(3):2317-2331

Applied Intelligence - In skeleton-based action recognition, graph convolutional networks (GCNs), which model human body skeletons using graphical components such as nodes and connections, have... 相似文献

2.

Triplet attention multiple spacetime-semantic graph convolutional network for skeleton-based action recognition

Sun Yanjing Huang Han Yun Xiao Yang Bin Dong Kaiwen 《Applied Intelligence》2022,52(1):113-126

Skeleton-based action recognition has recently attracted widespread attention in the field of computer vision. Previous studies on skeleton-based action recognition are susceptible to interferences from redundant video frames in judging complex actions but ignore the fact that the spatial-temporal features of different actions are extremely different. To solve these problems, we propose a triplet attention multiple spacetime-semantic graph convolutional network for skeleton-based action recognition (AM-GCN), which can not only capture the multiple spacetime-semantic feature from the video images to avoid limited information diversity from single-layer feature representation but can also improve the generalization ability of the network. We also present the triplet attention mechanism to apply an attention mechanism to different key points, key channels, and key frames of the actions, improving the accuracy and interpretability of the judgement of complex actions. In addition, different kinds of spacetime-semantic feature information are combined through the proposed fusion decision for comprehensive prediction in order to improve the robustness of the algorithm. We validate AM-GCN with two standard datasets, NTU-RGBD and Kinetics, and compare it with other mainstream models. The results show that the proposed model achieves tremendous improvement.

相似文献

3.

Dual-domain graph convolutional networks for skeleton-based action recognition

Chen Shuo Xu Ke Mi Zhongjie Jiang Xinghao Sun Tanfeng 《Machine Learning》2022,111(7):2381-2406

Machine Learning - Skeleton-based action recognition is attracting more and more attention owing to the general representation ability of skeleton data. The Graph Convolutional Networks (GCNs)... 相似文献

4.

Two-stream adaptive-attentional subgraph convolution networks for skeleton-based action recognition

Li Xianshan Meng Fengchan Zhao Fengda Guo Dingding Lou Fengwei Jing Rong 《Multimedia Tools and Applications》2022,81(4):4821-4838

Multimedia Tools and Applications - Recently, skeleton-based action recognition has modeled the human skeleton as a graph convolution network (GCN), and has achieved remarkable results. However,... 相似文献

5.

轻量级多信息图卷积神经网络动作识别方法

井望李汪根沈公仆范宝珠《计算机应用研究》2022,39(4):1247-1252

针对如何在保持低参数量和低计算量前提下构建高性能模型的问题,提出一种轻量级多信息图卷积神经网络（LMI-GCN）。LMI-GCN通过将关节坐标、关节速度、骨骼边、骨骼边速度四种不同信息编码至高维空间的方式进行信息融合,并引入可以聚合重要特征的多通道自适应图和分流时间卷积块以减少模型参数量。同时,提出一种随机池数据预处理方法。在NTU-RGB+D120数据集上与基线方法SGN（语义引导神经网络）相比,在两种评估设置cross-subject和cross-setup上提高5.4%和4.7%。实验结果表明,LMI-GCN性能高于SGN。相似文献

6.

基于时空注意力图卷积网络模型的人体骨架动作识别算法

李扬志袁家政刘宏哲《计算机应用》2021,41(7):1915-1921

针对现有的人体骨架动作识别算法不能充分发掘运动的时空特征问题,提出一种基于时空注意力图卷积网络(STA-GCN)模型的人体骨架动作识别算法.该模型包含空间注意力机制和时间注意力机制:空间注意力机制一方面利用光流特征中的瞬时运动信息定位运动显著的空间区域,另一方面在训练过程中引入全局平均池化及辅助分类损失使得该模型可以关... 相似文献

7.

Skeleton-based action recognition with temporal action graph and temporal adaptive graph convolution structure

Cao Yi Liu Chen Huang Zilong Sheng Yongjian Ju Yongjian 《Multimedia Tools and Applications》2021,80(19):29139-29162

Skeleton-based action recognition has recently achieved much attention since they can robustly convey the action information. Recently, many studies have shown that graph convolutional networks (GCNs), which generalize CNNs to more generic non-Euclidean structures, are more exactly extracts spatial feature. Nevertheless, how to effectively extract global temporal features is still a challenge. In this work, firstly, a unique feature named temporal action graph is designed. It first attempts to express timing relationship with the form of graph. Secondly, temporal adaptive graph convolution structure (T-AGCN) are proposed. Through generating global adjacency matrix for temporal action graph, it can flexibly extract global temporal features in temporal dynamics. Thirdly, we further propose a novel model named spatial-temporal adaptive graph convolutional network (ST-AGCN) for skeletons-based action recognition to extract spatial-temporal feature and improve action recognition accuracy. ST-AGCN combines T-AGCN with spatial graph convolution to make up for the shortage of T-AGCN for spatial structure. Besides, ST-AGCN uses dual features to form a two-stream network which is able to further improve action recognition accuracy for hard-to-recognition sample. Finally, comparsive experiments on the two skeleton-based action recognition datasets, NTU-RGBD and SBU, demonstrate that T-AGCN and temporal action graph can effective explore global temporal information and ST-AGCN achieves certain improvement of recognition accuracy on both datasets.

相似文献

8.

融合时空图卷积的多人交互行为识别

下载免费PDF全文

成科扬吴金霞王文杉荣兰詹永照《中国图象图形学报》2021,26(7):1681-1691

目的多人交互行为的识别在现实生活中有着广泛应用。现有的关于人类活动分析的研究主要集中在对单人简单行为的视频片段进行分类,而对于理解具有多人之间关系的复杂人类活动的问题还没有得到充分的解决。方法针对多人交互动作中两人肢体行为的特点,本文提出基于骨架的时空建模方法,将时空建模特征输入到广义图卷积中进行特征学习,通过谱图卷积的高阶快速切比雪夫多项式进行逼近。同时对骨架之间的交互信息进行设计,通过捕获这种额外的交互信息增加动作识别的准确性。为增强时域信息的提取,创新性地将切片循环神经网络（recurrent neural network,RNN）应用于视频动作识别,以捕获整个动作序列依赖性信息。结果本文在UT-Interaction数据集和SBU数据集上对本文算法进行评估,在UT-Interaction数据集中,与H-LSTCM（hierarchical long short-term concurrent memory）等算法进行了比较,相较于次好算法提高了0.7%,在SBU数据集中,相较于GCNConv（semi-supervised classification with graph convolutional networks）、RotClips+MTCNN（rotating cliips+multi-task convolutional neural netowrk）、SGC（simplifying graph convolutional）等算法分别提升了5.2%、1.03%、1.2%。同时也在SBU数据集中进行了融合实验,分别验证了不同连接与切片RNN的有效性。结论本文提出的融合时空图卷积的交互识别方法,对于交互类动作的识别具有较高的准确率,普遍适用于对象之间产生互动的行为识别。相似文献

9.

基于一致性图卷积模型的多模态对话情绪识别

谭晓聪郭军军线岩团相艳《计算机应用研究》2023,40(10):3100-3106

多模态对话情绪识别是一项根据对话中话语的文本、语音、图像模态预测其情绪类别的任务。针对现有研究主要关注话语上下文的多模态特征提取和融合,而没有充分考虑每个说话人情绪特征利用的问题,提出一种基于一致性图卷积网络的多模态对话情绪识别模型。该模型首先构建了多模态特征学习和融合的图卷积网络,获得每条话语的上下文特征;在此基础上,以说话人在完整对话中的平均特征为一致性约束,使模型学习到更合理的话语特征,从而提高预测情绪类别的性能。在两个基准数据集IEMOCAP和MELD上与其他基线模型进行了比较,结果表明所提模型优于其他模型。此外,还通过消融实验验证了一致性约束和模型其他组成部分的有效性。相似文献

10.

Expression recognition based on residual rectification convolution neural network

Chen Bin Zhu Jin-ning Dong Yi-zhou 《Multimedia Tools and Applications》2022,81(7):9671-9683

In order to solve the problem of low face recognition rate in controlled scene, an expression recognition algorithm based on residual rectification intensive convolutional neural network is proposed. This method takes convolutional neural network as the prototype. In the process of training model, the idea of residual network is introduced to correct the difference between the effect of test set and the effect of training set. The linear rectification operation of the residual block by the excitation function embedded in the convolution layer helps to express complex features. At the same time, the data intensive method is used to suppress the fast fitting of the deep neural network model during the training process, to improve its generalization performance on a given recognition task, and then to improve the robustness of the model learning effect. In the experiment, the method is applied to simulate the online teaching environment, and get effective facial expression recognition result in controlled scene. According to the experimental data, this method can effectively classify the facial image input under controlled conditions, and the highest accuracy is up to 91.7%. This research is helpful to the development of facial expression recognition and human-computer interaction.

相似文献

11.

骨骼信息的人体行为识别综述

下载免费PDF全文

卢健李萱峰赵博周健《中国图象图形学报》2023,28(12):3651-3669

基于骨骼信息的人体行为识别旨在从输入的包含一个或多个行为的骨骼序列中,正确地分析出行为的种类,是计算机视觉领域的研究热点之一。与基于图像的人体行为识别方法相比,基于骨骼信息的人体行为识别方法不受背景、人体外观等干扰因素的影响,具有更高的准确性、鲁棒性和计算效率。针对基于骨骼信息的人体行为识别方法的重要性和前沿性,对其进行全面和系统的总结分析具有十分重要的意义。本文首先回顾了9个广泛应用的骨骼行为识别数据集,按照数据收集视角的差异将它们分为单视角数据集和多视角数据集,并着重探讨了不同数据集的特点和用法。其次,根据算法所使用的基础网络,将基于骨骼信息的行为识别方法分为基于手工制作特征的方法、基于循环神经网络的方法、基于卷积神经网络的方法、基于图卷积网络的方法以及基于Transformer的方法,重点阐述分析了这些方法的原理及优缺点。其中,图卷积方法因其强大的空间关系捕捉能力而成为目前应用最为广泛的方法。采用了全新的归纳方法,对图卷积方法进行了全面综述,旨在为研究人员提供更多的思路和方法。最后,从8个方面总结现有方法存在的问题,并针对性地提出工作展望。相似文献

12.

基于深度时空残差卷积神经网络的课堂教学视频中多人课堂行为识别

黄勇康梁美玉王笑笑陈徵曹晓雯《计算机应用》2022,42(3):736-742

针对课堂教学场景遮挡严重、学生众多,以及目前的视频行为识别算法并不适用于课堂教学场景,且尚无学生课堂行为的公开数据集的问题,构建了课堂教学视频库以及学生课堂行为库,提出了基于深度时空残差卷积神经网络的课堂教学视频中实时多人学生课堂行为识别算法.首先,结合实时目标检测和跟踪,得到每个学生的实时图片流;接着,利用深度时空残... 相似文献

13.

时空双仿射微分不变量及骨架动作识别

下载免费PDF全文

李琪墨瀚林赵婧涵郝宏翔李华《中国图象图形学报》2021,26(12):2879-2891

目的人体骨架的动态变化对于动作识别具有重要意义。从关节轨迹的角度出发,部分对动作类别判定具有价值的关节轨迹传达了最重要的信息。在同一动作的每次尝试中,相应关节的轨迹一般具有相似的基本形状,但其具体形式会受到一定的畸变影响。基于对畸变因素的分析,将人体运动中关节轨迹的常见变换建模为时空双仿射变换。方法首先用一个统一的表达式以内外变换的形式将时空双仿射变换进行描述。基于变换前后轨迹曲线的微分关系推导设计了双仿射微分不变量,用于描述关节轨迹的局部属性。基于微分不变量和关节坐标在数据结构上的同构特点,提出了一种通道增强方法,使用微分不变量将输入数据沿通道维度扩展后,输入神经网络进行训练与评估,用于提高神经网络的泛化能力。结果实验在两个大型动作识别数据集NTU（Nanyang Technological University）RGB+D（NTU 60）和NTU RGB+D 120（NTU 120）上与若干最新方法及两种基线方法进行比较,在两种实验设置（跨参与者识别与跨视角识别）中均取得了明显的改进结果。相比于使用原始数据的时空图神经卷积网络（spatio-temporal graph convolutional networks,ST-GCN）,在NTU 60数据集中,跨参与者与跨视角的识别准确率分别提高了1.9%和3.0%;在NTU 120数据集中,跨参与者与跨环境的识别准确率分别提高了5.6%和4.5%。同时对比于数据增强,基于不变特征的通道增强方法在两种实验设置下都能有明显改善,更为有效地提升了网络的泛化能力。结论本文提出的不变特征与通道增强,直观有效地综合了传统特征和深度学习的优点,有效提高了骨架动作识别的准确性,改善了神经网络的泛化能力。相似文献

14.

Multi-cue based 3D residual network for action recognition

Zong Ming Wang Ruili Chen Zhe Wang Maoli Wang Xun Potgieter Johan 《Neural computing & applications》2021,33(10):5167-5181

Neural Computing and Applications - Convolutional neural network (CNN) is a natural structure for video modelling that has been successfully applied in the field of action recognition. The existing... 相似文献

15.

基于残差空洞卷积神经网络的网络安全实体识别方法

下载免费PDF全文

谢博申国伟郭春周燕于淼《网络与信息安全学报》2020,6(5):126-138

近年来,网络安全威胁日益增多,数据驱动的安全智能分析成为网络安全领域研究的热点。特别是以知识图谱为代表的人工智能技术可为多源异构威胁情报数据中的复杂网络攻击检测和未知网络攻击检测提供支撑。网络安全实体识别是威胁情报知识图谱构建的基础。开放网络文本数据中的安全实体构成非常复杂,导致传统的深度学习方法难以准确识别。在BERT（pre-training of deep bidirectional transformers）预训练语言模型的基础上,提出一种基于残差空洞卷积神经网络和条件随机场的网络安全实体识别模型 BERT-RDCNN-CRF。通过BERT模型训练字符级特征向量表示,结合残差卷积与空洞神经网络模型有效提取安全实体的重要特征,最后通过CRF获得每一个字符的BIO标注。在所构建的大规模网络安全实体标注数据集上的实验表明,所提方法取得了比LSTM-CRF模型、BiLSTM-CRF模型和传统的实体识别模型更好的效果。相似文献

16.

Human action recognition based on enhanced data guidance and key node spatial temporal graph convolution

Zhang Chengyu Liang Jiuzhen Li Xing Xia Yunfei Di Lan Hou Zhenjie Huan Zhan 《Multimedia Tools and Applications》2022,81(6):8349-8366

Multimedia Tools and Applications - Graph convolutional networks have achieved remarkable performance in action recognition from skeleton videos. However, most of the existing GCN-based methods... 相似文献

17.

基于对比学习的全局增强动态异质图神经网络

焦鹏飞刘欢吕乐高梦州张纪林刘栋《计算机研究与发展》2023,589(8):1808-1821

图神经网络由于其对图结构数据的强大表征能力近年来受到广泛关注. 现有图神经网络方法主要建模静态同质图数据,然而现实世界复杂系统往往包含多类型动态演化的实体及关系,此类复杂系统更适合建模为动态异质图. 目前,动态异质图表示学习方法主要集中于半监督学习范式,其存在监督信息昂贵和泛化性较差等问题. 针对以上问题,提出了一种基于对比学习的全局增强动态异质图神经网络. 具体地,所提网络首先通过异质层次化注意力机制根据历史信息来生成未来的邻近性保持的节点表示,然后通过对比学习最大化局部节点表示和全局图表示的互信息来丰富节点表示中的全局语义信息. 实验结果表明,提出的自监督动态异质图表示学习方法在多个真实世界数据集的链路预测任务上的AUC指标平均提升了3.95%.

相似文献

18.

利用初始残差和解耦操作的自适应深层图卷积

张继杰杨艳刘勇《计算机应用》2022,42(1):9-15

传统的图卷积网络(GCN)及其很多变体都是在浅层时达到最佳的效果,而没有充分利用图中节点的高阶邻居信息.随后产生的深层图卷积模型可以解决以上问题却又不可避免地产生了过平滑的问题,导致模型无法有效区分图中不同类别的节点.针对此问题,提出了一种利用初始残差和解耦操作的自适应深层图卷积模型ID-AGCN.首先,对节点的表示转... 相似文献

19.

基于增强型图卷积的骨架识别模型

兰红何璠张蒲芬《计算机应用研究》2021,38(12):3791-3795,3825

针对现有骨架动作识别主要采用双流框架,在提取时间空间以及通道特征方法上存在的问题,提出一个ADGCN,用于骨架动作识别.首先对骨架数据进行建模,分别将关节、骨骼及其关节和骨骼的运动信息输入到多流框架的单个流.然后将输入的数据传送到提出的有向图卷积网络中进行提取关节和骨骼之间的依赖关系,再利用提出的时空通道注意力网络(STCN),增强每层网络中关键关节的时间、空间以及通道的信息.最后将四个流的信息通过加权平均计算动作识别的精度,输出动作的预测结果.此模型在两个大型数据集NTU-RGB+D和Kinectics-Skeleton中进行训练和验证,验证的结果与基线方法DGNN(有向图神经网络)相比,在NTU-RGB+D数据集上,在两个交叉子集CS和CV上的准确率分别提升了2.43％和1.2％.在Kinectics-Skeleton数据集的top1和top5上的准确率分别提升了0.7％和0.9％.提出的ADGCN可以有效地增强骨架动作识别的性能,在两个大型数据集上的效果都有所提升. 相似文献

20.

ResLNet: deep residual LSTM network with longer input for action recognition

Tian WANG Jiakun LI Huai-Ning WU Ce LI Hichem SNOUSSI Yang WU 《Frontiers of Computer Science》2022,16(6):166334

Action recognition is an important research topic in video analysis that remains very challenging. Effective recognition relies on learning a good representation of both spatial information (for appearance) and temporal information (for motion). These two kinds of information are highly correlated but have quite different properties, leading to unsatisfying results of both connecting independent models (e.g., CNN-LSTM) and direct unbiased co-modeling (e.g., 3DCNN). Besides, a long-lasting tradition on this task with deep learning models is to just use 8 or 16 consecutive frames as input, making it hard to extract discriminative motion features. In this work, we propose a novel network structure called ResLNet (Deep Residual LSTM network), which can take longer inputs (e.g., of 64 frames) and have convolutions collaborate with LSTM more effectively under the residual structure to learn better spatial-temporal representations than ever without the cost of extra computations with the proposed embedded variable stride convolution. The superiority of this proposal and its ablation study are shown on the three most popular benchmark datasets: Kinetics, HMDB51, and UCF101. The proposed network could be adopted for various features, such as RGB and optical flow. Due to the limitation of the computation power of our experiment equipment and the real-time requirement, the proposed network is tested on the RGB only and shows great performance. 相似文献