Similar literature
 Found 20 similar documents; search took 437 ms
1.
Human pose estimation aims to predict the poses of human body parts in images or videos. Since pose motions are often driven by specific human actions, knowing a person's body pose is critical for action recognition. This survey focuses on recent progress in human pose estimation and its application to action recognition. We attempt to provide a comprehensive review of recent bottom-up and top-down deep human pose estimation models, as well as how pose estimation systems can be used for action recognition. Thanks to the availability of commodity depth sensors such as Kinect and their capability for skeletal tracking, there is a large body of literature on 3D skeleton-based action recognition, and survey papers such as [1] already cover this topic. In this survey, we focus on 2D skeleton-based action recognition, where human poses are estimated from regular RGB images instead of depth images. We summarize the performance of recent action recognition methods that use poses estimated from color images as input, and then show that there is much room for improvement in this direction.

2.
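To address face pose estimation under variations in illumination, occlusion, identity, and expression, this work builds on the strong recognition performance of sparse representation classification (SRC), analyzes SRC theory in depth, and applies it to face pose classification. To handle illumination, noise, and occlusion variations in pose estimation, face poses are discretized into different subspaces, each corresponding to one class; on this basis, a face pose recognition method based on dictionary learning and sparsity constraints is proposed. Experiments on the public XJTU and PIE face databases show that the method is robust to illumination, noise, and occlusion variations.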

3.
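To address the problem that noisy data from popular depth sensors degrades recognition robustness in online pose recognition, a method that fuses pose kernel learning with decision forests is proposed. First, each pose is represented by skeleton joint angles; then, a multi-class SVM classifier is used to obtain the pose kernels; finally, a decision forest labels key pose sequences in real time, and recognition is completed from the key pose sequence. Experimental results show that the recognition rate reaches up to 99.3%; compared with several relatively advanced recognition methods, the proposed method is more robust and reduces recognition time to some extent.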
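
The first step, representing a pose by skeleton joint angles, can be sketched as follows. This is a minimal illustration only: the abstract does not say which joints or angle definition the authors use, so the function names, the joint names, and the choice of the angle between two adjacent bones are all assumptions.

```python
import math


def joint_angle(parent, joint, child):
    """Angle in degrees at `joint` between the bones joint->parent and joint->child."""
    u = [p - j for p, j in zip(parent, joint)]
    v = [c - j for c, j in zip(child, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("degenerate bone: two joints coincide")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def pose_descriptor(skeleton, triples):
    """Build a pose feature vector: one angle per (parent, joint, child) triple."""
    return [joint_angle(skeleton[a], skeleton[b], skeleton[c]) for a, b, c in triples]


# Hypothetical three-joint arm with a right angle at the elbow.
skeleton = {
    "shoulder": (0.0, 0.0, 0.0),
    "elbow": (1.0, 0.0, 0.0),
    "wrist": (1.0, 1.0, 0.0),
}
angles = pose_descriptor(skeleton, [("shoulder", "elbow", "wrist")])
```

Angles of this kind do not change when the whole skeleton is translated or rotated, which is one common reason to prefer them over raw joint coordinates when sensor data are noisy.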

4.
Human action recognition in videos remains an important yet challenging task. Existing methods based on RGB images or optical flow are easily affected by clutter and ambiguous backgrounds. In this paper, we propose a novel Pose-Guided Inflated 3D ConvNet framework (PI3D) to address this issue. First, we design a spatial–temporal pose module, which provides essential clues for the Inflated 3D ConvNet (I3D). The pose module consists of pose estimation and pose-based action recognition. Second, for the multi-person estimation task, the introduced pose estimation network can determine the action most relevant to the action category. Third, we propose a hierarchical pose-based network to learn the spatial–temporal features of human pose. Moreover, the pose-based network and the I3D network are fused at the last convolutional layer without loss of performance. Finally, experimental results on four datasets (HMDB-51, SYSU 3D, JHMDB and Sub-JHMDB) demonstrate that the proposed PI3D framework outperforms existing methods on human action recognition. This work also shows that posture cues significantly improve the performance of I3D.

5.
6.
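SAR image target recognition method based on non-negative sparse representation   Total citations: 1, self-citations: 0, citations by others: 1
For synthetic aperture radar (SAR) image target recognition in the presence of object occlusion, this paper proposes a classification method based on non-negative sparse representation. By analyzing the difference between L0-norm and L1-norm minimization in solving the non-negative sparse representation problem, it is shown that, under certain conditions, L1-norm minimization not only preserves the sparsity of the solution but also yields a set of atoms more similar to the input signal, and is therefore better suited to classification. Recognition experiments on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset show that non-negative sparse representation classification with the L1 norm achieves good recognition performance and is more robust than traditional methods under occlusion.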
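
The idea of classifying with an L1-regularized non-negative sparse code can be sketched as below. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not name a solver, so the sketch swaps in plain projected gradient descent on the Lagrangian form of the problem, and the function names, the toy dictionary, and the parameters `lam` and `iters` are hypothetical.

```python
def nonneg_sparse_code(atoms, y, lam=0.01, iters=500):
    """Approximately minimise 0.5*||y - sum_j x_j*atoms[j]||^2 + lam*sum(x), x >= 0.

    For x >= 0 the L1 norm equals sum(x), so its gradient is the constant lam.
    Projected gradient descent is an assumed solver, not the paper's.
    """
    dim = len(y)
    # Step 1/L, where L is bounded by the squared Frobenius norm of the dictionary.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    x = [0.0] * len(atoms)
    for _ in range(iters):
        recon = [sum(xj * atom[i] for xj, atom in zip(x, atoms)) for i in range(dim)]
        resid = [r - t for r, t in zip(recon, y)]
        grads = [sum(a * r for a, r in zip(atom, resid)) + lam for atom in atoms]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, xj - step * g) for xj, g in zip(x, grads)]
    return x


def classify(atoms, labels, y, lam=0.01):
    """Return the label whose atoms alone reconstruct y with the smallest error."""
    x = nonneg_sparse_code(atoms, y, lam)
    errors = {}
    for label in set(labels):
        recon = [0.0] * len(y)
        for xj, atom, lab in zip(x, atoms, labels):
            if lab == label:
                recon = [r + xj * a for r, a in zip(recon, atom)]
        errors[label] = sum((r - t) ** 2 for r, t in zip(recon, y))
    return min(errors, key=errors.get), x


# Toy dictionary: two atoms per class (hypothetical values).
atoms = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = [0, 0, 1, 1]
label, coeffs = classify(atoms, labels, [0.95, 0.05, 0.0])
```

The non-negativity constraint means atoms can only add to the reconstruction, never cancel one another, which fits the abstract's claim that the selected atoms resemble the input signal.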

7.
Based on the observation that a human action can be intuitively regarded as a sequence of key poses and atomic motions in a particular order, this paper proposes a human action recognition method using multi-layer codebooks of key poses and atomic motions. Inspired by the dynamics models of human joints, normalized relative orientations are computed as features for each limb of the human body. To extract key poses and atomic motions precisely, feature sequences are dynamically segmented into pose feature segments and motion feature segments, based on the potential differences of the feature sequences. Multi-layer codebooks for each human action are constructed from the key poses extracted from pose feature segments and the atomic motions extracted from the motion feature segments between each pair of key poses. The multi-layer codebooks represent the action patterns of each human action and can be used to recognize human actions with the proposed pattern-matching method. Three classification methods are employed for action recognition based on the multi-layer codebooks. Two public action datasets, CAD-60 and MSRC-12, are used to demonstrate the advantages of the proposed method. The experimental results show that the proposed method achieves performance comparable to or better than state-of-the-art methods.

8.
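Because face image data are high-dimensional, applying sparse representation classification to face recognition is computationally expensive. To improve the efficiency of face recognition systems, a face recognition method that combines semi-supervised dimensionality reduction with sparse representation is proposed. First, a semi-supervised dimensionality reduction algorithm reduces the image dimensionality, quickly achieving a high recognition rate in a lower-dimensional space; then, sparse representation classification is used for face recognition, achieving a higher recognition rate than the traditional nearest-neighbor classifier; finally, experiments are conducted on the ORL face database. The results show that the combined algorithm improves face recognition quickly and effectively.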

9.
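Dimensionality reduction is an effective means of handling high-dimensional data. Subspace learning algorithms are widely used in pattern recognition and related fields because of their low computational cost and good performance, and traditional subspace learning algorithms can all be cast in the graph embedding framework. Sparse representation has been a research hotspot in recent years and is widely used in signal processing and pattern recognition, but its computational complexity is high. Building on sparse representation, researchers proposed collaborative representation. Compared with sparse representation, collaborative representation has a closed-form solution, so it is computationally cheaper while offering good discriminative performance, and it can be regarded as an effective data representation method. This paper interprets graph embedding from the perspective of collaborative representation, treating graph embedding algorithms as a class of regression models. By minimizing the within-class reconstruction error scatter while maximizing the between-class reconstruction error scatter, a new graph embedding algorithm, reconstructive discriminant analysis, is proposed and applied to this regression model; the problem then reduces to a generalized eigenvalue problem. To some extent, the algorithm effectively avoids the matrix singularity problem in subspace learning. Experiments on face recognition verify the correctness and effectiveness of the algorithm.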
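
The closed-form solution mentioned above is usually written as the regularized least-squares problem below. This is the standard collaborative representation formulation, given here as a sketch; the symbols $y$ (test sample), $D$ (dictionary whose columns are training samples), $x$ (coefficient vector), and $\lambda > 0$ (regularization weight) are notation assumed for illustration and do not appear in the abstract.

```latex
\hat{x} = \arg\min_{x} \; \|y - Dx\|_2^2 + \lambda \|x\|_2^2
        = \left(D^{\top} D + \lambda I\right)^{-1} D^{\top} y
```

Because $\left(D^{\top} D + \lambda I\right)^{-1} D^{\top}$ depends only on the training data, it can be computed once and reused for every test sample, which is where the lower cost relative to iterative L1 solvers comes from.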

10.
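In face recognition, face images are affected by expression, illumination, occlusion, pose variation, and especially the number of training samples, and in practice often only a few training samples are available. Since virtual samples generated from the original samples can increase the number of training samples, a collaborative representation algorithm that fuses original samples with axis-symmetric samples is proposed. First, mirror samples and axis-symmetric samples are generated; they are then classified with the collaborative representation classifier; finally, the results are fused by weighting, and the face recognition rate under different weights is analyzed. Experimental results show that fusing the original samples with mirror and axis-symmetric samples improves the recognition rate, and that fusing original with axis-symmetric samples performs better still: the recognition rate is 2% to 9% higher than with the original samples alone and 1% to 5% higher than fusing original with mirror samples. The results show that the proposed method effectively improves the face recognition rate.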
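
The virtual-sample generation and weighted fusion steps can be sketched as follows. The abstract does not define how the axis-symmetric sample is built or what exactly is fused, so this sketch assumes that the left half of the image is reflected onto the right half and that per-class residuals from two classifiers are combined with a single weight `w`; all function names and numbers are hypothetical.

```python
def mirror(image):
    """Mirror sample: flip every row left to right."""
    return [row[::-1] for row in image]


def axis_symmetric(image):
    """Axis-symmetric sample: keep the left half and reflect it onto the right half.

    This construction is an assumption; the abstract does not spell it out.
    """
    out = []
    for row in image:
        half = row[: (len(row) + 1) // 2]
        out.append(half + half[: len(row) // 2][::-1])
    return out


def fuse_residuals(res_original, res_virtual, w):
    """Weighted fusion of per-class residuals; the smallest fused residual wins."""
    fused = [w * a + (1.0 - w) * b for a, b in zip(res_original, res_virtual)]
    best = min(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Hypothetical residuals for three classes from the two sample sets.
res_original = [0.40, 0.35, 0.90]
res_symmetric = [0.20, 0.50, 0.80]
predicted, fused = fuse_residuals(res_original, res_symmetric, w=0.6)
```

In this toy case the original samples alone would pick class 1, while the fused residuals pick class 0, which shows how the weight can change the decision; sweeping `w` over a range corresponds to the abstract's analysis of recognition rate under different weights.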

11.
This paper presents a dynamic representation-based tracker (DRT) to handle occlusions in long-term pedestrian tracking of a single target. In our DRT, an adaptive representation network (ARN) is first constructed to extract multiple features, including classical features such as appearance and pose as well as some vector-format deep features. These features are then stacked to form a dynamic representation, which converts target tracking into a matching problem between the target features and candidate features, where the Euclidean distance (ED) and locality-constrained linear coding (LLC) are used as measurements in the decision-making. Next, the target state is determined through a voting procedure according to the feature matching error. Finally, a pose supervised module (PSM) and an IOU filtering module (IFM) are applied, respectively, to refine the target state and to filter out invalid candidate targets that have been detected. Experimental results on public benchmark datasets show that our DRT is robust to complex environments with long-term pedestrian occlusions and outperforms several existing state-of-the-art trackers, producing the best performance on both the pedestrian tracking dataset with occlusion (PTDO) and the pedestrian tracking dataset with occlusion plus (PTDO Plus).
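
A minimal sketch of what an IOU-based candidate filter might look like is given below. The abstract does not describe how the IFM works internally, so this assumes the simplest reading: candidates whose intersection over union with the previous target box falls below a threshold are discarded. The function names and the `min_iou` threshold are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_candidates(previous_box, candidates, min_iou=0.3):
    """Keep only candidates that overlap the previous target box enough."""
    return [c for c in candidates if iou(previous_box, c) >= min_iou]


# Hypothetical previous target box and three detected candidates.
previous_box = (0.0, 0.0, 10.0, 10.0)
candidates = [(1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0), (5.0, 0.0, 15.0, 10.0)]
kept = filter_candidates(previous_box, candidates)
```

A filter like this relies on the target moving only a little between frames, so the threshold trades off rejecting distractors against losing a fast-moving target.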

12.
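To address the sensitivity of face recognition algorithms to illumination changes, a face recognition method based on illumination-robust sparse representation is proposed. The method applies a wavelet transform to the image to obtain an illumination-normalized image, applies a sparse transform to the illumination-normalized face image, and obtains the test recognition result through sparse representation classification. In simulation experiments on the Yale B face database, the method achieves a high recognition rate and shows some robustness to illumination, expression, and occlusion.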

13.
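For gesture acquisition with accelerometers, a dynamic gesture recognition method based on self-learning sparse representation is proposed. The method converts the classification problem into solving for the sparse representation of the sample to be recognized over the training samples. It operates directly on the raw acceleration signals, eliminating the feature extraction step, and gesture classes can easily be added or removed. Class-oriented dictionary learning is used to find a smaller, optimized overcomplete dictionary for computing the sparse representation of the sample to be recognized, which greatly reduces computational complexity and meets real-time requirements. Tests on a public dataset of more than 3,000 samples covering 18 gestures verify the effectiveness of the method.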

14.
Recently, various sparse representation methods have been successfully used in multi-focus image fusion. Most of them produce spatial artifacts and blurring effects because their patch-based processing considers only local image information. To reduce spatial artifacts and blurring of edge details and to improve the robustness of multi-focus image fusion, a novel fusion method based on joint convolutional analysis and synthesis (JCAS) sparse representation is presented. The JCAS model, which integrates analysis sparse representation and synthesis sparse representation through convolution, can effectively separate the large-scale structures and fine-scale textures of a single image. First, each source image is decomposed into a base layer and a detail layer using the JCAS model. Second, a Laplacian pyramid transform is used to fuse the base layers, and a weighted regional energy method is used to fuse the detail layers. Finally, the fused image is reconstructed by combining the fused base and detail layers. Experimental results demonstrate that the proposed method obtains clearer edge details than some popular multi-focus image fusion methods, exhibiting state-of-the-art performance in both visual quality and objective assessment.

15.
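Lu Chunqing (卢纯青), Yang Mengfei (杨孟飞), Wu Yanpeng (武延鹏), Liang Xiao (梁潇). 《红外与激光工程》 (Infrared and Laser Engineering), 2020, 49(1): 0113005-0113005(9)
Deep-space probes have limited power and volume and face diverse mission conditions; compared with low-Earth-orbit probes, they place higher demands on the mission capability of navigation sensors. A fast pose measurement and terrain-feature target recognition technique based on time-of-flight imaging is proposed. To meet the time-performance requirement of pose measurement while guaranteeing its accuracy, a dynamic scale estimation method based on depth information is proposed. The method improves the temporal stability of point cloud registration under multi-scale variation in object space; the average registration time is reduced by more than 60%, and the average registration accuracy is about 0.04 m. To meet the need to recognize terrain targets of multiple scales and shapes, a lightweight deep neural network is used to detect terrain features from scene depth information. The results show that the method perceives terrain features quickly, with accuracy above 70% in real scenes.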

16.
In this paper, we present a new approach for dynamic hand gesture recognition. Our goal is to integrate spatiotemporal features extracted from multimodal data captured by the Kinect sensor. When skeleton data are not provided, we apply a novel skeleton estimation method to compute temporal features. Furthermore, we introduce an effective method that extracts a fixed number of keyframes to reduce processing time. To extract pose features from RGB-D data, we use two different approaches: (1) Convolutional Neural Networks and (2) Histogram of Cumulative Magnitudes. We test different integration methods for fusing the extracted spatiotemporal features to boost recognition performance with a linear SVM classifier. Extensive experiments demonstrate the effectiveness and feasibility of the proposed framework for hand gesture recognition.

17.
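In face recognition, face images are often affected by expression, illumination, occlusion, and pose variation. To address this, this paper proposes a face recognition algorithm based on low-rank eigenfaces and collaborative representation. The algorithm first uses low-rank matrix recovery to decompose the error images from the training sample images, then extracts features from the training samples and the error images separately to build a feature dictionary, computes the collaborative representation coefficients of the test sample image over the feature dictionary, and finally classifies by reconstruction error. Experiments on the AR and ORL face databases show that the proposed algorithm effectively improves both recognition rate and recognition speed.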

18.
19.
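To overcome the limitation that the kernel sparse representation classification (KSRC) algorithm cannot capture the locality of the data, which restricts the discriminative power of the resulting sparse representation coefficients, a locality-sensitive KSRC (LS-KSRC) algorithm is proposed for face recognition. By integrating both sparsity and data locality in the kernel feature space, it obtains sparse representation coefficients with good discriminative power for classification. Experiments on the standard ORL and Extended Yale B face databases show that the method outperforms the traditional KSRC algorithm, sparse representation classification (SRC), locality-constrained linear coding (LLC), support vector machines (SVM), nearest neighbor (NN), and nearest subspace (NS), achieving superior classification performance for face recognition.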

20.
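Scene recognition is a fundamental task in computer vision research. Unlike image classification, scene recognition must jointly consider the scene's background information, local scene features, and object features, so classical convolutional neural networks perform poorly on it. To solve this problem, this paper proposes a global and local scene representation method based on deep convolutional features. The method transforms the convolutional features of a scene image to generate a comprehensive feature representation for each image. Using CAM...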
