Similar documents
 20 similar documents found (search time: 0 ms)
1.
In this paper, we utilize a line-based pose representation to recognize human actions in videos. We represent the pose in each frame by a collection of line-pairs, so that limb and joint movements are better described and the geometrical relationships among the lines forming the human figure are captured. We contribute to the literature by proposing a new method that matches the line-pairs of two poses to compute the similarity between them. Moreover, to encapsulate the global motion information of a pose sequence, we introduce line-flow histograms, which are extracted by matching line segments in consecutive frames. Experimental results on the Weizmann and KTH datasets emphasize the power of our pose representation, and show the effectiveness of using pose ordering and line-flow histograms together in grasping the nature of an action and distinguishing one action from the others.

2.
Human actions can be considered as a sequence of body poses over time, usually represented by coordinates corresponding to human skeleton models. Recently, a variety of low-cost devices have been released, able to produce markerless real time pose estimation. Nevertheless, limitations of the incorporated RGB-D sensors can produce inaccuracies, necessitating the utilization of alternative representation and classification schemes in order to boost performance. In this context, we propose a method for action recognition where skeletal data are initially processed in order to obtain robust and invariant pose representations and then vectors of dissimilarities to a set of prototype actions are computed. The task of recognition is performed in the dissimilarity space using sparse representation. A new publicly available dataset is introduced in this paper, created for evaluation purposes. The proposed method was also evaluated on other public datasets, and the results are compared to those of similar methods.

3.
Much of the existing work on action recognition combines simple features with complex classifiers or models to represent an action. Parameters of such models usually do not have any physical meaning nor do they provide any qualitative insight relating the action to the actual motion of the body or its parts. In this paper, we propose a new representation of human actions called sequence of the most informative joints (SMIJ), which is extremely easy to interpret. At each time instant, we automatically select a few skeletal joints that are deemed to be the most informative for performing the current action based on highly interpretable measures such as the mean or variance of joint angle trajectories. We then represent the action as a sequence of these most informative joints. Experiments on multiple databases show that the SMIJ representation is discriminative for human action recognition and performs better than several state-of-the-art algorithms.
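
The joint selection described in this abstract can be illustrated with a minimal sketch. It assumes the informativeness measure is the variance of each joint-angle trajectory within a fixed-length time segment; the function names, the segment length and the joint names are hypothetical and not taken from the paper.

```python
from statistics import pvariance


def most_informative_joints(angles, k):
    """Return the k joints whose angle series has the largest variance.

    angles: dict mapping a joint name to a list of joint-angle samples
    from one time segment. Ties are broken alphabetically.
    """
    scores = {joint: pvariance(series) for joint, series in angles.items()}
    return sorted(scores, key=lambda joint: (-scores[joint], joint))[:k]


def smij_sequence(angles, segment_len, k):
    """Split the trajectories into segments and rank joints in each one."""
    n = min(len(series) for series in angles.values())
    sequence = []
    for start in range(0, n - segment_len + 1, segment_len):
        window = {j: s[start:start + segment_len] for j, s in angles.items()}
        sequence.append(most_informative_joints(window, k))
    return sequence


# Hypothetical joint-angle trajectories (degrees), two segments of 4 samples.
example = {
    "elbow": [0, 10, 0, 10, 5, 5, 5, 5],
    "knee": [1, 1, 1, 1, 0, 20, 0, 20],
    "hip": [2, 2, 2, 2, 2, 2, 2, 2],
}
print(smij_sequence(example, segment_len=4, k=1))
```

The resulting list of joint names per segment is the kind of interpretable sequence the abstract refers to; how such sequences are then compared or classified is not specified here.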

4.
5.
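A human action can be described as a combination of the action semantics of different local body regions. On this basis, a human action recognition method based on local semantics is proposed. First, the method defines a set of local action semantics that describe the visual appearance of motion in local body regions, and models each local semantic. Then, an action representation is built by combining the contribution values of these local action semantics. Finally, this representation is fed to a support vector machine to build action models and classify actions. Comparative experiments show that the proposed method recognizes human actions in realistic scenes well.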

6.
Constructing the bag-of-features model from space–time interest points (STIPs) has been successfully utilized for human action recognition. However, how to eliminate the large number of STIPs that are irrelevant for representing a specific action in realistic scenarios, and how to select discriminative codewords for an effective bag-of-features model, still need further investigation. In this paper, we propose to select more representative codewords based on our pruned interest points algorithm so as to reduce computational cost as well as improve recognition performance. By taking human perception into account, an attention-based saliency map is employed to choose salient interest points which fall into salient regions, since visual saliency can provide strong evidence for the location of acting subjects. After salient interest points are identified, each human action is represented with the bag-of-features model. In order to obtain more discriminative codewords, an unsupervised codeword selection algorithm is utilized. Finally, the Support Vector Machine (SVM) method is employed to perform human action recognition. Comprehensive experimental results on the widely used and challenging Hollywood-2 Human Action (HOHA-2) dataset and YouTube dataset demonstrate that our proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.

7.
Due to the exponential growth of the video data stored on and uploaded to Internet websites, especially YouTube, effective analysis of video actions has become necessary. In this paper, we tackle the challenging problem of human action recognition in realistic video sequences. The proposed system combines the efficiency of the Bag-of-visual-Words strategy and the power of graphs for structural representation of features. It is built upon the commonly used Space–Time Interest Points (STIP) local features followed by a graph-based video representation which models the spatio-temporal relations among these features. The experiments are carried out on two challenging datasets: Hollywood2 and UCF YouTube Action. The experimental results show the effectiveness of the proposed method.

8.
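With the massive growth of video data, a single feature is no longer adequate in human action recognition for recognizing complex actions in complex environments. To address this, a method is proposed that uses multiple-instance learning to fuse several action features for human action recognition. Borrowing the concept of a bag from conventional multiple-instance learning, the different feature representations of one sample are treated as instances in the same bag. All bags of one action class are taken as positive bags and the bags of the other classes as negative bags, and a model is learned for classification. Tests on commonly used databases gave good results.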

9.
In this paper, a new classification method called locality-sensitive kernel sparse representation classification (LS-KSRC) is proposed for face recognition. LS-KSRC integrates both sparsity and data locality in the kernel feature space rather than in the original feature space. LS-KSRC can learn more discriminating sparse representation coefficients for face recognition. The closed-form solution of the l1-norm minimization problem for LS-KSRC is also presented. LS-KSRC is compared with kernel sparse representation classification (KSRC), sparse representation classification (SRC), locality-constrained linear coding (LLC), support vector machines (SVM), the nearest neighbor (NN), and the nearest subspace (NS). Experimental results on three benchmark face databases, i.e., the ORL database, the Extended Yale B database, and the CMU PIE database, demonstrate the promising performance of the proposed method for face recognition, which outperforms the other methods compared.

10.
Gabor wavelet representation for 3-D object recognition   Cited by: 8 (self-citations: 0, citations by others: 8)
This paper presents a model-based object recognition approach that uses a Gabor wavelet representation. The key idea is to use magnitude, phase, and frequency measures of the Gabor wavelet representation in an innovative flexible matching approach that can provide robust recognition. The Gabor grid, a topology-preserving map, efficiently encodes both signal energy and structural information of an object in a sparse multiresolution representation. The Gabor grid subsamples the Gabor wavelet decomposition of an object model and is deformed to allow the indexed object model to match a similar representation obtained from image data. Flexible matching between the model and the image minimizes a cost function based on local similarity and geometric distortion of the Gabor grid. Grid erosion and repair are performed whenever a collapsed grid, due to object occlusion, is detected. Results on infrared imagery are presented, where objects undergo rotation, translation, scale, occlusion, and aspect variations under changing environmental conditions.

11.
In this paper, we learn explicit representations for dynamic shape manifolds of moving humans for the task of action recognition. We exploit locality preserving projections (LPP) for dimensionality reduction, leading to a low-dimensional embedding of human movements. Given a sequence of moving silhouettes associated with an action video, we use LPP to project them into a low-dimensional space to characterize the spatiotemporal property of the action, as well as to preserve much of the geometric structure. To match the embedded action trajectories, the median Hausdorff distance or normalized spatiotemporal correlation is used as the similarity measure. Action classification is then achieved in a nearest-neighbor framework. To evaluate the proposed method, extensive experiments have been carried out on a recent dataset including ten actions performed by nine different subjects. The experimental results show that the proposed method is able to not only recognize human actions effectively, but also considerably tolerate some challenging conditions, e.g., partial occlusion, low-quality videos, changes in viewpoints, scales, and clothes; within-class variations caused by different subjects with different physical build; styles of motion; etc.
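
The trajectory matching step in this abstract can be sketched as follows. The sketch assumes each embedded action is a list of low-dimensional points, and that the symmetric median Hausdorff distance is the larger of the two directed distances; the abstract does not state which symmetric form is used, so that choice, the function names and the toy data are all assumptions.

```python
import math
from statistics import median


def directed_median_hausdorff(a, b):
    """Median over points of a of the distance to the nearest point of b."""
    return median(min(math.dist(p, q) for q in b) for p in a)


def median_hausdorff(a, b):
    """Symmetric form (assumed): the larger of the two directed distances."""
    return max(directed_median_hausdorff(a, b),
               directed_median_hausdorff(b, a))


def classify(query, gallery):
    """Nearest-neighbor label; gallery is a list of (label, trajectory)."""
    return min(gallery, key=lambda item: median_hausdorff(query, item[1]))[0]


# Hypothetical 2-D embedded trajectories.
walk = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
jump = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
query = [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)]
print(classify(query, [("walk", walk), ("jump", jump)]))
```

Using the median rather than the maximum makes the distance less sensitive to a few outlying points, which fits the tolerance to occlusion and low-quality video that the abstract reports.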

12.
Human action recognition typically requires a large number of training samples, which are often expensive and time-consuming to create. In this paper, we present a novel approach for enhancing human action recognition with a limited number of samples via structural average curves analysis. Our approach first learns average sequences from each pair of video samples for every action class and then gathers them together with the original video samples to form a new training set. Action modeling and recognition are then performed with the resulting new set. Our technique was evaluated on four benchmark datasets. Our classification results are superior to those obtained with the original training sets, which suggests that the proposed method can potentially be integrated with other approaches to further improve their recognition performance.

13.
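Action recognition is an important research direction in computer vision and has been widely applied in video surveillance, crowd analysis, human-computer interaction, virtual reality, and other fields. Spatio-temporal modeling is a key part of video action recognition, and modeling it effectively can greatly improve recognition accuracy. Existing state-of-the-art algorithms use 3D CNNs to learn powerful spatio-temporal representations, but they are computationally complex, which makes deployment expensive. Improved 2D CNN algorithms with a temporal shift operation have also been used for spatio-temporal modeling; they shift a portion of the feature channels along the temporal dimension for efficient temporal modeling. However, the temporal shift operation does not allow spatio-temporal features to be re-weighted adaptively. Previous work has not considered combining the two approaches so that each offsets the other's weaknesses and spatio-temporal features are modeled better. This paper proposes a collaborative network that effectively combines a 3D CNN with a temporal shift module in 2D convolutional form. In particular, a new Collaborative Spatial-temporal Module (CSTM) with an embedded attention mechanism is proposed to learn spatio-temporal features effectively. The algorithm is validated on temporally related datasets (Something-Something v1 and v2, Jester) and achieves competitive performance.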
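
The temporal shift operation mentioned in this abstract can be illustrated with a small sketch. It assumes a common formulation in which one group of channels takes its values from the next frame, a second group from the previous frame, and the remaining channels stay in place, with zero padding at the sequence boundaries; the function name and the `fold` parameter are hypothetical, and this is not the CSTM module itself.

```python
def temporal_shift(frames, fold):
    """Shift 2 * fold channels along the time axis.

    frames[t][c] is the feature of channel c at time step t.
    Channels [0, fold) are filled from the next frame, channels
    [fold, 2 * fold) from the previous frame; the rest are unchanged.
    Assumes 2 * fold does not exceed the number of channels.
    """
    steps = len(frames)
    shifted = [list(frame) for frame in frames]
    for t in range(steps):
        for c in range(fold):
            shifted[t][c] = frames[t + 1][c] if t + 1 < steps else 0.0
        for c in range(fold, 2 * fold):
            shifted[t][c] = frames[t - 1][c] if t >= 1 else 0.0
    return shifted


# Three time steps, three channels (hypothetical values).
clip = [[1, 10, 100], [2, 20, 200], [3, 30, 300]]
print(temporal_shift(clip, fold=1))
```

Because the shift only moves existing values, it adds no parameters, which is also why it cannot re-weight features adaptively, the limitation the abstract points out.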

14.
Multidimensional Systems and Signal Processing - Face recognition is an important topic in the field of computer vision and has been a vital biometric technique for identity authentication. It is...

15.
In this paper, we propose Learned Local Gabor Patterns (LLGP) for face representation and recognition. The proposed method is based on Gabor features and the concept of the texton, and defines the feature cliques which appear frequently in Gabor features as the basic patterns. Unlike Local Binary Patterns (LBP), whose patterns are predefined, the local patterns in our approach are learned from the patch set, which is constructed by sampling patches from Gabor-filtered face images. Thus, the patterns in our approach are face-specific and desirable for face perception tasks. Based on these learned patterns, each facial image is converted into multiple pattern maps and the block-based histograms of these patterns are concatenated to form the representation of the face image. In addition, we propose an effective weighting strategy to enhance performance, which makes use of the discriminative powers of different facial parts as well as different patterns. The proposed approach is evaluated on two face databases: FERET and CAS-PEAL-R1. Extensive experimental results and comparisons with existing methods show the effectiveness of the LLGP representation method and the weighting strategy. In particular, heterogeneous testing results show that the LLGP codebook generalizes very well to unseen data.

16.
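《信息技术》 (Information Technology), 2018, (3): 27-33
Because human actions are inherently fuzzy, a human action recognition method based on acceleration signals and the principle of fuzzy comprehensive evaluation is proposed. Samples are divided into two modes, intense actions and mild actions, according to their acceleration signal vector magnitude (SVM). The final fuzzy comprehensive evaluation result is then synthesized through the following steps: collecting and preprocessing three-dimensional acceleration data from the waist, building the evaluation index system and standard matrix, designing membership functions and computing membership degrees, building the fuzzy relation matrix, and determining the weight vector of the evaluation factors. Three common classifiers, support vector machine (SVM), decision tree, and k-nearest neighbors (KNN), are applied to the frequency-domain and time-domain features of the sample data and compared with the proposed method. The experimental results show that the method has low computational cost, is systematic, and achieves high recognition accuracy, and that it is suitable for recognizing actions of different intensity in a variety of environments.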
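
Two of the steps named in this abstract can be sketched in a few lines: the signal vector magnitude of a three-axis acceleration sample, and the final synthesis of a fuzzy comprehensive evaluation from a weight vector and a fuzzy relation matrix. The sketch assumes the weighted-average composition operator; the abstract does not say which operator, membership functions or class labels the authors use, so every name and number below is hypothetical.

```python
import math


def signal_vector_magnitude(ax, ay, az):
    """Magnitude of one three-axis acceleration sample."""
    return math.sqrt(ax * ax + ay * ay + az * az)


def fuzzy_evaluate(weights, relation):
    """Compose factor weights with the fuzzy relation matrix.

    weights[i] is the weight of evaluation factor i (summing to 1).
    relation[i][j] is the membership of factor i in action class j.
    Uses the weighted-average operator (an assumption).
    """
    n_classes = len(relation[0])
    return [sum(w * row[j] for w, row in zip(weights, relation))
            for j in range(n_classes)]


def decide(weights, relation, labels):
    """Pick the class with the largest evaluation score."""
    scores = fuzzy_evaluate(weights, relation)
    return labels[scores.index(max(scores))]


# Hypothetical example: two factors, two candidate actions.
weights = [0.5, 0.5]
relation = [[0.8, 0.2],
            [0.4, 0.6]]
print(signal_vector_magnitude(3.0, 4.0, 0.0))
print(decide(weights, relation, ["walk", "sit"]))
```

In the described method, a threshold on the magnitude would first route a sample to the intense or mild mode before the evaluation is run; the threshold value is not given in the abstract.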

17.
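《现代电子技术》 (Modern Electronics Technique), 2019, (4): 80-84
To address the weak robustness of face and ear fusion recognition algorithms to changes in illumination, expression, and shooting angle, kernel sparse representation theory is introduced into face and ear fusion recognition, and a fusion recognition algorithm based on kernel sparse representation is proposed. The new algorithm uses PCA feature extraction, which effectively reduces sample dimensionality. Face and ear features are fused at the feature level, which both compresses redundant information effectively and makes the most of the discriminability of biometric features from different modalities. Because different modalities may contribute differently to the final recognition, the algorithm uses weighted serial (concatenation) fusion, and the sparse representation coefficients of a test sample over the training samples are solved with orthogonal matching pursuit, which iterates relatively quickly. Compared with other recognition algorithms, the algorithm achieves very good recognition performance and is highly robust to variations in face and ear images.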

18.
This paper proposes a discriminative low-rank representation (DLRR) method for face recognition in which both the training and test samples are corrupted owing to variations in occlusion and disguise. The proposed method extends the sparse representation-based classification algorithm by incorporating the low-rank structure of data representation. The DLRR algorithm recovers a clean dictionary with enhanced discrimination ability from the corrupted training samples for sparse representation. Simultaneously, it learns a low-rank projection matrix to correct corrupted test samples by projecting them onto their corresponding underlying subspaces. The dictionary elements from different classes are encouraged to be as independent as possible by regularizing the structural incoherence of the original training samples. This leads to a compact representation of a corrected test sample by a linear combination of more dictionary elements from the correct class. The experimental results on benchmark databases show the effectiveness and robustness of our face recognition technique.

19.
In this paper, we propose a novel approach to key frame extraction for human action recognition from 3D video sequences. To represent human actions, an Energy Feature (EF), combining kinetic energy and potential energy, is extracted from 3D video sequences. A Self-adaptive Weighted Affinity Propagation (SWAP) algorithm is then proposed to extract the key frames. Finally, we employ an SVM to recognize human actions from the EFs of the selected key frames. The experiments show that information covering the whole course of an action can be effectively extracted by our method, and that good recognition performance is obtained without losing classification accuracy. Moreover, the recognition speed is greatly improved.

20.
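Traditional face recognition methods based on sparse representation rely on holistic facial features. Such methods require sufficiently many face images of each subject, and their features are high-dimensional and costly to compute. To address this, a face recognition method based on the discrete cosine transform and sparse representation is proposed. Face images are sampled in blocks, the discrete cosine transform and sparse decomposition are applied to the sampled blocks, a bag-of-words-like method then produces a feature vector for the whole image, and classification is finally performed by similarity comparison. Experiments show that the proposed method outperforms traditional sparse-representation-based face recognition when training samples are few.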
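
The block transform step in this abstract can be sketched with an orthonormal two-dimensional DCT-II applied to one square image block. This is the standard textbook transform rather than the authors' code; the block size, the function name and the sample values are assumptions, and the sparse decomposition and bag-of-words steps are not shown.

```python
import math


def dct2_block(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(u):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

    def basis(x, u):
        return math.cos((2 * x + 1) * u * math.pi / (2 * n))

    return [[alpha(u) * alpha(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]


# A constant 2x2 block: all energy should land in the DC coefficient.
flat = [[3, 3], [3, 3]]
coeffs = dct2_block(flat)
print(coeffs)
```

For smooth image blocks most of the energy concentrates in a few low-frequency coefficients, which is what makes DCT coefficients a compact input for the later sparse decomposition.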
