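A key problem in human action recognition is how to represent high-dimensional human actions and build an accurate, stable classification model. This paper proposes an effective human action recognition algorithm based on hybrid features. The algorithm fuses appearance-structure-based polar-coordinate features of key body joints with optical-flow-based motion features, which captures the motion information in video sequences more effectively and improves the timeliness of recognition. The paper also proposes a frame-based selective ensemble rotation forest classification model (SERF), which incorporates a selective ensemble strategy into the selection of rotation forest base classifiers and thereby increases the diversity among base classifiers. Experiments show that the SERF model achieves high classification accuracy and strong robustness.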

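This work studies the recognition of human pose actions from still images. It first proposes a hierarchical part tree structure in which each node uses a set of Poselets to represent the pose variation of a body part; the nodes constrain one another and together form a Pictorial structure. Based on this structure, a discriminative action recognition model built on the hierarchical part tree is proposed. In addition to the deformation cost, the pairwise potential function of the Pictorial structure introduces a Poselet co-occurrence cost. Because adjacent nodes in the tree have a containment relationship, their relative position can be described by a Gaussian distribution, and inference reuses the distance transform and belief propagation algorithms to achieve efficient matching. Experiments on two datasets with three discriminative models that differ in the number of nodes left after pruning show that the coarse-grained nodes of the first two layers are strongly discriminative for action recognition, the third layer further improves recognition ability, and the atomic parts of the fourth layer have no noticeable effect on action recognition.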

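陈伟, 李杭, 李维华. 《计算机科学》 (Computer Science), 2022, 49(2): 285-291
Nucleosome positioning refers to the position of the DNA double helix relative to the histones, and it plays an important regulatory role during DNA transcription. Measuring nucleosome positioning through biological experiments consumes a great deal of time and resources, so predicting nucleosome positioning from DNA sequences with computational methods has become an important research direction. To address the shortcomings of single models and single encodings in representing and learning DNA sequence features for nucleosome positioning prediction, this paper proposes an end-to-end ensemble deep...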

Image Region Selection and Ensemble for Face Recognition (cited by: 2; self-citations: 0; citations by others: 2)
In this paper, a novel framework for face recognition, namely Selective Ensemble of Image Regions (SEIR), is proposed. In this framework, all possible regions in the face image are regarded as a certain kind of feature. There are two main steps in SEIR: the first step is to automatically select several regions from all possible candidates; the second step is to construct a classifier ensemble from the selected regions. An implementation of SEIR based on multiple eigenspaces, namely SEME, is also proposed in this paper. SEME is analyzed and compared with eigenface, PCA + LDA, eigenfeature, and eigenface + eigenfeature through experiments. The experimental results show that SEME achieves the best performance.

A Region Ensemble for 3-D Face Recognition (cited by: 1; self-citations: 0; citations by others: 1)
In this paper, we introduce a new system for 3D face recognition based on the fusion of results from a committee of regions that have been independently matched. Experimental results demonstrate that using 28 small regions on the face allows for the highest level of 3D face recognition. Score-based fusion is performed on the individual region match scores, and experimental results show that the Borda count and consensus voting methods yield higher performance than the standard sum, product, and min fusion rules. In addition, results are reported that demonstrate the robustness of our algorithm by simulating large holes and artifacts in images. To our knowledge, no other work has been published that uses a large number of 3D face regions for high-performance face matching. Rank-one recognition rates of 97.2% and verification rates of 93.2% at a 0.1% false accept rate are reported and compared to other methods published on the Face Recognition Grand Challenge v2 data set.
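
The Borda count fusion named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `borda_fuse`, the three example rankings and the identity labels are all hypothetical, and each region matcher is assumed to return a full ranking of the same gallery identities.

```python
from collections import defaultdict


def borda_fuse(rankings):
    """Fuse several rankings of the same candidates with a Borda count.

    rankings: list of lists, each ordering the candidate IDs from best
    match to worst (one list per region matcher; hypothetical input).
    Returns the candidates sorted by total Borda points, highest first.
    """
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # Best rank earns n - 1 points, worst rank earns 0.
            points[candidate] += n - 1 - position
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(points, key=lambda c: (-points[c], str(c)))


# Three hypothetical region matchers ranking three gallery identities.
region_rankings = [
    ["alice", "bob", "carol"],
    ["bob", "alice", "carol"],
    ["alice", "carol", "bob"],
]
fused = borda_fuse(region_rankings)
print(fused)
```

Because only rank positions are summed, a single region with an extreme match score cannot dominate the fused result, which is one plausible reason rank-based rules behave differently from sum or product fusion.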

International Journal of Computer Vision - This paper strives for spatio-temporal localization of human actions in videos. In the literature, the consensus is to achieve localization by training on...

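陈略, 熊宸, 蔡铭. 《计算机工程》 (Computer Engineering), 2021, 47(3): 83-93
Mobile phone signaling data is spatio-temporally sequential and is characterized by large volume, uneven sampling frequency, low positioning accuracy, and base-station oscillation. As a result, traditional clustering methods for signaling data suffer from unevenly distributed data density, high time and space overhead, and poor clustering results. This paper proposes a spatio-temporal density trajectory point identification algorithm for mobile phone signaling. The signaling data is gridded to unify the evaluation scale, grid clusters are linked in space and time according to the characteristics of oscillation noise to reduce spatial uncertainty and computation, and, combined with the tortuosity of network trajectories and moving and stay...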

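Research on a Mobile Robot Localization Method Based on Scene Recognition (cited by: 8; self-citations: 0; citations by others: 8)
A mobile robot localization method based on scene recognition is proposed. For the series of scene images of the working environment captured by a CCD camera, multi-channel Gabor filters extract the global texture features of each scene image, and an SVM classifier then recognizes the scene images to achieve logical localization of the robot. The algorithm was tested on the mobile robot CASIA-I. The results show that the method reaches a localization accuracy of 91.11%, is robust to factors such as illumination and contrast, and meets the requirements of real-time robot localization.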

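A Pattern Recognition Method Based on a Rough RBF Neural Network Ensemble (cited by: 1; self-citations: 0; citations by others: 1)
A method for defining attribute importance is proposed, and the distance between elements is measured according to attribute importance in order to determine how the training set clusters. Because clustering is uncertain, rough set methods are used to determine exact lower and upper approximation sets, whose cluster centers serve as the radial basis centers of RBF neural networks; two RBF networks with different basis function centers are designed. Finally, under the empirical risk minimization principle, the confidence of each output value of the two networks is determined to obtain the final output of the neural network ensemble. The networks are trained with the recursive least squares method, and two pattern recognition simulation examples verify the effectiveness and correctness of the method.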

In this paper, we study a novel approach to spoken language recognition using an ensemble of binary classifiers. In this framework, we begin by representing a speech utterance with a high-dimensional feature vector such as the phonotactic characteristics or the polynomial expansion of cepstral features. A binary classifier can be built based on such feature vectors. We adopt a distributed output coding strategy in ensemble classifier design, where we decompose a multiclass language recognition problem into many binary classification tasks, each of which addresses a language recognition subtask by using a component classifier. Then, we combine the results of the component classifiers to form an output code as a hypothesized solution to the overall language recognition problem. In this way, we effectively project high-dimensional feature vectors into a tractable low-dimensional space, yet maintaining language discriminative characteristics of the spoken utterances. By fusing the output codes from both phonotactic features and cepstral features, we achieve equal-error-rates of 1.38% and 3.20% for 30-s trials on the 2003 and 2005 NIST language recognition evaluation databases.
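
The output coding idea described above, where binary classifier decisions are combined into a code that is then mapped to a class, can be sketched as nearest-codeword decoding. This is only an illustration under stated assumptions: the codebook, the language labels and the Hamming-distance decoder are hypothetical and are not taken from the paper, which may use a different code design and decision rule.

```python
def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(output_code, codebook):
    """Return the label whose codeword is closest to output_code."""
    return min(codebook, key=lambda label: hamming(output_code, codebook[label]))


# Hypothetical codebook: each column is one binary subtask, each row
# the bits expected from the component classifiers for that language.
CODEBOOK = {
    "english": (1, 1, 0, 0, 1),
    "mandarin": (0, 1, 1, 0, 0),
    "spanish": (1, 0, 1, 1, 0),
}

# Five component classifiers vote; the last bit is wrong for "english",
# but the nearest codeword still recovers the label.
print(decode((1, 1, 0, 0, 0), CODEBOOK))
```

Spreading the decision over several bits is what gives the code some tolerance to individual classifier errors, provided the codewords are far enough apart.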

International Journal of Computer Vision - Deep learning models for video-based action recognition usually generate features for short clips (consisting of a few frames); such clip-level features...

View Invariance for Human Action Recognition (cited by: 4; self-citations: 0; citations by others: 4)
This paper presents an approach for viewpoint invariant human action recognition, an area that has received scant attention so far, relative to the overall body of work in human action recognition. It has been established previously that there exist no invariants for 3D to 2D projection. However, there exists a wealth of techniques in 2D invariance that can be used to advantage in 3D to 2D projection. We exploit these techniques and model actions in terms of view-invariant canonical body poses and trajectories in 2D invariance space, leading to a simple and effective way to represent and recognize human actions from a general viewpoint. We first evaluate the approach theoretically and show why a straightforward application of the 2D invariance idea will not work. We describe strategies designed to overcome inherent problems in the straightforward approach and outline the recognition algorithm. We then present results on 2D projections of publicly available human motion capture data as well as on manually segmented real image sequences. In addition to robustness to viewpoint change, the approach is robust enough to handle different people, minor variabilities in a given action, and the speed of action (and hence, frame-rate) while encoding sufficient distinction among actions. This work was done when the author was a graduate student in the Department of Computer Science and was partially supported by the NSF Grant ECS-02-5475. The author is currently with Siemens Corporate Research, Princeton, NJ. Dr. Chellappa is with the Department of Electrical and Computer Engineering.

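Human Action Recognition Based on Action Strings (cited by: 1; self-citations: 0; citations by others: 1)
赵海勇, 李俊青. 《计算机科学》 (Computer Science), 2013, 40(10): 296-300
A template-matching method for human action recognition that uses the moving human silhouette as its feature is proposed. First, background subtraction and shadow removal are used to extract the moving human silhouette. Features are then extracted from the silhouette with a transform (given as 缓变换 in the source) that converts the time-varying 2D region shape into a corresponding 1D distance vector. Next, hierarchical clustering extracts the key poses of an action sequence, and the key poses are encoded into a template called an action string. Finally, dynamic time warping measures the similarity between a test sequence and the standard templates. Experiments show that the method recognizes six everyday human actions with a correct recognition rate above 85% and that it is simple and practical.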
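
The final matching step, dynamic time warping between a test sequence and a stored template, can be sketched as follows. This is a generic textbook DTW under assumptions, not the authors' code: key poses are represented here as single characters, and the local cost is taken to be 0 for identical pose codes and 1 otherwise, neither of which is specified in the abstract.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of pose codes.

    Local cost (an assumption): 0 if two pose codes match, 1 otherwise.
    """
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best alignment cost of seq_a[:i] against seq_b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = 0 if seq_a[i - 1] == seq_b[j - 1] else 1
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b element repeats
                cost[i][j - 1],      # seq_b advances, seq_a element repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical action-string templates; the test sequence holds the
# second pose one frame longer than the "walk" template does.
templates = {"walk": "ABC", "bend": "ADD"}
test_sequence = "ABBC"
best = min(templates, key=lambda name: dtw_distance(test_sequence, templates[name]))
print(best)
```

Because a pose may be matched to several consecutive frames, the same action performed at a different speed still aligns with its template at low cost.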

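To overcome the lack of supervision information in partial label learning, sample splitting rules for decision tree generation are designed according to the properties of partially labeled samples, and the decision tree construction algorithm is modified accordingly. The algorithm first draws bootstrap samples and builds multiple decision trees, then votes over the results of the individual trees to produce the final prediction. Experiments on synthetic and real-world datasets show that the algorithm achieves good classification performance.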

《Advanced Robotics》2013,27(6-7):871-891
In robotics, there has been a growing interest in expressing actions as a combination of meaningful subparts commonly called motion primitives. Primitives are analogous to words in a language. Similar to words put together according to the rules of language in a sentence, primitives arranged with certain rules make an action. In this paper we investigate modeling and recognition of arm manipulation actions at different levels of complexity using primitives. Primitives are detected automatically in a sequential manner. Here, we assume no prior knowledge on primitives, but look for correlating segments across various sequences. All actions are then modeled within a single hidden Markov model whose structure is learned incrementally as new data is observed. We also generate an action grammar based on these primitives and thus link signals to symbols.

How much does knowledge regarding a certain spoken word or phrase help with its localization? This is a very fundamental question for speech processing, and will be partially addressed in this paper. In particular, this work will utilize prior information regarding the contents of a speech signal in order to improve the artificial localization of it using the time delay of arrival (TDOA) between two microphones. The prior information, which is used to develop a very simple frequency-selective phase transform (FPT), increases the effective SNR by only using a subset of the highest SNR frequencies in the Phase Transform. Simulations in a reverberant environment show that the proposed approach can more robustly and accurately localize speech sources. For 20 ms signal segments, it is shown that using a subset of 45 percent of available speech frequency bins is superior to using 30, 60, or 100, where using 100 corresponds to the standard Phase Transform.

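Localization and Segmentation of Live Iris Images (cited by: 2; self-citations: 0; citations by others: 2)
A localization and segmentation algorithm for live iris images is presented. The algorithm has two main parts: locating the annular iris region and removing non-iris regions. Based on the physiological characteristics of the eye and the actual conditions of digital iris images, it combines traditional localization methods with mathematical morphology to locate the iris region quickly and accurately, and it proposes separate solutions for removing the effects of eyelids, eyelashes, and specular highlights. The algorithm also takes into account problems that may affect iris localization and segmentation in practical applications. Experiments show that the algorithm achieves good segmentation results and is robust.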

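A Human Action Recognition Method Based on a Multi-Learner Co-Training Model (cited by: 1; self-citations: 0; citations by others: 1)
唐超, 王文剑, 李伟, 李国斌, 曹峰. 《软件学报》 (Journal of Software), 2015, 26(11): 2939-2950
Human action recognition is a hot topic in computer vision research, and existing action recognition methods are based on the supervised learning framework. Good recognition results usually require a large number of labeled samples for modeling, but obtaining labeled samples is time-consuming and laborious. To address this problem, the co-training algorithm from semi-supervised learning is improved, and a human action recognition method based on a multi-learner co-training model is proposed; it is a recognition algorithm built on the semi-supervised learning framework. The method first uses a learner diversity measure selection algorithm based on the Q statistic to pick the set of base learners for co-training; during co-training, these base learners label the unlabeled samples. It then uses a labeled-neighbor confidence formula based on a committee of classifier members to evaluate the confidence of the unlabeled samples, adds a certain proportion of high-confidence unlabeled samples to the labeled training set, and updates the learners to improve the generalization ability of the model. To evaluate the algorithm, hybrid features are used to represent human actions so that recognition can be completed quickly. Experimental results show that the proposed semi-supervised action recognition system can effectively identify human actions in videos.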
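
The Q statistic used above to measure diversity between learners can be sketched as a pairwise computation over per-sample correctness. The abstract does not give the formula, so this sketch assumes the standard pairwise definition, Q = (N11·N00 − N01·N10) / (N11·N00 + N01·N10), where N11 counts samples both learners classify correctly, N00 samples both get wrong, and N10 and N01 samples only one of them gets right. The function name and the convention of returning 0.0 for a zero denominator are assumptions of this sketch.

```python
def q_statistic(correct_a, correct_b):
    """Pairwise Q statistic between two learners.

    correct_a, correct_b: equal-length sequences of truthy/falsy values,
    one per validation sample, marking whether each learner was correct.
    Values near 1 mean the learners err on the same samples; values near
    0 or below indicate more diverse behaviour.
    """
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1
        elif not a and not b:
            n00 += 1
        elif a:
            n10 += 1
        else:
            n01 += 1
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        # Q is undefined here; returning 0.0 is a convention of this sketch.
        return 0.0
    return (n11 * n00 - n01 * n10) / denominator


# Hypothetical correctness records for three learners on four samples.
learner_1 = [1, 1, 0, 0]
learner_2 = [1, 1, 0, 0]  # identical errors to learner_1
learner_3 = [1, 0, 1, 0]  # errors overlap only partly
print(q_statistic(learner_1, learner_2), q_statistic(learner_1, learner_3))
```

A selection step would then prefer learner pairs with low Q values, since co-training benefits when the learners disagree on which samples are hard.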
