20 similar articles found (search time: 15 ms)
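Face recognition is an active research topic in computer vision with a wide range of applications. In recent years, manifolds have come to be regarded as a basis of visual perception, and manifold learning algorithms have been used to discover the intrinsic features of images. How to exploit the low-dimensional intrinsic variables obtained by manifold learning has become a core question in this line of research. However, the low-dimensional face features obtained by conventional manifold learning algorithms are not sufficiently separable, and manifold learning algorithms are sensitive to changes in illumination and pose. To address these two problems, a face recognition method based on local binary patterns (LBP) and manifold knowledge is proposed. The method first uses the LBP operator to describe local features of the face image, then applies a manifold learning algorithm to obtain the low-dimensional intrinsic variables of the high-dimensional feature data and approximates the manifold with a Taylor expansion to obtain manifold knowledge, and finally uses the manifold knowledge to estimate manifold distances for face recognition. Experiments show that the method makes face recognition more robust to illumination changes and thereby improves recognition performance.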
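
The LBP step mentioned in this abstract can be illustrated with a minimal sketch of the basic 3x3 LBP operator. The abstract does not say which LBP variant, radius, or histogram layout the authors use, so the function names, the clockwise bit order, and the 256-bin histogram below are assumptions made for illustration only.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of pixel (r, c) from its 3x3 neighbourhood.

    img is a list of rows of grey values; (r, c) must be an interior pixel.
    """
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel
    # (the bit order is an assumption; any fixed order works).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour contributes a 1-bit when it is at least as bright
        # as the centre pixel.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Because each bit depends only on the sign of a grey-level difference, adding a constant to every pixel leaves the code unchanged, which is one reason LBP descriptors are commonly used when illumination varies.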

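The face space is a low-dimensional manifold embedded in the high-dimensional observation space. To better describe fine structures of the face space such as bulges and hollows, a face recognition algorithm based on two-dimensional geodesic-distance-preserving mapping is proposed. The algorithm represents sample images in face space as matrices; based on this matrix representation, it computes the low-dimensional embedded manifold of the face space using the 2D geodesic-distance-preserving mapping algorithm, and uses the projections of face samples onto the low-dimensional manifold space as features for recognition. Experimental results on the CMU PIE face database confirm the soundness and effectiveness of the algorithm.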
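
The notion of geodesic distance on a sampled manifold can be sketched as follows. This is the generic Isomap-style estimate (k-nearest-neighbour graph plus shortest paths), not the two-dimensional matrix-based algorithm of the abstract, whose details are not given; the function name and the choice of Floyd-Warshall are assumptions.

```python
import math


def geodesic_distances(points, k):
    """Approximate geodesic distances between sampled points.

    Builds a symmetric k-nearest-neighbour graph weighted by Euclidean
    distance, then runs Floyd-Warshall to get all shortest path lengths.
    Unreachable pairs keep the value math.inf.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist[i][j])
        for j in others[:k]:
            graph[i][j] = graph[j][i] = dist[i][j]
    # Floyd-Warshall: allow paths through intermediate node m.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph
```

For points sampled along a curved arc, the graph distance between the two ends follows the arc rather than cutting straight across it, which is the property a geodesic-preserving embedding tries to keep.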

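Starting from the nonlinear manifold structure of the acoustic feature space of speech signals, a new acoustic model for speech recognition is built using the principle of compressive sensing on manifolds. The feature space is partitioned into multiple local regions, and each local region is approximated by a low-dimensional factor analysis model, yielding a mixture of factor analyzers. The observation vectors of context-dependent states are constrained to lie on this nonlinear low-dimensional manifold structure, from which the observation probability model is derived. In the end, each state is determined by one weight vector subject to a sparsity constraint and several low-dimensional local factor vectors that follow a standard normal distribution. The paper gives a criterion for determining the latent dimensionality of the local regions and an iterative algorithm for estimating the model parameters. Continuous speech recognition experiments on the RM corpus show that, compared with the conventional Gaussian mixture model (GMM) and the subspace Gaussian mixture model (SGMM), the new acoustic model lowers the average word error rate (WER) on the test set by a relative 33.1% and 9.2%, respectively.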
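
The reductions quoted in this abstract are relative, not absolute. Assuming the usual definition, a relative WER reduction is computed as shown below; the abstract does not report the absolute error rates, so the numbers in the second line are purely hypothetical.

```latex
\[
  \Delta_{\mathrm{rel}}
    = \frac{\mathrm{WER}_{\mathrm{baseline}} - \mathrm{WER}_{\mathrm{new}}}
           {\mathrm{WER}_{\mathrm{baseline}}} \times 100\%
\]
% Hypothetical illustration only: a baseline WER of 10.00% and a new WER of 6.69%
\[
  \frac{10.00 - 6.69}{10.00} \times 100\% = 33.1\%
\]
```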

In the past few years, the computer vision and pattern recognition community has witnessed the rapid growth of a new family of feature extraction methods, manifold learning methods, which attempt to project the original data into a lower-dimensional feature space while preserving the local neighborhood structure. Among these methods, locality preserving projection (LPP) is one of the most promising feature extraction techniques. Unlike the unsupervised learning scheme of LPP, this paper follows a supervised learning scheme, i.e., it uses both local information and class information to model the similarity of the data. Based on this novel similarity, we propose two feature extraction algorithms, supervised optimal locality preserving projection (SOLPP) and normalized Laplacian-based supervised optimal locality preserving projection (NL-SOLPP). Optimal here means that the features extracted via SOLPP (or NL-SOLPP) are statistically uncorrelated and orthogonal. We compare the proposed SOLPP and NL-SOLPP with LPP, orthogonal locality preserving projection (OLPP), and uncorrelated locality preserving projection (ULPP) on publicly available data sets. Experimental results show that the proposed SOLPP and NL-SOLPP achieve much higher recognition accuracy.

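Objective  High-resolution remote sensing images commonly suffer from two problems: different objects sharing the same spectrum, and the same object showing different spectra. To address them, a hierarchical building extraction method is proposed that jointly uses spectral, shape, spatial-context, and texture features. Methods  The method works on a single high-resolution remote sensing image. First, a building index constructed from multi-scale, multi-directional gradient operators is combined with shape features to extract a subset of rectangular building objects that are completely segmented. Next, a voting matrix is determined using multi-directional linear structuring elements and morphological dilation, which yields the illumination direction; the illumination direction and shadow features are then used to screen the extracted buildings and remove non-building objects, completing the initial building extraction. Finally, a probability model is built from the texture feature vectors of the initially extracted building objects to obtain a pixel-level building extraction result, which is combined with image segmentation to produce the final building extraction. Results  Two high-resolution remote sensing images were used in the experiments. In the initial extraction experiment, the proposed method was compared with the neighborhood total variation method and the Sobel operator; the results show that the proposed method is widely applicable and provides more reliable building samples for the subsequent extraction. In the building extraction experiment, precision, recall, and F1 score were used for quantitative analysis. Compared with the morphological building index combined with the morphological shadow index, and with neighborhood total variation combined with a Gaussian mixture model and Bayesian decision, all accuracy measures improved markedly: precision rose by 2.90 percentage points, recall by 12.49 percentage points, and the F1 score by 8.84. Conclusion  The proposed hierarchical building extraction method has a degree of robustness to interference, extracts buildings accurately, and is widely applicable.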
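
The three evaluation measures named in this abstract have standard definitions, sketched below. How the authors count true and false positives (per pixel or per building object) is not stated, so the counts passed in are an assumption left to the caller.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and
    false-negative counts. Returns 0.0 for a measure whose denominator is 0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so it is pulled towards the lower of the two.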

This paper proposes a novel method based on Spectral Regression (SR) for efficient scene recognition. First, a new SR approach, called Extended Spectral Regression (ESR), is proposed to perform manifold learning on a huge number of data samples. Then, an efficient Bag-of-Words (BOW) based method is developed which employs ESR to encapsulate local visual features with their semantic, spatial, scale, and orientation information for scene recognition. In many applications, such as image classification and multimedia analysis, a training set contains a huge number of low-level feature samples, which prohibits applying SR directly to manifold learning on such datasets. In ESR, we first group the samples into tiny clusters, and then devise an approach to reduce the size of the similarity matrix for graph learning. In this way, subspace learning on the graph Laplacian for a vast dataset becomes computationally feasible on a personal computer. In the ESR-based scene recognition, we first propose an enhanced low-level feature representation which combines the scale, orientation, spatial position, and local appearance of a local feature. Then, ESR is applied to embed the enhanced low-level image features. The ESR-based feature embedding not only generates a low-dimensional feature representation but also integrates various aspects of the low-level features into the compact representation. The bag-of-words is then generated from the embedded features for image classification. Comparative experiments on open benchmark datasets for scene recognition demonstrate that the proposed method outperforms baseline approaches. It is suitable for real-time applications on mobile platforms, e.g., tablets and smartphones.

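When data are sparse, non-uniformly distributed, or lie on a manifold with large curvature, conventional local tangent space methods cannot reveal the manifold structure effectively. A generalized ILTSA (GILTSA) manifold learning method is proposed. Built on the improved local tangent space alignment algorithm (ILTSA), it not only addresses the manifold structure problem and yields better low-dimensional features for face recognition, but also handles growing datasets effectively. The method first selects neighbor sets based on the distances between samples to obtain the low-dimensional manifold of the training set, and finds the nearest training samples for each new sample. It then applies ILTSA and computes the low-dimensional embedding from the projection distances to those nearest samples. Experiments on the ORL face image database, the Swiss roll, and handwritten "2" digits show that GILTSA improves overall accuracy compared with locally linear embedding and local tangent space alignment.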

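The LLE algorithm struggles to reduce dimensionality when the data density varies widely. To address this, a dimensionality reduction algorithm based on density characterization is proposed. It uses the cam distribution to find the neighbors of each data point and adds density information to the data points during low-dimensional local reconstruction. The algorithm is applied to reduce the dimensionality of character features from handwritten digit images, and the reduced features are then classified. Experimental results show that the method can distinguish the characters, achieves a good recognition rate, and can discover the low-dimensional manifold embedded in the high-dimensional space.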

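Existing human action recognition algorithms do not make full use of the multi-level spatiotemporal information in the network. To address this, a human action recognition algorithm based on a 3D residual dense network is proposed. First, the algorithm uses 3D residual dense blocks as the basic building module; each block extracts hierarchical features of human actions through densely connected convolutional layers. Second, a local feature aggregation adaptive method learns local dense features of the actions. Third, residual connections are applied to ease the flow of feature information and reduce training difficulty. Finally, multiple 3D residual dense blocks are cascaded to extract multi-layer local features, and a global feature aggregation adaptive method learns features from all network layers for action recognition. Structurally, the network strengthens the extraction of multi-level spatiotemporal features and uses local and global feature aggregation to learn more discriminative features, increasing the model's expressive power. Extensive experiments on the benchmark datasets KTH and UCF-101 show that the algorithm reaches recognition rates (top-1 accuracy) of 93.52% and 57.35%, respectively, improvements of 3.93 and 13.91 percentage points over the 3D convolutional neural network (C3D) algorithm. The framework shows good robustness and transfer learning ability and can handle a variety of video action recognition tasks effectively.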

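To extract better face features for recognition, this paper combines the local nonlinear features extracted by the nonlinear manifold learning method LLE with the global linear features extracted by the supervised learning method LDA, using feature fusion to obtain useful features for face recognition. Experiments show that the method significantly improves the performance of the face recognition system.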

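钟明 (Zhong Ming), 薛惠锋 (Xue Huifeng). 《测控技术》 (Measurement & Control Technology), 2010, 29(12): 18-21
Facial expression features are extracted with Gabor wavelets. To reduce the dimensionality of the vectors after the Gabor transform and extract effective discriminative features, manually selected feature points are combined with supervised locally linear embedding (SLLE): the nonlinear manifold structure of the facial expression image data and the sample labels are used to adjust point-to-point distances and form a distance matrix, and linear neighbor reconstruction based on the adjusted distance matrix then performs dimensionality reduction, extracting low-dimensional discriminative features for facial expression recognition. The results show that the method extracts features reflecting expression states more effectively and achieves a higher recognition rate than the conventional PCA algorithm. Finally, the influence of the number of neighbors K and the embedding dimension on the recognition rate is analyzed experimentally, yielding the optimal K and embedding dimension for SLLE.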

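钟锐 (Zhong Rui), 吴怀宇 (Wu Huaiyu), 何云 (He Yun). 《计算机科学》 (Computer Science), 2018, 45(6): 308-313
Conventional face recognition models are trained offline, and the high dimensionality of face features limits the real-time performance of the algorithms. This paper builds a fast face recognition algorithm by addressing both the face features and the classifier. First, the Supervised Descent Method (SDM) is used to locate facial landmarks; Multi-Block Center-Symmetric Local Binary Pattern (MB-CSLBP) features are extracted from the neighborhood of each landmark, and the neighborhood features of all landmarks are concatenated into a local fusion feature, the proposed LFP-MB-CSLBP (Local Fusion Feature of MB-CSLBP). These features are then fed into a Hierarchical Incremental tree (HI-tree) to train the face recognition model online. Because the HI-tree implements incremental learning with a hierarchical clustering algorithm, it can train the recognition model online with high real-time performance and accuracy. Finally, the recognition rate and real-time performance are tested on three different face databases and on face video captured by a camera. The experimental results show that the proposed algorithm achieves a higher recognition rate and better real-time performance than other current algorithms.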

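A new face description and recognition method is proposed. First, a multi-orientation, multi-scale Gabor transform is applied to the normalized face image. The face region is then divided into blocks, and the mean and variance of the Gabor coefficients are computed per block to obtain a block feature vector (BFV); the BFVs of all blocks are concatenated in row-major order to form the face feature vector (FFV) of the whole image. In the classifier design stage, pairwise comparison and a voting mechanism are introduced, combining multiple two-class classifiers into a multi-class classifier. When training a particular two-class classifier, the discriminative power of each FFV component is computed from the training samples that belong to it, and the best subset feature vector (BSFV) is selected according to discriminative power. Experiments on the Yale face dataset and comparisons with published algorithms and results demonstrate the effectiveness of the method.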
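
The pairwise comparison and voting scheme described in this abstract corresponds to what is usually called one-vs-one classification. The sketch below shows only the voting logic; the binary classifiers here are hypothetical nearest-prototype rules on a single number, standing in for the per-pair classifiers trained on BSFV features, whose form the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, binary_classifiers):
    """Combine two-class classifiers into a multi-class decision by voting.

    binary_classifiers[(a, b)](x) must return either a or b.
    Ties are broken in favour of the class listed first in `classes`.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[binary_classifiers[(a, b)](x)] += 1
    return max(classes, key=lambda c: votes[c])


# Toy setup (hypothetical): each class is a 1-D prototype, and each
# pairwise classifier picks whichever of its two prototypes is nearer.
PROTOTYPES = {"A": 0.0, "B": 5.0, "C": 10.0}
CLASSES = sorted(PROTOTYPES)


def make_pair_classifier(a, b):
    return lambda x: a if abs(x - PROTOTYPES[a]) <= abs(x - PROTOTYPES[b]) else b


PAIR_CLASSIFIERS = {(a, b): make_pair_classifier(a, b)
                    for a, b in combinations(CLASSES, 2)}
```

With n classes this scheme needs n(n-1)/2 binary classifiers, and each one can use its own feature subset, which matches the per-pair feature selection the abstract describes.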

Feature fusion plays an important role in speech emotion recognition: it improves classification accuracy by combining the most popular acoustic features for the task, such as energy, pitch, and mel frequency cepstral coefficients. However, the performance of the system is not optimal because of its computational complexity, which results from the high-dimensional, correlated feature set produced by feature fusion. In this paper, a two-stage feature selection method is proposed. In the first stage, appropriate features are selected and fused together for speech emotion recognition. In the second stage, optimal feature subset selection techniques [sequential forward selection (SFS) and sequential floating forward selection (SFFS)] are used to eliminate the curse-of-dimensionality problem caused by the high-dimensional feature vector after feature fusion. Finally, the emotions are classified using several classifiers: Linear Discriminant Analysis (LDA), Regularized Discriminant Analysis (RDA), Support Vector Machine (SVM), and K Nearest Neighbor (KNN). The performance of the overall emotion recognition system is validated on the Berlin and Spanish databases in terms of classification rate. An optimal uncorrelated feature set is obtained by using SFS and SFFS individually. Results reveal that SFFS is the better choice of feature subset selection method because SFS suffers from the nesting problem, i.e., it is difficult to discard a feature once it has been retained in the set. SFFS eliminates this nesting problem by not fixing the set at any stage but letting it float up and down during selection according to the objective function. Experimental results show that the efficiency of the classifier improves by 15–20% with the two-stage feature selection method compared with the classifier using feature fusion alone.
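
The nesting problem of SFS mentioned in this abstract can be seen in a minimal sketch. The greedy loop below is the generic SFS procedure; the score table is hypothetical and was chosen so that the best single feature is not part of the best pair. It does not come from the paper.

```python
def sequential_forward_selection(features, score, k):
    """Greedy SFS: repeatedly add the feature that maximises score(subset).

    `score` takes a list of feature names and returns a number
    (for example, cross-validated accuracy). Features are never removed
    once added, which is the source of the nesting problem.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical scores: "a" is the best single feature,
# but {"b", "c"} is the best pair.
TOY_SCORES = {
    frozenset("a"): 0.60, frozenset("b"): 0.50, frozenset("c"): 0.40,
    frozenset("ab"): 0.70, frozenset("ac"): 0.65, frozenset("bc"): 0.90,
    frozenset("abc"): 0.95,
}


def toy_score(subset):
    return TOY_SCORES[frozenset(subset)]
```

On this table SFS with k = 2 keeps "a" from the first round and ends with the pair a, b, even though the pair b, c scores higher. A floating variant such as SFFS adds a conditional removal step after each addition so that an early choice can be undone.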

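Objective  Landmark recognition is an applied problem in image and vision research. To address the weaknesses of single features in landmark recognition, such as the sensitivity of global features to viewpoint changes and of local features to lighting changes, a weakly supervised landmark recognition model is proposed that is based on the additive angular margin loss (ArcFace loss) and fuses multiple features. Methods  Recognition is performed by image retrieval, taking the top-1 result. The admissible range of the ArcFace loss parameters is first proved, and this range guides parameter selection during training; an effective fusion of local and global features is then used to obtain image features for retrieval. Training has two steps: first, the weights of an ImageNet-pretrained model are fine-tuned on the Google Landmarks dataset with the ArcFace loss; second, an attention mechanism is added and the attention network is trained. Inference has three parts: extracting global features, obtaining local features, and fusing them. Specifically, for an input query image, global features are first extracted from the feature embedding layer of the fine-tuned convolutional neural network; local features are then extracted from an intermediate layer using the attention mechanism; finally, the two feature vectors are concatenated and image retrieval returns the database entry most similar to the query. Results  Experiments on the Paris and Oxford buildings datasets show that feature fusion allows a shallow network to match a deep pretrained network, and that fused features improve mean average precision (mAP) by about 1% over global features alone. The experiments also show that no additional feature whitening is needed on the neural network embedding features. The model also achieves satisfactory results on city-scale street-view images. Conclusion  The model is trained with the ArcFace loss and lets the similarity results of multiple features complement each other effectively, improving robustness to interference in practical applications.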
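
For reference, the additive angular margin (ArcFace) loss named in this abstract is usually written as below. The abstract gives neither the parameter values used nor the range it proves, so the scale s and margin m are left symbolic here.

```latex
% N: batch size; y_i: ground-truth class of sample i;
% \theta_j: angle between the L2-normalised embedding of sample i
%           and the L2-normalised weight vector of class j;
% s: feature scale; m: additive angular margin.
\[
  L = -\frac{1}{N}\sum_{i=1}^{N}
      \log\frac{e^{\,s\cos(\theta_{y_i}+m)}}
               {e^{\,s\cos(\theta_{y_i}+m)} + \sum_{j\neq y_i} e^{\,s\cos\theta_j}}
\]
```

The margin m is added to the angle of the ground-truth class only, which forces embeddings of the same class closer together on the unit hypersphere than a plain softmax would.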

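When the conventional SegNet semantic segmentation network is used to extract buildings from high-resolution remote sensing imagery, the segmented buildings suffer from blurred boundaries, low accuracy, and false and missed detections. To solve these problems, an improved SegNet + CRF semantic segmentation method is proposed. In the encoder, an atrous pyramid pooling module is introduced at the lowest-resolution layer, enlarging the receptive field of feature extraction through parallel atrous (dilated) convolutions; in the decoder, a feature pyramid is built for multi-scale feature fusion, compensating for feature information lost during upsampling; finally, the predicted image is passed to a fully connected conditional random field for post-processing to refine the extracted building edges. Experiments show that, compared with the original SegNet, the improved method raises pixel accuracy, recall, and mean intersection over union for building extraction by 0.48%, 1.29%, and 2.36%, respectively.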


With the increasing demand for remote surveillance systems, gait-based personal identification has attracted growing attention from biometric recognition researchers. A gait sequence is more easily affected by external factors than other biometric features. To improve the performance of gait-based identification systems, this paper proposes a local discriminant gait recognition method that integrates the weighted adaptive center symmetric local binary pattern (WACS-LBP) with local linear discriminate projection (LLDP). The proposed method consists of two stages. In the first stage, a robust local weighted histogram feature vector is extracted from each gait image by WACS-LBP. In the second stage, the dimensionality of the extracted feature vector is reduced by LLDP. The highlights of the proposed method are that (1) the extracted feature is rotation invariant and tolerant to illumination and pose changes; (2) the low-dimensional feature vector produced by LLDP preserves discriminating ability; and (3) the small-sample-size (SSS) problem is avoided naturally. The proposed method is validated and compared with existing algorithms on a public gait database. The experimental results show that the proposed method is not only effective but can also be clearly interpreted.


Multimedia understanding for high-dimensional data is still challenging because of the redundant features, noise, and insufficient label information such data contain. Graph-based semi-supervised feature learning is an effective approach to this problem. Nevertheless, existing graph-based semi-supervised methods usually depend on a pre-constructed Laplacian matrix and rarely modify it in the subsequent classification tasks. In this paper, a semi-supervised feature selection method based on adaptive local manifold learning is proposed. Compared with the state of the art, the proposed algorithm has two advantages: 1) adaptive local manifold learning and feature selection are integrated into a single framework in which both labeled and unlabeled data are used, and the correlations between different components are also considered; 2) a group sparsity constraint, i.e., the ℓ2,1-norm, is imposed to select the most relevant features. We also apply the proposed algorithm to several kinds of multimedia understanding applications. Experimental results demonstrate the effectiveness of the proposed algorithm.
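
The group sparsity constraint named in this abstract can be made concrete with a small sketch of the ℓ2,1-norm of a weight matrix. It assumes the common convention that each row of W holds the weights of one feature; the abstract does not state its convention, and the helper names are hypothetical.

```python
import math


def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row of W."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def rank_features(W):
    """Feature indices sorted by decreasing row norm of W."""
    norms = [math.sqrt(sum(w * w for w in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
```

Penalising this norm drives whole rows towards zero, so a feature is either kept for all outputs or dropped for all of them, and the surviving row norms give a natural ranking for selection.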

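When used for video recognition, conventional 2D convolutional neural networks tend to lose feature information about the target along the time dimension, which lowers recognition accuracy. To address this, this paper adopts a 3D convolutional network as the basic framework, uses 3D convolution kernels to extract spatiotemporal features from video, and ensembles multiple 3D CNN models to recognize dynamic gestures. To improve convergence speed and training stability, batch normalization (BN) is used to optimize the network, which shortens the training time. Experimental results show that the method recognizes dynamic gestures well, reaching 98.06% accuracy on the Sheffield Kinect Gesture (SKIG) dataset. The gesture recognition rate is higher than with RGB information alone, depth information alone, or a conventional 2D CNN, confirming the feasibility and effectiveness of the method.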
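
The batch normalization step mentioned in this abstract can be sketched for a single feature over one mini-batch. This is the standard training-time transform only; running statistics, per-channel handling for 3D feature maps, and the parameter defaults shown are simplifications and assumptions, not details taken from the paper.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one feature across a mini-batch, then scale and shift.

    values: activations of a single feature, one per sample in the batch.
    gamma, beta: learnable scale and shift (fixed here for illustration).
    eps: small constant that keeps the division stable.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]
```

After the transform the batch has approximately zero mean and unit variance, which keeps the input scale of each layer stable and is the usual explanation for faster, steadier training.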

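Areas under construction differ in color from surrounding buildings that are not under construction and differ in texture from the surrounding natural environment. Based on these color and texture features, a method is proposed for identifying construction areas in high-altitude imagery. First, color and texture features are extracted from the images of a dataset containing only buildings under construction, and the resulting feature vectors are used to build an image feature index. The image to be examined is then divided into blocks; color clustering masks out green vegetation, feature vectors are computed and compared against the feature index with a similarity measure, the position of each block in the full image is determined, and each detected building under construction is marked with a red rectangle and a unique identifier. Experimental results show that the proposed method effectively identifies construction areas in urban high-altitude imagery, and a system based on the algorithm can be applied to urban planning.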
