Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
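Application of locality preserving projections on Riemannian manifolds to image set matching   Total citations: 1, self-citations: 1, citations by others: 0
Objective: An image set matching method that preserves local structural features on a Riemannian manifold is proposed. Methods: Each image set is modeled by its covariance matrix. The symmetric positive definite (nonsingular) covariance matrices form a subspace on a Riemannian manifold, so matching image sets becomes matching points on the manifold. A kernel function based on covariance matrix metric learning maps the covariance matrices on the Riemannian manifold into Euclidean space. Unlike other discriminant analysis methods on Riemannian manifolds, the method takes the local geometric structure of the sample distribution into account and introduces a locality preserving discriminant analysis for image sets on the Riemannian manifold, which preserves the local neighborhood structure of the sample distribution while improving the separability of the samples. Results: The algorithm was tested on image-set-based object recognition tasks. Object recognition and face recognition experiments on the ETH80 and YouTube Celebrities databases reached recognition rates of 91.5% and 65.31%, respectively. Conclusion: The experimental results show that the method outperforms other image set matching algorithms.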
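
The modeling step in the abstract above, where an image set is summarized by one symmetric positive definite covariance matrix, can be sketched as follows. This is a minimal pure-Python illustration and not the authors' implementation: the function name `covariance_descriptor`, the ridge term `eps` (added to the diagonal so the matrix stays nonsingular), and the toy feature vectors are all assumptions made for the example.

```python
def covariance_descriptor(features, eps=1e-3):
    """Sample covariance of an image set's feature vectors, plus eps * I.

    features: list of equal-length lists, one feature vector per image.
    eps: hypothetical ridge term that keeps the result nonsingular.
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    d = len(features[0])
    # Mean feature vector of the set.
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    # Accumulate the outer products of the centered vectors.
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        centered = [f[j] - mean[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += centered[a] * centered[b]
    # Unbiased normalization, then the ridge on the diagonal.
    for a in range(d):
        for b in range(d):
            cov[a][b] /= (n - 1)
        cov[a][a] += eps
    return cov


# Toy image set: three images, each described by a 2-D feature vector.
image_set = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
C = covariance_descriptor(image_set)
```

The toy vectors are deliberately collinear, so the plain sample covariance would be singular; the ridge term is what makes the descriptor positive definite in this sketch.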

2.
Most face recognition techniques have been successful in dealing with high-resolution (HR) frontal face images. However, real-world face recognition systems are often confronted with low-resolution (LR) face images with pose and illumination variations. This is a very challenging issue, especially under the constraint of using only a single gallery image per person. To address the problem, we propose a novel approach called coupled kernel-based enhanced discriminant analysis (CKEDA). CKEDA aims to simultaneously project the features from LR non-frontal probe images and HR frontal gallery ones into a common space where the discrimination property is maximized. There are four advantages of the proposed approach: 1) by using the appropriate kernel function, the data becomes linearly separable, which is beneficial for recognition; 2) inspired by linear discriminant analysis (LDA), we integrate multiple discriminant factors into our objective function to enhance the discrimination property; 3) we use the gallery extended trick to improve the recognition performance for the single gallery image per person problem; 4) our approach can address the problem of matching LR non-frontal probe images with HR frontal gallery images, which is difficult for most existing face recognition techniques. Experimental evaluation on the Multi-PIE dataset demonstrates the highly competitive performance of our algorithm.

3.
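I-vector speaker recognition systems commonly use a distance to measure the similarity between speakers' utterances. The weighted pairwise constrained metric learning algorithm (WPCML) uses weighted constraint information from pairs of training samples to train a metric matrix for computing the Mahalanobis distance. In the sample space represented by this metric matrix, distances between samples of the same class are smaller and distances between samples of different classes are larger. Experiments on the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation database (SRE08) show that scoring similarity with the Mahalanobis distance under the WPCML-trained metric matrix performs better than scoring with the cosine distance. Using the training sample pair selection method to construct the metric learning training set further improves system performance, which then exceeds that of the PLDA classifier, currently the most popular.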
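
The two scoring rules compared in the abstract above can be sketched side by side. The code below is only an illustration under stated assumptions: the metric matrix `M` is a made-up positive semidefinite example rather than one trained by WPCML, and the function names and the two toy vectors are hypothetical.

```python
import math


def mahalanobis_distance(x, y, M):
    """sqrt((x - y)^T M (x - y)) for a positive semidefinite metric matrix M."""
    diff = [a - b for a, b in zip(x, y)]
    # Matrix-vector product M @ diff.
    m_diff = [sum(M[i][j] * diff[j] for j in range(len(diff)))
              for i in range(len(diff))]
    quad = sum(diff[i] * m_diff[i] for i in range(len(diff)))
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(quad, 0.0))


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; ignores vector length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical metric matrix: stretches the first axis, shrinks the second.
M = [[2.0, 0.0], [0.0, 0.5]]
x = [1.0, 0.0]
y = [0.0, 2.0]
d = mahalanobis_distance(x, y, M)   # distance under the metric M
c = cosine_similarity(x, y)         # angle-based score, no learned metric
```

With the identity matrix in place of `M` the first function reduces to the ordinary Euclidean distance, which is one way to see what a learned metric adds: it reweights and mixes feature directions before distances are taken.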

4.
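于谦, 高阳, 霍静, 庄韫恺. 《软件学报》(Journal of Software), 2015, 26(11): 2897-2911
Video-based face recognition is converted into an image set recognition problem, and two kinds of manifolds are proposed to represent each image set: an inter-class manifold, which represents the mean-face information of each image set, and an intra-class manifold, which represents the information of all the original images in each image set. The inter-class manifold extracts global discriminative information from the differences between image sets and is used to select several candidate image sets that are similar to the query image set. The intra-class manifold considers the relationships among the original images within an image set and is responsible for finding the most similar set among the candidates. Unlike existing nonlinear manifold methods, in which each image corresponds to one point on the manifold, a patch-based technique is used to learn the projection matrices of the two manifolds, with each patch corresponding to one point on the manifold. The learned features are more discriminative, which makes the manifold boundaries clearer and also resolves the angle bias and insufficient sampling problems of traditional nonlinear manifold methods. A distance measure between manifolds that matches the patch-based technique is also proposed. Finally, experiments on several widely studied data sets show that the new method achieves high recognition accuracy, is particularly suitable for video recognition in uncontrolled environments, and is unaffected by the length of the video clip.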

5.
In this paper we describe an experiment where we studied empirically the application of a learned distance metric to be used as a discrimination function for an established color image segmentation algorithm. For this purpose we chose the Mumford–Shah energy functional and the Mahalanobis distance metric. The objective was to test our approach in an objective and quantifiable way on this specific algorithm employing this particular distance model, without making generalization claims. The empirical validation of the results was performed in two experiments: one applying the resulting segmentation method on a subset of the Berkeley Image Database, an exemplar image set possessing ground-truths and validating the results against the ground-truths using two well-known inter-cluster validation methods, namely, the Rand and BGM indexes, and another experiment using images of the same context divided into training and testing sets, where the distance metric is learned from the training set and then applied to segment all the images. The obtained results suggest that the use of the specified learned distance metric provides better and more robust segmentations, even if no other modification of the segmentation algorithm is performed.
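
The Rand index mentioned in the abstract above scores a segmentation against a ground truth by counting the pairs of elements on which the two labelings agree (both put the pair in the same segment, or both separate it). A minimal sketch follows; the function name `rand_index` and the toy label lists are assumptions, and this is the plain pairwise definition rather than whatever variant the authors used.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two labelings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two elements")
    agree = 0
    for i, j in pairs:
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # Agreement: both group the pair together, or both split it.
        if same_in_a == same_in_b:
            agree += 1
    return agree / len(pairs)


# Toy example: four pixels, a ground truth and two candidate segmentations.
ground_truth = [0, 0, 1, 1]
relabeled = [1, 1, 0, 0]      # same partition, different label names
crossed = [0, 1, 0, 1]        # splits every ground-truth segment
```

Because only pair membership matters, renaming the segments leaves the score unchanged, which is why the index suits segmentation outputs whose label numbering is arbitrary. The pairwise loop is quadratic in the number of elements, so for full images a contingency-table formulation would be the practical choice.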

6.
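To exploit the set information in image sets to improve recognition accuracy and robustness to image variations, and thereby greatly reduce the impact of factors such as pose, illumination, occlusion, and misalignment on recognition accuracy, an image set prototype and projection learning algorithm (LPSOP) is proposed for image set classification. For each image set, the algorithm learns representative points (prototypes) together with an orthogonal global projection matrix, so that in the target subspace each image set can be optimally classified to the nearest prototype set of the same class. Representing an image set by its learned prototypes reduces interference from redundant images and lowers storage and computation costs, while the learned projection matrix greatly improves classification accuracy and robustness to noise. Experiments on the UCSD/Honda, CMU MoBo, and YouTube Celebrities data sets show that LPSOP achieves higher recognition accuracy and better robustness than currently popular image set classification algorithms.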

7.
We introduce a method that enables scalable similarity search for learned metrics. Given pairwise similarity and dissimilarity constraints between some examples, we learn a Mahalanobis distance function that captures the examples' underlying relationships well. To allow sublinear time similarity search under the learned metric, we show how to encode the learned metric parameterization into randomized locality-sensitive hash functions. We further formulate an indirect solution that enables metric learning and hashing for vector spaces whose high dimensionality makes it infeasible to learn an explicit transformation over the feature dimensions. We demonstrate the approach applied to a variety of image data sets, as well as a systems data set. The learned metrics improve accuracy relative to commonly used metric baselines, while our hashing construction enables efficient indexing with learned distances and very large databases.
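
One common way to fold a learned Mahalanobis metric into locality-sensitive hashing is to factor the metric matrix as M = G^T G and take the sign of a random Gaussian projection of G x as each hash bit. The sketch below illustrates that idea only; it assumes an explicit factor `G` is available (the abstract's indirect, high-dimensional variant is not covered), and the function names, the example `G`, and the seed are hypothetical.

```python
import random


def make_hash_functions(G, num_bits, seed=0):
    """Build num_bits projection vectors w = G^T r with r ~ N(0, I).

    G: factor of the learned metric, M = G^T G (list of rows).
    Each w satisfies w . x == r . (G x), so the metric is baked into w.
    """
    rng = random.Random(seed)
    k = len(G)        # dimension after applying G
    d = len(G[0])     # input dimension
    projections = []
    for _ in range(num_bits):
        r = [rng.gauss(0.0, 1.0) for _ in range(k)]
        w = [sum(r[i] * G[i][j] for i in range(k)) for j in range(d)]
        projections.append(w)
    return projections


def hash_code(x, projections):
    """One bit per projection: 1 if w . x >= 0, else 0."""
    return tuple(
        1 if sum(w_j * x_j for w_j, x_j in zip(w, x)) >= 0 else 0
        for w in projections
    )


# Hypothetical metric factor and query vector.
G = [[1.0, 0.5], [0.0, 2.0]]
projections = make_hash_functions(G, num_bits=8)
x = [1.0, -1.0]
code = hash_code(x, projections)
```

Vectors that are close in angle after the transformation by `G` tend to share most bits, so candidate neighbors can be looked up by code instead of by scanning the whole database.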

8.
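任珍文, 吴明娜. 《计算机应用》(Journal of Computer Applications), 2019, 39(9): 2547-2551
Image set classification algorithms have attracted wide attention because they improve recognition performance by fully exploiting the set information of images. However, existing image set classification algorithms have the following problems: 1) they require the samples to follow a particular statistical distribution; 2) they ignore the mutual exclusivity between classes in the gallery sets; 3) they are not robust to non-Gaussian noise. To solve these problems, an image set classification algorithm based on entropy self-weighted joint regularized nearest points (SRNPC) is proposed. First, a unique global joint regularized nearest point is sought in the test set, while the distance between this point and the regularized nearest point in each gallery set is minimized. Then, to enhance discrimination between classes and robustness to non-Gaussian noise, a self-weighting strategy based on an entropy scale is introduced to iteratively update the entropy weights between the test set and each gallery set; the resulting weights directly reflect the degree of correlation between the test set and each gallery set. Finally, the classification result is obtained from the minimum residual between the test set and each gallery set. Comparative experiments against current mainstream algorithms on three public data sets, UCSD/Honda, CMU Mobo, and YouTube, show that the proposed algorithm achieves higher classification accuracy and stronger robustness.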

9.
Many classification algorithms see a reduction in performance when tested on data with properties different from that used for training. This problem arises very naturally in face recognition where images corresponding to the source domain (gallery, training data) and the target domain (probe, testing data) are acquired under varying degrees of factors such as illumination, expression, blur and alignment. In this paper, we account for the domain shift by deriving a latent subspace or domain, which jointly characterizes the multifactor variations using appropriate image formation models for each factor. We formulate the latent domain as a product of Grassmann manifolds based on the underlying geometry of the tensor space, and perform recognition across domain shift using statistics consistent with the tensor geometry. More specifically, given a face image from the source or target domain, we first synthesize multiple images of that subject under different illuminations, blur conditions and 2D perturbations to form a tensor representation of the face. The orthogonal matrices obtained from the decomposition of this tensor, where each matrix corresponds to a factor variation, are used to characterize the subject as a point on a product of Grassmann manifolds. For cases with only one image per subject in the source domain, the identity of target domain faces is estimated using the geodesic distance on product manifolds. When multiple images per subject are available, an extension of kernel discriminant analysis is developed using a novel kernel based on the projection metric on product spaces. Furthermore, a probabilistic approach to the problem of classifying image sets on product manifolds is introduced. We demonstrate the effectiveness of our approach through comprehensive evaluations on constrained and unconstrained face datasets, including still images and videos.

10.
Understanding the effect of blur is an important problem in unconstrained visual analysis. We address this problem in the context of image-based recognition by a fusion of image-formation models and differential geometric tools. First, we discuss the space spanned by blurred versions of an image and then, under certain assumptions, provide a differential geometric analysis of that space. More specifically, we create a subspace resulting from convolution of an image with a complete set of orthonormal basis functions of a prespecified maximum size (that can represent an arbitrary blur kernel within that size), and show that the corresponding subspaces created from a clean image and its blurred versions are equal under the ideal case of zero noise and some assumptions on the properties of blur kernels. We then study the practical utility of this subspace representation for the problem of direct recognition of blurred faces by viewing the subspaces as points on the Grassmann manifold and present methods to perform recognition for cases where the blur is both homogeneous and spatially varying. We empirically analyze the effect of noise, as well as the presence of other facial variations between the gallery and probe images, and provide comparisons with existing approaches on standard data sets.

11.
This paper proposes a novel method for recognizing faces degraded by blur using deblurring of facial images. The main issue is how to infer a Point Spread Function (PSF) representing the process of blur on faces. Inferring a PSF from a single facial image is an ill-posed problem. Our method uses learned prior information derived from a training set of blurred faces to make the problem more tractable. We construct a feature space such that blurred faces degraded by the same PSF are similar to one another. We learn statistical models that represent prior knowledge of predefined PSF sets in this feature space. A query image of unknown blur is compared with each model and the closest one is selected for PSF inference. The query image is deblurred using the PSF corresponding to that model and is thus ready for recognition. Experiments on a large face database (FERET) artificially degraded by focus or motion blur show that our method substantially improves the recognition performance compared to existing methods. We also demonstrate improved performance on real blurred images on the FRGC 1.0 face database. Furthermore, we show and explain how combining the proposed facial deblur inference with the local phase quantization (LPQ) method can further enhance the performance.

12.
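Sparse representation has achieved excellent results in face recognition, but its performance degrades severely when only a single sample per person is available. To improve the applicability of sparse representation under the single-sample condition, a robust sparse representation algorithm for single-sample face recognition (RSR) is proposed. A set of position images is created from each face image to augment the training samples of each subject, and an L2,1-norm constraint ensures that RSR selects position images of the correct subject. Evaluation on the AR and Extended Yale B face databases shows that RSR effectively handles face images with occlusion or illumination changes, achieves good single-sample face recognition accuracy, and is highly robust.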
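
The L2,1 norm named in the abstract above sums the Euclidean norms of a matrix's rows (some papers sum over columns instead; the row convention here is an assumption, as the abstract does not say which is used). Penalizing it pushes whole rows of a coefficient matrix to zero together, which is the usual reason it is chosen for group selection. A minimal sketch with a hypothetical function name and toy matrix:

```python
import math


def l21_norm(matrix):
    """Sum over rows of each row's Euclidean (L2) norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


# Toy coefficient matrix: the all-zero middle row contributes nothing,
# so concentrating weight in few rows keeps the norm small.
coefficients = [
    [3.0, 4.0],
    [0.0, 0.0],
    [5.0, 12.0],
]
value = l21_norm(coefficients)
```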

13.
A prototype reduction algorithm is proposed, which simultaneously trains both a reduced set of prototypes and a suitable local metric for these prototypes. Starting with an initial selection of a small number of prototypes, it iteratively adjusts both the position (features) of these prototypes and the corresponding local-metric weights. The resulting prototypes/metric combination minimizes a suitable estimation of the classification error probability. Good performance of this algorithm is assessed through experiments with a number of benchmark data sets and with a real task consisting of the verification of images of human faces.

14.
Automatic image annotation is a significant and challenging problem in pattern recognition and computer vision. Current image annotation models use almost all the training images to estimate joint generation probabilities between images and keywords, which inevitably brings in many irrelevant images. To solve this problem, we propose a hierarchical image annotation model which combines the advantages of a discriminative model and a generative model. In the first annotation layer, a discriminative model is used to assign topic annotations to unlabeled images, and then the relevant image set corresponding to each unlabeled image is obtained. In the second annotation layer, we propose a keywords-oriented method to establish links between images and keywords, and then our iterative algorithm is used to expand the relevant image sets. Candidate labels are given higher weights by our method based on visual keywords. Finally, a generative model is used to assign detailed annotations to unlabeled images on the expanded relevant image sets. Experiments conducted on the Corel 5K dataset verify the effectiveness of our hierarchical image annotation model.

15.
This paper presents methods of modeling and predicting face recognition (FR) system performance based on analysis of similarity scores. We define the performance of an FR system as its recognition accuracy, and consider the intrinsic and extrinsic factors affecting its performance. The intrinsic factors of an FR system include the gallery images, the FR algorithm, and the tuning parameters. The extrinsic factors include mainly query image conditions. For performance modeling, we propose the concept of "perfect recognition", based on which a performance metric is extracted from perfect recognition similarity scores (PRSS) to relate the performance of an FR system to its intrinsic factors. The PRSS performance metric allows tuning FR algorithm parameters offline for near optimal performance. In addition, the performance metric extracted from query images is used to adjust face alignment parameters online for improved performance. For online prediction of the performance of an FR system on query images, features are extracted from the actual recognition similarity scores and their corresponding PRSS. Using such features, we can predict online if an individual query image can be correctly matched by the FR system, based on which we can reduce the incorrect match rates. Experimental results demonstrate that the performance of an FR system can be significantly improved using the presented methods.

16.
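Face recognition method based on support vector machines   Total citations: 8, self-citations: 0, citations by others: 8
1. Introduction. The human face is a common pattern in human vision, and face recognition has broad application prospects in security verification systems, public security (criminal identification, etc.), medicine, video conferencing, traffic volume control, and other areas. Existing biometric recognition technologies, including speech recognition, iris recognition, and fingerprint recognition, are already in commercial use. The most appealing, however, is face recognition, because from the perspective of human-computer interaction it best matches people's ideal. Although humans can recognize faces and their expressions effortlessly, automatic machine recognition of faces remains a challenging research area. Because of the complexity of facial structure, the diversity of facial expressions, and the imaging…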

17.
Face recognition based on image sets has attracted much attention due to its promising ability to cope with many kinds of variation. Recently, classifiers of regularized nearest points, including sparse approximated nearest points (SANP), regularized nearest points (RNP) and collaborative regularized nearest points (CRNP), have achieved state-of-the-art performance for image set based face recognition. From a query set and a single-class gallery set, SANP and RNP both generate a pair of nearest points, between which the distance is regarded as the between-set distance. However, the computation of nearest points for each single-class gallery set in SANP and RNP ignores collaboration and competition with other classes, which may cause a wrong-class gallery set to have a small between-set distance. CRNP uses collaborative representation to overcome this shortcoming but does not explicitly minimize the between-set distance. In order to solve these issues and fully exploit the advantages of nearest points based approaches, in this paper a novel joint regularized nearest points (JRNP) method is proposed for face recognition based on image sets. In JRNP, the nearest point in the query set is generated by considering the entire gallery set of all classes; at the same time, JRNP explicitly minimizes the between-set distance of the query set and a single-class gallery set. Furthermore, we propose greedy JRNP and adaptive JRNP algorithms to solve the presented model, and the classification is then based on the joint distance between the regularized nearest points in image sets. Extensive experiments were conducted on benchmark databases (e.g., Honda/UCSD, CMU Mobo, YouTube Celebrities databases, and the large-scale YouTube Face datasets). The experimental results clearly show that our JRNP leads the performance in face recognition based on image sets.

18.
How to organize and retrieve images is now a great challenge in various domains. Image clustering is a key tool in some practical applications including image retrieval and understanding. Traditional image clustering algorithms consider a single set of features and use ad hoc distance functions, such as Euclidean distance, to measure the similarity between samples. However, multi-modal features can be extracted from images, and the dimension of multi-modal data is very high. In addition, we usually have several, but not many, labeled images, which leads to semi-supervised learning. In this paper, we propose a framework of image clustering based on semi-supervised distance learning and multi-modal information. First we fuse multiple features and utilize a small number of labeled images for semi-supervised metric learning. Then we compute similarity with the Gaussian similarity function and the learned metric. Finally, we construct a semi-supervised Laplace matrix for spectral clustering and propose an effective clustering method. Extensive experiments on some image data sets show the competitive performance of the proposed algorithm.

19.
Recognizing face images across pose is one of the challenging tasks for reliable face recognition. This paper presents a new method to tackle this challenge based on orthogonal discriminant vector (ODV). The result of our theoretical analysis shows that an individual’s probe image captured with a new pose can be represented by a linear combination of his/her gallery images. Based on this observation, in contrast to the conventional methods which model face images of different individuals on a single manifold, we propose to model face images of different individuals on different linear manifolds. The contributions of our approach include: (1) to prove that the orthogonality to ODVs is a pose-invariant feature; (2) to categorize each person with a set of ODVs, where his/her face images possess zero projections while other persons’ images are characterized by maximum projections; (3) to define a metric to measure the distance between a face image and an ODV, and classify the face images based on this metric. Our experimental results validate the feasibility of modeling the face images of different individuals on different linear manifolds. The proposed method achieves higher accuracy on face recognition and verification than the existing techniques.

20.
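刘博, 景丽萍, 于剑. 《软件学报》(Journal of Software), 2017, 28(8): 2113-2125
With the rapid development of video capture and network transmission technology and the widespread use of personal mobile devices, large amounts of image data exist in the form of sets. Because of the complexity of the intrinsic structure of sets, a key problem in image set classification is how to measure the distance between sets. To address this problem, this paper proposes a dual sparse regularized image set distance learning framework (DSRID). In this framework, the distance between two sets is modeled as the distance between their corresponding internal representative substructures, which ensures that the measure is robust and discriminative. For different set representations, the paper gives implementations in the traditional Euclidean space and on two common manifolds, namely the symmetric positive definite matrices manifold (SPD manifold) and the Grassmann manifold. The effectiveness of the framework is verified on a series of set-based face recognition, action recognition, and object classification tasks.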
