Similar Documents
20 similar documents found (search time: 15 ms)
1.
Feature pooling is a key component of modern visual classification systems. However, the two prevailing pooling techniques, average and max pooling, are not theoretically optimal, owing to the unrecoverable loss of spatial information during statistical summarization and the over-simplified assumptions they make about the feature distribution. Addressing these issues, this paper generalizes previous pooling methods to a weighted p-norm spatial pooling function tailored to the class-specific spatial distribution of features. Optimizing this pooling function for discriminative class separability, subject to a spatial smoothness constraint, yields the geometric lp-norm pooling (GLP) method. Furthermore, to handle variation in object scale and position, which affects both the learning of discriminative pooling weights and the applicability of the learned weights, we propose a simple yet effective self-alignment step, applied during both learning and testing, that adaptively adjusts the pooling weights for each image. Image segmentation and a visual saliency map are used to construct a directed pixel adjacency graph; the discriminative pooling weights are diffused over this graph by a random walk and thereby propagated onto the salient foreground region. This yields a robust version of GLP (RGLP) that can cope with misalignment of object position and scale in images. Comprehensive experiments validate the effectiveness of the proposed GLP feature pooling framework: the random-walk-based self-alignment step effectively alleviates image misalignment and further boosts classification accuracy, and state-of-the-art image classification and action recognition performance is attained on several benchmarks.
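The weighted p-norm pooling at the heart of GLP can be sketched as follows; the feature responses, weights, and value of p below are illustrative assumptions, not the quantities learned in the paper.

```python
def weighted_p_norm_pool(features, weights, p):
    """Pool spatial feature responses into one value.

    Computes (sum_i w_i * x_i**p) ** (1/p).  With uniform weights,
    p = 1 recovers average pooling and large p approaches max pooling.
    Weights are assumed non-negative and summing to 1.
    """
    assert len(features) == len(weights)
    return sum(w * x ** p for w, x in zip(weights, features)) ** (1.0 / p)

# Illustrative responses over a 4-cell spatial grid.
x = [0.2, 0.9, 0.4, 0.1]
uniform = [0.25] * 4

avg = weighted_p_norm_pool(x, uniform, p=1)        # plain average pooling
near_max = weighted_p_norm_pool(x, uniform, p=50)  # approaches max(x) = 0.9
```

Learning non-uniform weights, as GLP does, lets the pool emphasize spatial cells where a given class tends to fire.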

2.
In recent years, sparse representation-based classification (SRC) has made great progress in face recognition (FR). However, SRC overemphasizes the sparsity of noise, which does not hold in real-world conditions. In this paper, we propose a robust \(l_{2,1}\)-norm sparse representation framework that penalizes noise with the \(l_{2,1}\)-norm, which combines the discriminative nature of the \(l_1\)-norm with the holistic representation of the \(l_2\)-norm. In addition, we constrain the coefficient matrix with the nuclear norm. Motivated by the Fisher criterion, we propose a supervised Fisher discriminant-based \(l_{2,1}\)-norm sparse representation method for FR that exploits the within-class scatter and between-class scatter when label information is available. We show that the model provides stronger discriminative power than classical sparse representation models, can be solved by the alternating direction method of multipliers, and is robust to contiguous occlusion noise. Extensive experiments demonstrate that our method achieves significantly better results than SRC and other sparse representation methods for FR when large regions are contiguously occluded.
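The \(l_{2,1}\)-norm used here as the noise penalty is the sum of the Euclidean norms of a matrix's rows, so it encourages whole rows to vanish while staying smooth within a row. A minimal sketch (the matrix below is made up for illustration):

```python
import math

def l21_norm(M):
    """l_{2,1} norm: sum over rows of each row's l2 norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in M)

# Row 0 is "noise-free"; row 1 carries contiguous noise.
E = [[0.0, 0.0, 0.0],
     [3.0, 4.0, 0.0]]
print(l21_norm(E))  # 0 + sqrt(9 + 16) = 5.0
```

Because an entirely zero row contributes nothing, minimizing this norm drives the noise matrix toward row-sparse structure, matching contiguous occlusions that affect a block of pixels.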

3.
In this paper, we propose a fast face recognition method: L1/2-regularized sparse representation with hierarchical feature selection. Hierarchical feature selection compresses both the scale and the dimension of the global dictionary, which directly reduces the computational cost of the sparse representation our approach is built on. It combines Gabor wavelets and an extreme learning machine auto-encoder (ELM-AE) hierarchically. The Gabor stage extracts local features at multiple scales and orientations to form a Gabor-feature-based image, which in turn improves the recognition rate; moreover, for occluded face images, the scale of the Gabor-feature-based global dictionary can be compressed because the Gabor-feature-based occlusion dictionary contains redundancies. The ELM-AE stage compresses the dimension of the global dictionary, since high-dimensional face images can be rapidly represented by low-dimensional features. By introducing L1/2 regularization, our approach produces sparser and more robust representations than L1-regularized sparse representation-based classification (SRC), further reducing the computational cost. Experimental results on a variety of face databases demonstrate a large computational advantage over related work such as SRC and Gabor-feature-based SRC, with comparable or even better recognition rates.
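Why L1/2 regularization yields sparser solutions than L1 can be seen by comparing the penalties ||x||_p^p = Σ|x_i|^p: under p = 1/2, spreading the same mass over many small entries is penalized much more heavily than concentrating it in one. A toy comparison (vectors chosen for illustration):

```python
def lp_penalty(x, p):
    """Sum of |x_i| ** p, the penalty term of Lp-regularized coding."""
    return sum(abs(v) ** p for v in x)

dense = [0.25, 0.25, 0.25, 0.25]   # energy spread thinly
sparse = [1.0, 0.0, 0.0, 0.0]      # same l1 mass, one nonzero

p1_dense, p1_sparse = lp_penalty(dense, 1), lp_penalty(sparse, 1)
ph_dense, ph_sparse = lp_penalty(dense, 0.5), lp_penalty(sparse, 0.5)
# Under p = 1 both cost 1.0, but under p = 1/2 the dense vector
# costs 2.0 versus 1.0, so the optimizer prefers the sparse one.
```

This bias toward few large coefficients is exactly what the method exploits to shrink the effective dictionary.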

4.
This paper presents a Gaussian sparse representation cooperative model for tracking a target through heavy occlusion by combining sparse coding with locality-constrained linear coding. Instead of the usual l1-norm regularization term used within a particle-filter framework to form the sparse collaborative appearance model (SCM), we employ both the l1-norm and the l2-norm for feature selection and then encode the candidate samples to generate the sparse coefficients. Consequently, our method not only obtains sparse solutions easily but also reduces the reconstruction error. Extensive tracking experiments comparing our algorithm with various other target tracking methods show that it outperforms state-of-the-art algorithms on video sequences with heavy occlusion.
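Sparse coefficients under an l1 penalty are typically obtained with the soft-thresholding (shrinkage) operator, the proximal map of λ|x|; a minimal sketch of that building block, independent of the tracker itself and with made-up coefficient values:

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

coeffs = [0.05, -0.8, 0.3, -0.02]
sparse_coeffs = [soft_threshold(c, 0.1) for c in coeffs]
# Entries below the threshold are zeroed; the rest shrink by 0.1.
```

Combining such an l1 step with an l2 (ridge-like) term is what gives the collaborative model both sparsity and stable reconstruction.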

5.
This work presents a novel dictionary learning method based on l2-norm regularization that learns a dictionary better suited to face recognition. By optimizing the reconstruction error of each class using only the dictionary atoms associated with that class, we learn a structured dictionary whose per-class reconstruction errors are more discriminative for classification. Moreover, to make the coding coefficients over the learned dictionary discriminative, we incorporate a discriminative term that is bilinear in the training samples and the coding coefficients. This bilinear term essentially solves a linear regression problem, in a Reproducing Kernel Hilbert Space (RKHS), over patterns formed by concatenating the training samples with their coding coefficients; a novel classifier based on the bilinear discriminative model is also proposed. Experimental results on the AR, CMU PIE, CAS-PEAL-R1, and Sheffield (formerly UMIST) face databases show that the proposed method is robust to expression, lighting, and pose variations in face recognition as well as in gender classification, compared with recently proposed face recognition and dictionary learning methods.
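Classification by per-class reconstruction error, as used above, assigns a test sample to the class whose sub-dictionary reconstructs it best. A minimal sketch with made-up one-atom sub-dictionaries (not the learned structured dictionary from the paper):

```python
import numpy as np

def classify_by_residual(y, class_dicts):
    """Index of the class whose dictionary gives the smallest
    least-squares reconstruction error ||y - D_c a_c||_2."""
    residuals = []
    for D in class_dicts:
        a, *_ = np.linalg.lstsq(D, y, rcond=None)
        residuals.append(np.linalg.norm(y - D @ a))
    return int(np.argmin(residuals))

# Toy sub-dictionaries: class 0 spans the x-axis, class 1 the y-axis.
D0 = np.array([[1.0], [0.0]])
D1 = np.array([[0.0], [1.0]])
y = np.array([0.9, 0.1])   # almost on the x-axis
label = classify_by_residual(y, [D0, D1])  # -> 0
```

The structured learning in the paper sharpens exactly this decision rule by making the per-class residuals more separated.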

6.
Electrocardiogram (ECG) biometric recognition has emerged as a hot research topic in the past decade. Although promising results have been reported, especially with sparse representation learning (SRL) and deep neural networks, robust identification on small-scale data remains a challenge. To address this issue, we integrate SRL into a deep cascade model and propose a multi-scale deep cascade bi-forest (MDCBF) model for ECG biometric recognition. We design a bi-forest-based feature generator that fuses L1-norm sparsity with L2-norm collaborative representation to deal with noise efficiently. We then propose a deep cascade framework comprising multi-scale signal coding and deep cascade coding. In the former, an adaptive weighted pooling operation fully exploits the discriminative information of low-noise segments; in the latter, level-wise class coding without backpropagation mines more discriminative features. Extensive experiments on four small-scale ECG databases demonstrate that the proposed method performs competitively with state-of-the-art methods.

7.
Nonnegative matrix factorization (NMF) has been widely applied in recent years. Its nonnegativity constraints yield parts-based, sparse representations that can be more robust than global, non-sparse features. However, existing techniques cannot precisely control the degree of sparseness. To address this issue, we present a unified criterion, Nonnegative Matrix Factorization by Joint Locality-constrained and l2,1-norm Regularization (NMF2L), designed to perform nonnegative matrix factorization and enforce a locality constraint simultaneously while obtaining row sparsity. We reformulate the nonnegative local coordinate factorization problem and apply the l2,1-norm to the coefficient matrix to obtain row sparsity, which selects relevant features. An efficient updating rule is proposed, and its convergence is theoretically guaranteed. Experiments on benchmark face datasets demonstrate the effectiveness of our method in comparison with state-of-the-art methods.
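Plain NMF is commonly fitted with Lee-Seung multiplicative updates; the locality and l2,1 terms of NMF2L modify these rules, but the base iteration (shown here without those terms, on random data) is:

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9):
    """Factor V ≈ W @ H with W, H >= 0 via multiplicative updates.

    Each update multiplies by a nonnegative ratio, so nonnegativity
    is preserved automatically and the Frobenius error is non-increasing.
    """
    rng = np.random.default_rng(0)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.random.default_rng(1).random((6, 5))  # toy nonnegative data
W, H = nmf(V, k=3)
err = np.linalg.norm(V - W @ H)
```

NMF2L would add a weighted-graph locality penalty and an l2,1 term on H to these updates; the paper's rules reduce to the above when both weights are zero.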

8.
Dictionary learning plays a key role in image representation for classification. A multi-modal dictionary is usually learned from feature samples across different classes and shared in the feature encoding process. Ideally, each atom in the dictionary corresponds to a single class of images, while each class of images corresponds to a certain group of atoms, and image features are encoded as linear combinations of atoms selected from the dictionary. We propose to use the elastic net as the regularizer for atom selection in feature coding and the associated dictionary learning, which not only retains sparsity similar to the l1 penalty but also encourages a grouping effect that improves the image representation. Image classification results on benchmark datasets show that the dictionary learned in the proposed way outperforms state-of-the-art dictionary learning algorithms.
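The elastic-net regularizer λ1|x| + (λ2/2)x² has a simple coordinate-wise proximal map: soft-threshold by λ1, then shrink by 1/(1 + λ2). A small sketch with illustrative penalty values:

```python
def elastic_net_prox(v, lam1, lam2):
    """Proximal operator of lam1*|x| + (lam2/2)*x**2.

    Solving d/dx [ (x - v)**2 / 2 + lam1*|x| + lam2*x**2 / 2 ] = 0
    gives x = soft(v, lam1) / (1 + lam2).
    """
    shrunk = max(abs(v) - lam1, 0.0)
    return (shrunk / (1.0 + lam2)) * (1.0 if v >= 0 else -1.0)

codes = [elastic_net_prox(v, lam1=0.1, lam2=0.5) for v in [0.05, 1.6, -0.7]]
# Small entries are zeroed (sparsity); surviving entries are also
# uniformly shrunk (the l2 part), which produces the grouping effect.
```

The l2 part is what lets correlated atoms enter the code together instead of the l1 penalty arbitrarily picking one of them.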

9.
A frontier estimation method for a set of points in the plane is proposed that is optimal in the L1-norm over a given class of β-Hölder boundary functions with β ∈ (0, 1]. The estimator is defined as a sufficiently regular linear combination of kernel functions centered at the sample points that covers all of these points and whose associated support has minimal area. The weights of the linear combination are computed by solving the associated linear programming problem. The L1-norm of the estimation error is shown to converge to zero with probability one, at the optimal rate of convergence.

10.
Objective: Face images captured in the real world are typically affected by illumination, occlusion, and other environmental factors, so images of the same class exhibit varying degrees of difference while images of different classes exhibit varying degrees of similarity, which greatly degrades face recognition accuracy. To address this, a face recognition algorithm based on discriminative structured low-rank dictionary learning, built on low-rank matrix recovery theory, is proposed. Method: Using the label information of the training samples, low-rank regularization and structured sparsity are simultaneously imposed on the learned discriminative dictionary. During dictionary learning, the sample reconstruction error first constrains the relation between the samples and the dictionary; the Fisher criterion is then applied to the sparse coding stage so that the coding coefficients become discriminative; because noise in the training samples degrades the dictionary's discriminability, low-rank regularization is applied to dictionary learning on the basis of low-rank matrix recovery theory; structured sparsity is then added so that structural information is preserved, guaranteeing optimal classification of the samples; finally, test samples are classified by reconstruction error. Results: Experiments were conducted on the AR and ORL face databases. On AR, to analyze the effect of sample dimensionality, six first-session images per person were selected as training samples: one with scarf occlusion, two with sunglasses occlusion, and three (unoccluded) with expression and illumination variation; the same combination was used for testing. For every method, higher image dimensionality yields a higher recognition rate. Comparing SRC (sparse representation based classification) with DKSVD (discriminative K-means singular value decomposition) shows that dictionary learning in DKSVD mitigates the effect of uncertainty in the training samples; comparing DLRD_SR (discriminative low-rank dictionary learning for sparse representation) with FDDL (Fisher discriminative dictionary learning) shows that when occlusion or other noise is present, a low-rank dictionary improves the recognition rate by at least 5.8%; comparing the proposed algorithm with DLRD_SR shows that adding the Fisher criterion to dictionary learning markedly improves recognition, while the ideal sparse code guarantees optimal classification. At 500 dimensions, the recognition rate reaches 85.2% under scarf and sunglasses occlusion, which cover roughly 40% and 20% of the face, respectively. To verify effectiveness under different expressions, illumination, and occlusion, experiments were run on several training-sample combinations; in every combination the proposed algorithm has a clear advantage on occluded samples. With training samples containing only expression variation, illumination variation, and sunglasses occlusion, it exceeds the other algorithms by at least 2.7%; with scarf occlusion instead of sunglasses, by at least 3.6%; with both occlusion types, by at least 1.9%. On ORL, the recognition rate reaches 95.2% without occlusion, slightly below FDDL; with 20% random block occlusion, the proposed algorithm achieves the highest recognition rate among SRC, DKSVD, FDDL, and DLRD_SR; at 50% occlusion all recognition rates are low, but the proposed algorithm's remains the highest. Conclusion: The algorithm is robust when face images are affected by occlusion and other factors, and the experimental results confirm its feasibility for face recognition.
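Low-rank regularization of this kind typically reduces to nuclear-norm minimization, whose proximal operator is singular value thresholding (SVT): shrink each singular value by τ and zero the rest. A generic sketch (the matrix and τ are illustrative, not the paper's data):

```python
import numpy as np

def svt(M, tau):
    """Proximal operator of tau * (nuclear norm): soft-threshold
    the singular values of M, zeroing those below tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

# A rank-1 matrix plus a small perturbation: SVT removes the
# perturbation's tiny singular value and recovers a rank-1 matrix.
A = np.outer([1.0, 2.0], [3.0, 4.0]) + 0.01 * np.eye(2)
L = svt(A, tau=0.5)
rank = np.linalg.matrix_rank(L, tol=1e-6)  # 1
```

In dictionary learning, applying this shrinkage to each class sub-dictionary suppresses sample noise while keeping the dominant class structure.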

11.
We study several embeddings of doubling metrics into low-dimensional normed spaces, in particular into l2 and l∞. Doubling metrics are a robust class of metric spaces that have low intrinsic dimension and often occur in applications. Understanding the dimension required for a concise representation of such metrics is a fundamental open problem in the area of metric embedding. Here we show that the n-vertex Laakso graph can be embedded into constant-dimensional l2 with the best possible distortion, which has implications for possible approaches to the above problem.

12.
Objective: Contaminated face image samples severely interfere with recognition, and with few training samples (the small-sample case), erroneous sparse coefficients cause a sharp drop in performance. To address these problems, a superposed linear sparse representation algorithm based on discriminative non-convex low-rank matrix decomposition is proposed. Method: First, the γ-norm replaces the conventional nuclear norm, overcoming the recognition error caused by proportional shrinkage of the matrix singular values when traditional low-rank decomposition methods minimize the nuclear norm; second, a structural-incoherence discriminative term is introduced to increase the incoherence between the low-rank dictionaries of different classes, suppressing within-class variation and removing between-class correlation; finally, classification is completed by superposed linear sparse representation. Results: On the AR face database the algorithm reaches a recognition rate of 98.67 ± 0.57%, higher than SRC (sparse representation-based classification), ESRC (extended SRC), RPCA (robust principal component analysis) + SRC, LRSI (low rank matrix decomposition with structural incoherence), and SLRC (superposed linear representation based classification)-l1. Occlusion experiments show that the algorithm is more robust to occluded images, achieving the highest recognition rate at every occlusion ratio. On the CMU PIE face database, with 0%, 10%, 20%, 30%, and 40% salt-and-pepper noise added to unoccluded images, the recognition rates reach 90.1%, 85.5%, 77.8%, 65.3%, and 46.1%, respectively, all higher than the other algorithms. Conclusion: Experiments across face databases, occlusion ratios, and noise levels show that the proposed algorithm maintains a high recognition rate under occlusion, expression, illumination, and other noise factors, demonstrating stronger robustness.

13.
We consider the problem of estimating a signal against a background of white noise when the information about the signal takes the form of numerical characteristics, such as constraints on the variance of the signal itself and on the variances of some of its derivatives. We propose a method that solves this problem by filtering in the time domain, minimizing a functional that combines the H2-norm of the transfer function from the measurement noise to the estimation error with the H∞-norm of the transfer function from the generating noise to the estimation error.

14.
The goal of blind image deblurring is to estimate the blur kernel and restore the sharp latent image from a single blurred input. This paper proposes a novel blind deblurring algorithm based on L0 regularization and kernel shape optimization. First, the objective function of the optimization model is formulated with L0-norm terms on both the gradient and the intensity of the kernel, which yields good sparsity and less noise in the estimated kernel. Second, a coarse-to-fine iterative framework implicitly estimates reliable salient image structures, reducing computation and accelerating convergence. Finally, the kernel shape is optimized by a weighting scheme that brings the estimated kernel closer to the ground truth. Experimental results on public benchmark datasets demonstrate that the restored images are sharp with fewer ringing artifacts.

15.
In negation-limited complexity, one considers circuits with a limited number of NOT gates, motivated by the gap in our understanding of monotone versus general circuit complexity and by the hope of better understanding the power of NOT gates. We give improved lower bounds for the size (the number of AND/OR/NOT gates) of negation-limited circuits computing Parity and for the size of negation-limited inverters. An inverter is a circuit with inputs x1, …, xn and outputs ¬x1, …, ¬xn. We show that: (a) for n = 2^r − 1, circuits computing Parity with r − 1 NOT gates have size at least 6n − log2(n+1) − O(1), and (b) for n = 2^r − 1, inverters with r NOT gates have size at least 8n − log2(n+1) − O(1). We derive these bounds by considering the minimum size of a circuit with at most r NOT gates that computes Parity for sorted inputs x1 ≤ ⋯ ≤ xn. For arbitrary r, we determine this minimum size completely: it is 2n − r − 2 for odd n and 2n − r − 1 for even n when ⌈log2(n+1)⌉ − 1 ≤ r ≤ n/2, and it is ⌊3n/2⌋ − 1 when r ≥ n/2. We also determine the minimum size of an inverter for sorted inputs with at most r NOT gates: it is 4n − 3r for ⌈log2(n+1)⌉ ≤ r ≤ n. In particular, the negation-limited inverter for sorted inputs due to Fischer, a core component of all known constructions of negation-limited inverters, is shown to have the minimum possible size. Our fairly simple lower-bound proofs use gate elimination arguments in a somewhat novel way.

16.
Recently, sparse subspace clustering, a subspace learning technique, has been successfully applied to several computer vision applications, e.g. face clustering and motion segmentation. Its main idea is to learn an effective sparse representation that is used to construct an affinity matrix for spectral clustering. While most existing sparse subspace clustering algorithms and their extensions rely on convex relaxations, the non-convex, non-smooth lq (0 < q < 1) norm has demonstrated better recovery performance. In this paper we propose an lq-norm-based Sparse Subspace Clustering method (lqSSC), motivated by recent work showing that the lq norm enhances sparsity and approximates l0 better than l1. However, optimizing the lq norm under multiple constraints is difficult. To solve this non-convex problem, we use the Alternating Direction Method of Multipliers (ADMM), updating the variables by alternating minimization: ADMM splits the unconstrained optimization into multiple terms, so that the lq-norm term can be solved via Smooth Iterative Reweighted Least Squares (SIRLS), which converges with a guarantee. Unlike traditional IRLS algorithms, the proposed algorithm is based on gradient descent with adaptive weights, making it well suited to general sparse subspace clustering problems. Experiments on computer vision tasks (synthetic data, face clustering, and motion segmentation) demonstrate that the proposed approach achieves considerably better clustering accuracy than convex subspace clustering methods.
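The SIRLS inner step replaces the non-smooth lq penalty Σ|x_i|^q with a weighted l2 surrogate whose weights w_i = (x_i² + ε)^(q/2 − 1) are recomputed at each iteration; near-zero entries receive large weights and are pushed further toward zero. A sketch of one reweighting step with assumed values:

```python
def irls_weights(x, q, eps):
    """Weights of the smoothed l2 surrogate for the lq penalty:
    w_i = (x_i**2 + eps) ** (q/2 - 1), with eps avoiding division
    blow-up at exact zeros."""
    return [(v * v + eps) ** (q / 2.0 - 1.0) for v in x]

x = [1.0, 0.01]
w = irls_weights(x, q=0.5, eps=1e-8)
# The near-zero entry gets a weight orders of magnitude larger, so
# the next weighted least-squares solve suppresses it even further.
```

Iterating "solve weighted least squares, then refresh the weights" is what drives the representation toward lq-sparse solutions without ever differentiating |x|^q directly.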

17.
Sparse representation is a mathematical model for data representation that has proved to be a powerful tool for solving problems in various fields such as pattern recognition, machine learning, and computer vision. As one of the building blocks of sparse representation methods, dictionary learning plays an important role in minimizing the reconstruction error between the original signal and its sparse representation over the learned dictionary. Although using training samples directly as dictionary bases can achieve good performance, this may produce a very large, inefficient dictionary owing to noisy training instances. To obtain a smaller and more representative dictionary, in this paper we propose Laplacian sparse dictionary (LSD) learning. Our method is based on manifold learning and double sparsity: we incorporate a Laplacian weighted graph into the sparse representation model and impose l1-norm sparsity on the dictionary. An LSD is a sparse overcomplete dictionary that preserves the intrinsic structure of the data and learns a smaller dictionary for each class, and it integrates easily into a classification framework based on sparse representation. We compare the proposed method with other methods on three controlled benchmark face image databases, Extended Yale B, ORL, and AR, and one uncontrolled person image dataset, i-LIDS-MA. Results show the advantages of the proposed LSD algorithm over state-of-the-art sparse representation based classification methods.
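The Laplacian weighted graph enters such models through the matrix L = D − W, where W is a pairwise affinity matrix (made up below) and D its diagonal degree matrix; the quadratic form built from L penalizes codes that differ across strongly connected samples. A minimal construction:

```python
def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric
    affinity matrix W given as a list of lists."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]

# Tiny 3-node affinity graph: node 1 is strongly tied to node 2.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
L = graph_laplacian(W)
# Diagonal entries are node degrees; every row sums to zero.
```

Minimizing a term like trace(Xᵀ L X) over the codes X then keeps the coding of neighboring samples close, which is how the manifold structure is preserved.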

18.
This paper introduces α-systems of differential inclusions on a bounded time interval [t0, ϑ] and defines α-weakly invariant sets in [t0, ϑ] × ℝ^n, where ℝ^n is the phase space of the differential inclusions. We study problems connected with bringing the motions (trajectories) of the differential inclusions of an α-system to a given compact set M ⊂ ℝ^n at the moment ϑ (the approach problems). The issues of extracting the solvability set W ⊂ [t0, ϑ] × ℝ^n in the problem of bringing the motions of an α-system to M, and of calculating the maximal α-weakly invariant set Wc ⊂ [t0, ϑ] × ℝ^n, are also discussed. The notion of the quasi-Hamiltonian of an α-system (the α-Hamiltonian) is proposed, which seems important for the problems of bringing the motions of the α-system to M.

19.
This article presents three new methods (M5, M6, M7) for estimating an unknown map projection and its parameters. Such an analysis is beneficial for historic, old, or current maps lacking information about their map projection, and it can improve their georeferencing. The location similarity approach takes into account the residuals on the corresponding features; the minimum is found by non-linear least squares. In the shape similarity approach, the minimized objective function takes into account the spatial distribution of the features together with the shapes of the meridians, parallels, and other 0D-2D elements; owing to its non-convexity and discontinuity, its global minimum is found by global optimization, represented here by differential evolution. The constant values of the projection φk, λk, φ1, λ0 and the map constants R, X, Y, α (with respect to which the methods are invariant) are estimated. All methods are compared, and results are presented for synthetic data as well as for 8 early maps from the Map Collection of Charles University and the David Rumsey Map Collection. The proposed algorithms have been implemented in the new version of the detectproj software.

20.
Let Z/(p^e) be the integer residue ring modulo p^e, with p an odd prime and e ≥ 2. We consider the s-uniform property of compressing sequences derived from primitive sequences over Z/(p^e). We give necessary and sufficient conditions for two compressing sequences to be s-uniform with α, provided the compressing map has the form φ(x0, x1, …, x_{e−1}) = g(x_{e−1}) + η(x0, x1, …, x_{e−2}), where g(x_{e−1}) is a permutation polynomial over Z/(p) and η is an (e − 1)-variable polynomial over Z/(p).
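Whether g is a permutation polynomial over Z/(p), i.e. whether x ↦ g(x) mod p is a bijection on {0, …, p−1}, can be checked directly for small p; the polynomials and modulus below are illustrative, not from the paper:

```python
def is_permutation_polynomial(g, p):
    """True iff x -> g(x) mod p is a bijection on {0, ..., p-1}."""
    return len({g(x) % p for x in range(p)}) == p

p = 7
# x**k permutes Z/(p) iff gcd(k, p - 1) = 1, so x**5 works mod 7
# (gcd(5, 6) = 1) while x**2 does not (squares collide: 1 and 6
# both map to 1).
cubic_free = is_permutation_polynomial(lambda x: x ** 5, p)   # True
square = is_permutation_polynomial(lambda x: x ** 2, p)       # False
```

The same brute-force check extends to any candidate g when p is small enough to enumerate.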


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号