Similar Articles
 20 similar articles found (search time: 15 ms)
1.
Recent research has shown that sparse representation based techniques can achieve state-of-the-art super-resolution image reconstruction (SRIR) results. They rely on the idea that low-resolution (LR) image patches can be regarded as downsampled versions of high-resolution (HR) image patches, which are assumed to have a sparse representation with respect to a dictionary of prototype patches. In order to avoid a large database of training patches and to recover HR images more accurately, this paper introduces the concept of example-aided redundant dictionary learning into single-image super-resolution reconstruction and proposes a multiple-dictionary learning scheme inspired by multitask learning. Compact redundant dictionaries are learned from samples classified by K-means clustering, so that each sample is assigned a more appropriate dictionary for image reconstruction. Compared with existing SRIR methods, the proposed method has the following characteristics: (1) it introduces example-patch-aided dictionary learning into sparse representation based SRIR, reducing the heavy computational cost caused by an enormous dictionary; (2) it uses multitask learning and priors from HR example images to reconstruct similar HR images, yielding better reconstruction results; and (3) it adopts offline dictionary learning and online reconstruction, making rapid reconstruction possible. Experiments on natural images show that a small set of randomly chosen raw patches from training images and a small number of atoms can produce good reconstruction results. Both the visual results and the numerical metrics demonstrate its superiority over several state-of-the-art SRIR methods.
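The clustering step behind the multiple-dictionary scheme above can be sketched as follows. This is a minimal illustration, not the paper's implementation: a plain K-means (the cluster count, patch size and iteration budget are arbitrary choices here) groups flattened patches, each cluster keeps its own small set of patches as a compact "dictionary", and a new patch would then be reconstructed only against the dictionary of its nearest cluster.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain K-means on the rows of X; returns centroids and labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each patch to its nearest centroid.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Toy "patches": 200 flattened 5x5 patches.
rng = np.random.default_rng(1)
patches = rng.standard_normal((200, 25))

centroids, labels = kmeans(patches, k=4)

# One compact per-cluster dictionary: the patches assigned to that cluster.
dictionaries = {j: patches[labels == j] for j in range(4)}

# A new patch is matched to a cluster and reconstructed only against that
# cluster's dictionary, which is far smaller than one global dictionary.
new_patch = rng.standard_normal(25)
j = ((centroids - new_patch) ** 2).sum(axis=1).argmin()
print(j, dictionaries[j].shape)
```

The point of the per-cluster split is that each sparse-coding problem is solved over a small, specialized dictionary instead of one enormous one.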

2.
Dimensionality reduction (DR) methods have commonly been used as a principled way to understand high-dimensional data such as face images. In this paper, we propose a new unsupervised DR method called sparsity preserving projections (SPP). Unlike many existing techniques such as locality preserving projections (LPP) and neighborhood preserving embedding (NPE), which preserve local neighborhood information during the DR procedure, SPP aims to preserve the sparse reconstructive relationship of the data, achieved by minimizing an L1 regularization-related objective function. The obtained projections are invariant to rotations, rescalings and translations of the data and, more importantly, contain natural discriminating information even when no class labels are provided. Moreover, SPP chooses its neighborhood automatically and hence is more convenient to use in practice than LPP and NPE. The feasibility and effectiveness of the proposed method are verified on three popular face databases (Yale, AR and Extended Yale B) with promising results.
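The sparse reconstructive weights at the heart of SPP come from an L1-regularized reconstruction of each sample over the remaining samples. A minimal sketch using a hand-rolled ISTA solver; the regularization weight `lam`, the iteration count and the toy data are illustrative assumptions, not values from the paper:

```python
import numpy as np

def ista_lasso(D, y, lam=0.1, iters=500):
    """Minimal ISTA solver for min_w 0.5*||y - D w||^2 + lam*||w||_1."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    w = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ w - y)
        z = w - grad / L
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return w

# Toy data: 10 samples in 30-D, stored as columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 10))

# SPP's graph: sample x_i is represented over all the *other* samples.
i = 0
D = np.delete(X, i, axis=1)
w = ista_lasso(D, X[:, i])
print(np.count_nonzero(np.abs(w) > 1e-6))  # inspect the sparsity of the weights
```

Stacking these weight vectors for every sample gives the sparse reconstructive relationship that the SPP projections are then chosen to preserve.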

3.
Sparse representation methods have been shown to outperform conventional face recognition (FR) methods and have been widely applied in recent years. A novel kernel-based sparse representation method (KBSRM) is proposed in this paper. To cope with the complex variations in face images caused by varying facial expression and pose, the KBSRM first uses a kernel-induced distance to determine the N nearest neighbors of the test sample among all training samples. In the second step, the KBSRM represents the test sample as a linear combination of these N nearest neighbors and performs classification based on the representation result. Since the N selected training samples are closer to the test sample than the rest, using them to represent the test sample makes the final classification more accurate. A number of FR experiments show that the KBSRM achieves better classification results than the algorithm in Xu et al. (Neural Comput Appl, doi:10.1007/s00521-012-0833-5).
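The two-step procedure above can be sketched as follows. This is a hedged illustration: the RBF kernel, its `gamma`, the neighbor count `N` and the toy data are assumptions, and a plain least-squares fit with per-class residual classification stands in for the paper's exact representation scheme:

```python
import numpy as np

def kernel_dist(x, y, gamma=0.5):
    """Kernel-induced distance for an RBF kernel:
    d(x,y)^2 = k(x,x) + k(y,y) - 2 k(x,y) = 2 - 2*exp(-gamma*||x-y||^2)."""
    return np.sqrt(2.0 - 2.0 * np.exp(-gamma * np.sum((x - y) ** 2)))

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 8))            # 20 training samples in 8-D
labels = np.repeat(np.arange(4), 5)         # 4 classes, 5 samples each
test = X[7] + 0.05 * rng.standard_normal(8) # near a class-1 training sample

# Step 1: N nearest neighbors under the kernel-induced distance.
N = 6
d = np.array([kernel_dist(test, x) for x in X])
nearest = np.argsort(d)[:N]

# Step 2: represent the test sample over the N neighbors, then classify
# by the smallest per-class reconstruction residual.
D = X[nearest].T                            # 8 x N
c, *_ = np.linalg.lstsq(D, test, rcond=None)
residuals = {}
for cls in np.unique(labels[nearest]):
    mask = labels[nearest] == cls
    residuals[cls] = np.linalg.norm(test - D[:, mask] @ c[mask])
pred = min(residuals, key=residuals.get)
print(pred)
```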

4.
Kernel sparse representation based classification   Cited: 5 (self-citations: 0, others: 5)
Sparse representation has attracted great attention in the past few years. The sparse representation based classification (SRC) algorithm was developed and successfully applied to classification. In this paper, a kernel sparse representation based classification (KSRC) algorithm is proposed. Samples are first mapped into a high-dimensional feature space, and SRC is then performed in this new feature space using the kernel trick. Since the samples in the high-dimensional feature space are not explicitly known, KSRC cannot be performed directly; to overcome this difficulty, we present a method for solving the sparse representation problem in the high-dimensional feature space. If an appropriate kernel is selected, a test sample in the high-dimensional feature space is more likely to be accurately represented as a linear combination of training samples of its own class. Therefore, KSRC has more powerful classification ability than SRC. Experiments on face recognition, palmprint recognition and finger-knuckle-print recognition demonstrate the effectiveness of KSRC.

5.
In existing face recognition based on sparse representation classification, a compact dictionary obtained through sparse learning can improve both recognition speed and accuracy. The metaface learning (MFL) algorithm, however, does not exploit the similarity among the sparse coding coefficients of samples from the same class during dictionary learning. To use this information to improve the discriminability of the dictionary, a coefficient-similarity-based metaface learning (CS-MFL) algorithm is proposed. In the CS-MFL learning process, a constraint enforcing similarity among the sparse coding coefficients of same-class training samples is added at the coefficient-update stage. To solve the resulting optimization problem with the coefficient-similarity constraint, the two l2-norm terms in the objective function are merged, converting the original problem into a standard l2-l1 problem. Experiments on several face databases show that the proposed CS-MFL achieves higher recognition rates than MFL, indicating that the dictionary learned by CS-MFL is more efficient and more discriminative.

6.
This paper studies the application of sparse representation to face recognition and proposes a non-negative sparse representation method and a sampling method for face recognition. A multiplicative iterative algorithm for non-negative sparse representation is derived, its connection with non-negative matrix factorization is analyzed, and a classification algorithm based on non-negative sparse representation is designed. Building on the affinity propagation algorithm, a sampling method for face datasets is proposed and evaluated on face image sets. Compared with ordinary sparse representation, non-negative sparse representation is superior in computational complexity and robustness; compared with random sampling, the proposed sampling method achieves higher recognition accuracy.
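The multiplicative iteration mentioned above can be illustrated with a Lee-Seung-style update for non-negative sparse coding with a fixed dictionary, which is exactly the NMF connection the abstract points to. The penalty `lam`, the iteration count and the toy data are assumptions for the sketch, not values from the paper:

```python
import numpy as np

def nn_sparse_code(D, y, lam=0.1, iters=200):
    """Multiplicative updates for min_h ||y - D h||^2 + lam*sum(h), h >= 0.
    Same flavour as Lee-Seung NMF updates, with the dictionary D held fixed;
    all quantities stay non-negative, so no projection step is needed."""
    h = np.full(D.shape[1], 0.1)
    for _ in range(iters):
        h *= (D.T @ y) / (D.T @ D @ h + lam / 2 + 1e-12)
    return h

rng = np.random.default_rng(0)
D = rng.random((15, 8))                    # non-negative dictionary
h_true = np.zeros(8)
h_true[[1, 5]] = [0.7, 0.3]                # a genuinely sparse, non-negative code
y = D @ h_true

h = nn_sparse_code(D, y)
print(np.round(h, 3))
```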

7.
In this paper, we propose a robust tracking algorithm to handle the drifting problem. The algorithm consists of two parts: a G&D part that combines a generative model and a discriminative model for tracking, and a view-based model of target appearance that corrects the G&D result when necessary. In the G&D part, we use Maximum Margin Projection (MMP) to construct a graph model that preserves both the local geometrical and the discriminant structure of the data manifold in low dimensions; such a discriminative subspace, combined with a traditional generative subspace, benefits from both models. In addition, we learn the maximum margin projection via Spectral Regression (SR), which yields significant savings in computational time. To further address drift, an online-learned, sparsely represented view-based model of the target complements the G&D part: when the G&D result is unreliable, the view-based model rectifies it to avoid drifting. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.

8.
We propose a novel sequence alignment algorithm for recognizing handwriting gestures captured by a camera. In the proposed method, an input image sequence is aligned to the reference sequences by phase-synchronization of analytic signals transformed from the original feature values. A cumulative distance is calculated simultaneously with the alignment process and then used for classification. A major benefit of this method is that over-fitting to sequences of incorrect categories is suppressed. The proposed method exhibited higher recognition accuracy in handwriting gesture recognition than the conventional dynamic time warping method, which computes optimal alignments for all categories.

9.
Illumination-robust face recognition based on weighted block sparse representation   Cited: 1 (self-citations: 0, others: 1)
Illumination variation severely affects face recognition performance. To address this, a face recognition method that is robust to illumination change is proposed, based on weighted block sparse representation. The method first applies the discrete cosine transform (DCT) to the face image and removes the illumination component by discarding the low-frequency DCT coefficients; the inverse DCT then yields an illumination-normalized face image. The face image is partitioned into blocks, sparse representation based classification is performed independently on each block, and the per-block results are combined by weighted voting to determine the class of the test face image. Experiments on the Yale B, Extended Yale B, CMU-PIE and FERET face databases show that the method is well suited to illumination-robust face recognition.
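The DCT-based illumination normalization can be sketched as follows. This is a hedged illustration, not the paper's exact procedure: an orthonormal DCT-II is built as a matrix, a few low-frequency coefficients are zeroed (the `cutoff` is an arbitrary choice, and the DC term is kept here so the mean intensity survives), and the inverse transform recovers an illumination-flattened image.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: C @ x computes the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    C[0, :] = np.sqrt(1.0 / n)
    return C

def normalize_illumination(img, cutoff=3):
    """Zero the low-frequency DCT coefficients (except DC) and invert."""
    n, m = img.shape
    Cn, Cm = dct_matrix(n), dct_matrix(m)
    coef = Cn @ img @ Cm.T                 # 2-D DCT
    low = np.add.outer(np.arange(n), np.arange(m)) < cutoff
    low[0, 0] = False                      # keep the DC term
    coef[low] = 0.0
    return Cn.T @ coef @ Cm                # inverse 2-D DCT

# A toy "face" with a smooth vertical illumination gradient added.
rng = np.random.default_rng(0)
face = rng.random((16, 16))
lit = face + np.linspace(0.0, 2.0, 16)[:, None] * np.ones((1, 16))

out = normalize_illumination(lit)
print(float(np.abs(out - face).mean()))
```

Because a smooth illumination gradient lives almost entirely in the lowest DCT frequencies, zeroing those few coefficients flattens it while leaving most facial detail intact.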

10.
Dong Limeng, Li Qiang, Guan Xin. Computer Engineering and Applications, 2012, 48(29): 133-136, 219
Chord recognition, as a foundation of music information annotation, plays a very important role in analyzing musical structure and melody. Combining knowledge of music theory, a chord recognition method based on a sparse representation classifier is proposed. Unlike traditional frame-based recognition methods, beats are taken as the minimal time interval of chord change: the constant-Q transform (CQT) is used for time-frequency analysis of the music signal, pitch class profile (PCP) features are extracted, and sparse representation-based classification (SRC) is applied for chord recognition. Experimental results show that both the proposed features and the recognition method achieve higher recognition rates than traditional approaches.

11.
In this paper, a strategy is proposed for a challenging research topic: occluded face recognition. Our approach relies on sparse representation of a downsampled input image to first locate unoccluded face parts, and then exploits the linear discriminant ability of those pixels to identify the input subject. The advantages and novelties of our method are: (1) since the sparse representation based occlusion detection is conducted on a downsampled image, our algorithm is much faster than classic SRC; (2) discriminant information learned from training samples is combined with sparse representation to recognize occluded faces for the first time. Verification experiments are conducted on both simulated block-occlusion images and genuinely occluded images.

12.
In the image fusion literature, multi-scale transform (MST) and sparse representation (SR) are the two most widely used signal/image representation theories. This paper presents a general image fusion framework that combines MST and SR to simultaneously overcome the inherent defects of both MST- and SR-based fusion methods. In our framework, the MST is first performed on each pre-registered source image to obtain its low-pass and high-pass coefficients. The low-pass bands are then merged with an SR-based fusion approach, while the high-pass bands are fused using the absolute values of the coefficients as the activity-level measure. The fused image is finally obtained by applying the inverse MST to the merged coefficients. The advantages of the proposed framework over individual MST- or SR-based methods are first discussed in detail from a theoretical point of view and then verified experimentally on multi-focus, visible-infrared and medical image fusion. In particular, six popular multi-scale transforms, namely the Laplacian pyramid (LP), ratio of low-pass pyramid (RP), discrete wavelet transform (DWT), dual-tree complex wavelet transform (DTCWT), curvelet transform (CVT) and nonsubsampled contourlet transform (NSCT), with decomposition levels ranging from one to four, are tested in our experiments. By comparing the fused results subjectively and objectively, we identify the best-performing fusion method under the proposed framework for each category of image fusion. The effect of the sliding window's step length is also investigated. Furthermore, experimental results demonstrate that the proposed framework achieves state-of-the-art performance, especially for the fusion of multimodal images.
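The high-pass fusion rule (keep whichever coefficient has the larger absolute value) can be sketched with a one-level Haar transform standing in for the MST. For brevity the SR-based low-pass rule is replaced here by simple averaging, so this illustrates the structure of the framework, not the SR step itself:

```python
import numpy as np

def haar2d(x):
    """One-level 2-D Haar transform: low-pass band plus 3 high-pass bands."""
    a = (x[0::2] + x[1::2]) / 2.0          # row-pair averages
    d = (x[0::2] - x[1::2]) / 2.0          # row-pair differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, (lh, hl, hh)

def ihaar2d(ll, bands):
    """Exact inverse of haar2d."""
    lh, hl, hh = bands
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    d = np.empty_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2], x[1::2] = a + d, a - d
    return x

rng = np.random.default_rng(0)
A, B = rng.random((8, 8)), rng.random((8, 8))   # two pre-registered sources

llA, hiA = haar2d(A)
llB, hiB = haar2d(B)
ll = (llA + llB) / 2.0                     # stand-in for the SR low-pass rule
hi = tuple(np.where(np.abs(a) >= np.abs(b), a, b)   # max-abs activity rule
           for a, b in zip(hiA, hiB))
fused = ihaar2d(ll, hi)
print(fused.shape)
```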

13.
Existing face hallucination methods assume that face images are well aligned. In practice, however, given a low-resolution face image it is very difficult to perform precise alignment, and the quality of the super-resolved image degrades dramatically as a result. In this paper, we propose a near-frontal-view face hallucination method that is robust to face image misalignment. Based on the discriminative nature of sparse representation, we propose a global face sparse representation model that can reconstruct images with misalignment variations. We further propose an iterative method combining the global sparse representation and local linear regression using the Expectation-Maximization (EM) algorithm, in which face hallucination is converted into a parameter estimation problem with incomplete data. Since the proposed algorithm does not depend on the face similarity that results from precise alignment, it is robust to misalignment. In addition, the iterative scheme not only combines the merits of global and local face hallucination but also provides a convenient way to integrate different strategies for handling the misalignment problem. Experimental results show that the proposed method outperforms existing methods, especially for misaligned face images.

14.
The human visual system (HVS) is quite adept at swiftly detecting objects of interest in complex visual scenes. Simulating the human visual system to detect visually salient regions of an image has been one of the active topics in computer vision. Inspired by the random-sampling-based bagging ensemble learning method, an ensemble dictionary learning (EDL) framework for saliency detection is proposed in this paper. Instead of learning a universal dictionary, which requires a large number of training samples collected from natural images, multiple over-complete dictionaries are independently learned from small portions of randomly selected samples drawn from the input image itself, resulting in more flexible sparse representations for each image patch. To sharpen the distinction between salient patches and the background region, we present a reconstruction-residual-based method for dictionary atom reduction. Finally, the multiple probabilistic saliency responses obtained for each patch are combined from a probabilistic perspective to achieve better predictive performance on salient regions. Experimental results on several open test datasets and on natural images demonstrate that the proposed EDL is highly competitive with existing state-of-the-art algorithms.

15.
Text detection is important for retrieving text from digital pictures, video databases and webpages. It can be very challenging, however, since text is often embedded in a complex background. In this paper, we propose a classification-based algorithm for text detection using a sparse representation with discriminative dictionaries. First, edges are detected with the wavelet transform and scanned into patches by a sliding window. Then, candidate text areas are obtained by applying a simple classification procedure using two learned discriminative dictionaries. Finally, the adaptive run-length smoothing algorithm and projection profile analysis are used to further refine the candidate text areas. The proposed method is evaluated on the Microsoft common test set, the ICDAR 2003 text locating set, and an image set collected from the web. Extensive experiments show that the method can effectively detect text of various sizes, fonts and colors in images and videos.

16.
Recent research increasingly emphasizes analyzing multiple features to improve face recognition (FR) performance. One popular scheme extends the sparse representation based classification framework with various sparsity constraints. Although these methods jointly study multiple features through such constraints, they process each feature individually and thus overlook possible high-level relationships among different features. It is reasonable to assume that low-level features of facial images, such as edge information and smoothed/low-frequency images, can be fused into a more compact and more discriminative representation based on this latent high-level relationship. FR on the fused features is expected to outperform FR on the original features, since the fused features have more favorable properties. With this in mind, we propose two strategies that first fuse multiple features and then exploit the dictionary learning (DL) framework for better FR performance. The first strategy is a simple and efficient two-step model that learns a fusion matrix from training face images to fuse multiple features and then learns class-specific dictionaries on the fused features. The second is a more effective but more computationally demanding model that learns the fusion matrix and the class-specific dictionaries simultaneously within an iterative optimization procedure; it additionally separates shared common components from the class-specific dictionaries to enhance their discriminative power.
The proposed strategies, which integrate the multi-feature fusion process with the dictionary learning framework for FR, achieve the following goals: (1) exploiting multiple features of face images for better FR performance; (2) learning a fusion matrix that merges the features into a more compact and more discriminative representation; and (3) learning class-specific dictionaries, with the common patterns taken into account, for better classification performance. We perform a series of experiments on publicly available databases to evaluate our methods, and the results demonstrate the effectiveness of the proposed models.

17.
In graph-embedding-based methods, we usually need to manually choose nearest neighbors and then compute the edge weights from those neighbors via the L2 norm (e.g. LLE). Manually choosing nearest neighbors in a high-dimensional space is difficult and unstable, so automatically constructing the graph is very important. In this paper, we first introduce an L2-graph, analogous to the L1-graph: the L2-graph calculates edge weights using all samples, avoiding the manual choice of nearest neighbors. Second, an L2-graph-based feature extraction method is presented, called collaborative representation based projections (CRP). Like SPP, CRP aims to preserve the collaborative-representation-based reconstruction relationship of the data. CRP uses an L2-norm graph to characterize local compactness information, and it maximizes the ratio between the total separability information and the local compactness information to seek the optimal projection matrix. CRP is much faster than SPP, since CRP computes its objective function with the L2 norm whereas SPP uses the L1 norm. Experimental results on the FERET, AR and Yale face databases and the PolyU finger-knuckle-print database demonstrate that CRP works well for feature extraction and leads to good recognition performance.
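The speed claim above follows from the fact that the L2 (collaborative) representation of each sample over all the others has a closed-form ridge-regression solution, whereas SPP's L1 problem needs an iterative solver. A minimal sketch of building such L2-graph weights; the regularizer `lam` and the toy data are assumptions:

```python
import numpy as np

def l2_graph_weights(X, lam=0.01):
    """Edge weights of an L2-graph: each sample (a column of X) is
    collaboratively represented over all the others via ridge regression,
    w_i = argmin_w ||x_i - D_i w||^2 + lam*||w||^2, solved in closed form."""
    n = X.shape[1]
    W = np.zeros((n, n))
    for i in range(n):
        D = np.delete(X, i, axis=1)                      # all other samples
        w = np.linalg.solve(D.T @ D + lam * np.eye(n - 1), D.T @ X[:, i])
        W[np.arange(n) != i, i] = w                      # no self-edge
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 12))          # 12 samples in 20-D
W = l2_graph_weights(X)
print(np.round(W[:3, 0], 3))
```

Each column requires one small linear solve, with no neighbor selection and no iterative L1 optimization.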

18.
A sparse representation algorithm based on multi-feature dictionaries is proposed. To address the weak discriminability of the single feature used in SRC, multiple different features are extracted from each sample and sparsely represented separately. The discriminability of each feature is then evaluated via the SRC algorithm, and sparse weights are learned adaptively for linear weighting, thereby improving classification performance. Experiments show that the proposed adaptively weighted multiple sparse representation classification algorithm achieves better classification results.

19.
20.
Source recording device recognition is an important emerging research field in digital media forensics. The literature has mainly focused on source recording device identification, whereas few studies have addressed source recording device verification. Sparse representation based classification methods have shown promise in many applications. This paper proposes a source cell-phone verification scheme based on sparse representation, with three variants that use an exemplar dictionary, an unsupervised learned dictionary and a supervised learned dictionary, respectively. In particular, the discriminative dictionary produced by the supervised learning algorithm, which considers representational and discriminative power simultaneously (unlike the unsupervised algorithm), is used to further improve the performance of sparse representation based verification systems. Gaussian supervectors (GSVs) based on MFCCs, which have been shown to capture the intrinsic characteristics of recording devices effectively, are used for constructing and learning the dictionaries. SCUTPHONE, a corpus of speech recordings from 15 cell phones, is presented. Evaluation experiments on three corpora of cell-phone speech recordings demonstrate the effectiveness of the proposed methods for cell-phone verification. In addition, the influence of the number of target examples in the exemplar dictionary and of the size of the unsupervised learned dictionary on verification performance is analyzed.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号