Found 20 similar documents (search time: 15 ms)
1.
In this paper, we propose Learned Local Gabor Patterns (LLGP) for face representation and recognition. The proposed method is based on Gabor features and the concept of textons, and defines the feature cliques that appear frequently in Gabor features as the basic patterns. Unlike Local Binary Patterns (LBP), whose patterns are predefined, the local patterns in our approach are learned from a patch set constructed by sampling patches from Gabor-filtered face images. Thus, the patterns in our approach are face-specific and well suited to face perception tasks. Based on these learned patterns, each facial image is converted into multiple pattern maps, and the block-based histograms of these patterns are concatenated to form the representation of the face image. In addition, we propose an effective weighting strategy to enhance performance, which makes use of the discriminative power of different facial parts as well as different patterns. The proposed approach is evaluated on two face databases: FERET and CAS-PEAL-R1. Extensive experimental results and comparisons with existing methods show the effectiveness of the LLGP representation method and the weighting strategy. In particular, heterogeneous testing results show that the LLGP codebook generalizes well to unseen data.
2.
Gabor wavelet representation for 3-D object recognition (total citations: 8, self-citations: 0, citations by others: 8)
Xing Wu Bir Bhanu 《IEEE transactions on image processing》1997,6(1):47-64
This paper presents a model-based object recognition approach that uses a Gabor wavelet representation. The key idea is to use magnitude, phase, and frequency measures of the Gabor wavelet representation in an innovative flexible matching approach that can provide robust recognition. The Gabor grid, a topology-preserving map, efficiently encodes both signal energy and structural information of an object in a sparse multiresolution representation. The Gabor grid subsamples the Gabor wavelet decomposition of an object model and is deformed to allow the indexed object model to match a similar representation obtained from image data. Flexible matching between the model and the image minimizes a cost function based on local similarity and geometric distortion of the Gabor grid. Grid erosion and repair are performed whenever a collapsed grid, due to object occlusion, is detected. Results on infrared imagery are presented, where objects undergo rotation, translation, scale, occlusion, and aspect variations under changing environmental conditions.
3.
In this paper, we develop a novel framework for robust recovery of three-dimensional (3-D) surfaces of faces from single images. The underlying principle is shape from recognition, i.e., the idea that pre-recognizing face parts can constrain the space of possible solutions to the image irradiance equation, thus allowing robust recovery of the 3-D structure of a specific part. Parts of faces such as the nose, lips, and eyes are recognized and localized using robust expansion matching filter templates under varying pose and illumination. Specialized backpropagation-based neural networks are then employed to recover the 3-D shape of particular face parts. Representation using principal components allows efficient encoding of classes of objects such as nose, lips, etc. The specialized networks are designed and trained to map the principal component coefficients of the part images to another set of principal component coefficients that represent the corresponding 3-D surface shapes. To achieve robustness to viewing conditions, the network is trained with a wide range of illumination and viewing directions. A method for merging recovered 3-D surface regions by minimizing the sum squared error in overlapping areas is also derived. Quantitative analysis of the reconstruction of the surface parts under varying illumination and pose shows relatively small errors, indicating that the method is robust and accurate. Several examples showing recovery of the complete face also illustrate the efficacy of the approach.
4.
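A nonlinear dimensionality reduction method for face recognition (total citations: 2, self-citations: 0, citations by others: 2)
Locally linear embedding (LLE), a recent nonlinear dimensionality reduction algorithm, has been applied successfully to the visualization of high-dimensional data. However, the features obtained by LLE are not optimal from a classification standpoint, and LLE has difficulty obtaining low-dimensional projections of new sample points. To address these two shortcomings, a new nonlinear dimensionality reduction method is proposed that combines the nonlinear LLE algorithm with linear discriminant analysis (LDA). Experiments on three face databases, ORL, Harvard, and CMU PIE, show that the method substantially improves the recognition rate and is reasonably robust to variations in illumination, pose, and expression.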
5.
In contrast to holistic methods, local matching methods extract facial features from different levels of locality and quantify them precisely. To determine how they can be best used for face recognition, we conducted a comprehensive comparative study at each step of the local matching process. The conclusions from our experiments include: (1) additional evidence that Gabor features are effective local feature representations and are robust to illumination changes; (2) discrimination based only on a small portion of the face area is surprisingly good; (3) the configuration of facial components does contain rich discriminating information, and comparing corresponding local regions utilizes shape features more effectively than comparing corresponding facial components; (4) spatial multiresolution analysis leads to better classification performance; (5) combining local regions with the Borda count classifier combination method alleviates the curse of dimensionality. We implemented a complete face recognition system by integrating the best option of each step. Without training, illumination compensation, or parameter tuning, it achieves superior performance on every category of the FERET test: near-perfect classification accuracy (99.5%) on pictures taken on the same day regardless of indoor illumination variations, and significantly better performance than any other reported on pictures taken several days to more than a year apart. The most significant experiments were repeated on the AR database, with similar results.
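As an illustration of conclusion (5), the following is a minimal sketch of a Borda count over per-region rankings. It is not the authors' implementation: the function name `borda_count`, the scoring convention (best match gets n-1 points, worst gets 0), the alphabetical tie-break, and the toy identities are all assumptions made for this example.

```python
from collections import defaultdict


def borda_count(rankings):
    """Combine per-region rankings of gallery identities with a Borda count.

    rankings: list of lists; each inner list orders candidate identities
    from best match to worst match for one local face region.
    Returns the candidates sorted by total Borda score, highest first.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            # Assumed convention: best match earns n-1 points, worst earns 0.
            scores[candidate] += n - 1 - position
    # Sort by score (descending), then by name for a deterministic tie-break.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical rankings produced by three local regions.
region_rankings = [
    ["alice", "bob", "carol"],
    ["bob", "alice", "carol"],
    ["alice", "carol", "bob"],
]
combined = borda_count(region_rankings)
print(combined)
```

Each region votes only with a rank order, so no region-level similarity scores need to be calibrated against one another before they are combined.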
6.
A new machine learning methodology, called successive subspace learning (SSL), is introduced in this work. SSL contains four key ingredients: (1) successive near-to-far neighborhood expansion; (2) unsupervised dimension reduction via subspace approximation; (3) supervised dimension reduction via label-assisted regression (LAG); and (4) feature concatenation and decision making. An image-based object classification method, called PixelHop, is proposed to illustrate the SSL design. Experimental results show that the PixelHop method outperforms the classic CNN model of similar model complexity on three benchmark datasets (MNIST, Fashion MNIST and CIFAR-10). Although SSL and deep learning (DL) share some high-level concepts, they are fundamentally different in model formulation, training process, and training complexity. An extensive comparison of SSL and DL is provided to give further insight into the potential of SSL.
7.
8.
9.
This paper presents a novel coarse-to-fine moving object segmentation framework for H.264/AVC compressed videos. The proposed framework integrates the global motion estimation and global motion compensation steps into the segmentation pipeline, unlike previous techniques, which did not consider such an integration. The integration is based on testing for the presence of global motion by classifying the interframe motion vectors into a moving camera class and a still camera class. The decision boundary separating these two classes is learnt from the training video data. The integration automates moving object segmentation so that it applies to static, moving, and combined static/moving camera cases, which to the best of our knowledge has not been done before. Further, a novel coarse segmentation technique is proposed that decomposes the inter-frame motion vectors into wavelet sub-bands and applies logical operations to the LH, HL and HH sub-band wavelet coefficients. The premise is that since the LH, HL and HH sub-bands contain the detail information pertaining to horizontal, vertical and diagonal moving blocks respectively, they can be exploited to identify the coarse moving boundaries. The coarse segmentation is fast in comparison to state-of-the-art coarse segmentation methods, as demonstrated by our experiments. Finally, these coarse boundaries are modeled in an energy minimization framework, and we show that minimizing the energy using graph cut optimization refines the segmentation to obtain the fine segmentation. The proposed framework is tested on a number of standard video sequences encoded with the H.264/AVC JM encoder, and it is compared with state-of-the-art compressed domain moving object segmentation methods as well as with an existing state-of-the-art pixel domain method to establish and validate the proposed moving object segmentation framework.
10.
In this paper, a novel Gabor-based kernel principal component analysis (PCA) with doubly nonlinear mapping is proposed for human face recognition. In our approach, the Gabor wavelets are used to extract facial features, then a doubly nonlinear mapping kernel PCA (DKPCA) is proposed to perform feature transformation and face recognition. The conventional kernel PCA nonlinearly maps an input image into a high-dimensional feature space in order to make the mapped features linearly separable. However, this method does not consider the structural characteristics of the face images, and it is difficult to determine which nonlinear mapping is more effective for face recognition. In this paper, a new method of nonlinear mapping, which is performed in the original feature space, is defined. The proposed nonlinear mapping not only considers the statistical property of the input features, but also adopts an eigenmask to emphasize important facial feature points. Therefore, after this mapping, the transformed features have a higher discriminating power, and the relative importance of the features adapts to the spatial importance of the face images. This new nonlinear mapping, combined with the conventional kernel PCA, is called "doubly" nonlinear mapping kernel PCA. The proposed algorithm is evaluated on the Yale database, the AR database, the ORL database and the YaleB database using different face recognition methods such as PCA, Gabor wavelets plus PCA, and Gabor wavelets plus kernel PCA with fractional power polynomial models. Experiments show that consistent and promising results are obtained.
11.
Transforming an original image into a high-dimensional (HD) feature has been proven to be effective in classifying images. This paper presents a novel feature extraction method utilizing the HD feature space to improve the discriminative ability for face recognition. We observed that the local binary pattern can be decomposed into bit-planes, each of which has scale-specific directional information of the face image. Each bit-plane not only has the inherent local-structure of the face image but also has an illumination-robust characteristic. By concatenating all the decomposed bit-planes, we generate an HD feature vector with an improved discriminative ability. To reduce the computational complexity while preserving the incorporated local structural information, a supervised dimension reduction method, the orthogonal linear discriminant analysis, is applied to the HD feature vector. Extensive experimental results show that existing classifiers with the proposed feature outperform those with other conventional features under various illumination, pose, and expression variations.
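The bit-plane decomposition described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: it uses a single 8-neighbour LBP at radius 1 (the abstract mentions scale-specific information, which would need several radii), the neighbour ordering is an arbitrary choice, and the function names `lbp_code`, `lbp_bit_planes`, and `concatenate_planes` are hypothetical. The dimension reduction step is omitted.

```python
def lbp_code(image, r, c):
    """8-bit LBP code of pixel (r, c) from its 8 neighbours at radius 1."""
    # Neighbour offsets, clockwise from the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Each neighbour contributes one bit, so each bit encodes one direction.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_bit_planes(image):
    """Split the LBP code map of the interior pixels into 8 binary planes."""
    rows, cols = len(image), len(image[0])
    planes = [[[0] * (cols - 2) for _ in range(rows - 2)] for _ in range(8)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = lbp_code(image, r, c)
            for bit in range(8):
                planes[bit][r - 1][c - 1] = (code >> bit) & 1
    return planes


def concatenate_planes(planes):
    """Flatten all bit-planes into one high-dimensional feature vector."""
    return [v for plane in planes for row in plane for v in row]


# Toy 3x4 "image" with two interior pixels.
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
]
planes = lbp_bit_planes(image)
feature = concatenate_planes(planes)
print(len(feature), planes[3])
```

Because every bit depends only on the sign of a neighbour-minus-center difference, adding a constant to all pixels leaves the planes unchanged, which is one way to read the illumination-robust characteristic mentioned in the abstract.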
12.
《Journal of Visual Communication and Image Representation》2014,25(1):24-38
Much of the existing work on action recognition combines simple features with complex classifiers or models to represent an action. Parameters of such models usually do not have any physical meaning nor do they provide any qualitative insight relating the action to the actual motion of the body or its parts. In this paper, we propose a new representation of human actions called sequence of the most informative joints (SMIJ), which is extremely easy to interpret. At each time instant, we automatically select a few skeletal joints that are deemed to be the most informative for performing the current action based on highly interpretable measures such as the mean or variance of joint angle trajectories. We then represent the action as a sequence of these most informative joints. Experiments on multiple databases show that the SMIJ representation is discriminative for human action recognition and performs better than several state-of-the-art algorithms.
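A minimal sketch of the joint-selection idea, assuming variance of the joint angle as the informativeness measure and fixed-length, non-overlapping time segments. The function name `smij`, the segmenting scheme, the alphabetical tie-break, and the toy joint data are assumptions for illustration; the abstract does not specify these details.

```python
from statistics import pvariance


def smij(joint_angles, segment_length, top_k):
    """Sequence of the most informative joints, one entry per time segment.

    joint_angles: dict mapping joint name -> list of angles, one per frame.
    Returns a list where each element holds the top_k joint names for a
    segment, ranked by the variance of their angle within that segment.
    """
    n_frames = len(next(iter(joint_angles.values())))
    sequence = []
    for start in range(0, n_frames, segment_length):
        end = start + segment_length
        variances = {
            joint: pvariance(angles[start:end])
            for joint, angles in joint_angles.items()
        }
        # Highest variance first; joint name breaks ties deterministically.
        ranked = sorted(variances, key=lambda j: (-variances[j], j))
        sequence.append(ranked[:top_k])
    return sequence


# Hypothetical joint-angle trajectories over four frames.
joint_angles = {
    "elbow": [0, 10, 10, 10],
    "knee": [5, 5, 0, 20],
    "hip": [1, 2, 1, 2],
}
top1 = smij(joint_angles, segment_length=2, top_k=1)
top2 = smij(joint_angles, segment_length=2, top_k=2)
print(top1, top2)
```

The output is a short symbolic sequence of joint names, which is what makes the representation easy to inspect by eye.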
13.
J. Ruiz-del-Solar P. Navarrete 《IEEE transactions on systems, man and cybernetics. Part C, Applications and reviews》2005,35(3):315-325
Eigenspace-based face recognition corresponds to one of the most successful methodologies for the computational recognition of faces in digital images. Starting with the Eigenface-Algorithm, different eigenspace-based approaches for the recognition of faces have been proposed. They differ mostly in the kind of projection method used (standard, differential, or kernel eigenspace), in the projection algorithm employed, in the use of simple or differential images before/after projection, and in the similarity matching criterion or classification method employed. The aim of this paper is to present an independent comparative study among some of the main eigenspace-based approaches. We believe that carrying out independent studies is relevant, since comparisons are normally performed using the implementations of the research groups that have proposed each method, which does not consider completely equal working conditions for the algorithms. Very often, a contest between the abilities of the research groups rather than a comparison between methods is performed. This study considers theoretical aspects as well as simulations performed using the Yale Face Database, a database with few classes and several images per class, and FERET, a database with many classes and few images per class.
14.
Boštjan Marušič Štefan Dobravec Philippe de Cuetos Cyril Concolato Laurent Piron Jurij F. Tasič
《Signal Processing: Image Communication》2005,20(9-10):947-971
Media delivery over heterogeneous networks requires both flexible representation and robust protection of content. This paper provides details on the framework for audiovisual content creation, delivery, consumption and protection as conceived within the IST project The Innovative Rights and Access Management Interplatform SolUtion. The proposed framework is based on the emerging MPEG-21 standard for multimedia content delivery and consumption and at the same time it complements it in several aspects, most notably by fully specifying a digital rights management (DRM) scheme. Central to the described framework is a novel key management system, relying on smartcards, which addresses many issues that previously blocked wider adoption of DRM: obtrusiveness of the DRM technology perceived by the end-user, flexibility in licence formulation and adequate level of trust as requested by content owners.
15.
In this paper, we propose a representation of the face image based on the phase of the 2-D Fourier transform of the image to overcome the adverse effect of illumination. The phase of the Fourier transform preserves the locations of the edges of a given face image. The main problem in the use of the phase spectrum is the need for unwrapping of the phase. The problem of unwrapping is avoided by considering two functions of the phase spectrum rather than the phase directly. Each of these functions gives partial evidence of the given face image. The effect of noise is reduced by using the first few eigenvectors of the eigenanalysis on the two phase functions separately. Experimental results on combining the evidence from the two phase functions show that the proposed method provides an alternative representation of the face images for dealing with the issue of illumination in face recognition.
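The abstract does not name the two phase functions. The sketch below assumes they are the cosine and sine of the phase, which can be read directly from the real and imaginary parts of the normalized spectrum, so the wrapped phase angle is never computed. The naive `dft2` helper, the `phase_functions` name, the `eps` threshold, and the toy 2x2 image are all assumptions for illustration, and the eigenanalysis step is omitted.

```python
import cmath
import math


def dft2(image):
    """Naive 2-D discrete Fourier transform of a small grayscale image."""
    rows, cols = len(image), len(image[0])
    spectrum = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = -2 * math.pi * (u * x / rows + v * y / cols)
                    total += image[x][y] * cmath.exp(1j * angle)
            spectrum[u][v] = total
    return spectrum


def phase_functions(image, eps=1e-12):
    """Return (cos(phase), sin(phase)) maps without computing the angle."""
    cos_map, sin_map = [], []
    for row in dft2(image):
        cos_row, sin_row = [], []
        for value in row:
            magnitude = abs(value)
            if magnitude < eps:
                # Phase is undefined for a zero coefficient; use 0 by convention.
                cos_row.append(0.0)
                sin_row.append(0.0)
            else:
                cos_row.append(value.real / magnitude)
                sin_row.append(value.imag / magnitude)
        cos_map.append(cos_row)
        sin_map.append(sin_row)
    return cos_map, sin_map


face = [[1, 2], [3, 5]]
brighter = [[3 * p for p in row] for row in face]  # same pattern, 3x brighter
cos_a, sin_a = phase_functions(face)
cos_b, sin_b = phase_functions(brighter)
```

Under this assumption, a uniform brightness scaling changes only the magnitude spectrum, so both maps are the same for `face` and `brighter` up to floating-point error. Real illumination changes are not uniform, so this shows only the simplest case.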
16.
Yang L Georgescu B Zheng Y Wang Y Meer P Comaniciu D 《IEEE transactions on medical imaging》2011,30(11):1921-1932
Robust and fast 3D tracking of deformable objects, such as the heart, is a challenging task because of the relatively low image contrast and the speed requirement. Many existing 2D algorithms cannot be applied directly to the 3D tracking problem. The 3D tracking performance is limited by dramatically increased data size, landmark ambiguity, signal drop-out, and complex nonrigid deformation. In this paper, we present a robust, fast, and accurate 3D tracking algorithm: prediction based collaborative trackers (PCT). A novel one-step forward prediction is introduced to generate the motion prior using motion manifold learning. Collaborative trackers are introduced to achieve both temporal consistency and failure recovery. Compared with tracking by detection and 3D optical flow, PCT provides the best results. The new tracking algorithm is completely automatic and computationally efficient. It requires less than 1.5 s to process a 3D volume which contains millions of voxels. In order to demonstrate the generality of PCT, the tracker is fully tested on three large clinical datasets for three 3D heart tracking problems with two different imaging modalities: endocardium tracking of the left ventricle (67 sequences, 1134 3D volumetric echocardiography data), dense tracking in the myocardial regions between the epicardium and endocardium of the left ventricle (503 sequences, roughly 9000 3D volumetric echocardiography data), and whole heart four chambers tracking (20 sequences, 200 cardiac 3D volumetric CT data). Our datasets are much larger than those of most studies reported in the literature, and we achieve very accurate tracking results compared with human experts' annotations and recent literature.
17.
Conventional algorithms which have been proposed for the skeletonisation of digital patterns or pictures suffer from basic problems (for example, endpoint identification, connectivity) deriving from the rigid application of fixed rules. This letter introduces a heuristic approach to the problem which overcomes such difficulties and may be tailored to suit different types of data.
18.
19.
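A novel subspace face recognition method, null-space discriminant projection embedding (NDPE), is proposed. Based on locality preserving projections (LPP) and nonparametric discriminant analysis, NDPE can simultaneously encode the geometric and discriminant structure of the face data manifold, and it overcomes the small sample size problem by solving the eigenvalue problem in the null space. To further improve face recognition accuracy, a method that fuses the dual-tree complex wavelet transform (DTCWT) with NDPE is proposed. Experimental results show that the proposed face recognition method achieves high recognition rates on the ORL, Yale, and AR face databases.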