Illumination invariant face recognition using near-infrared images   Total citations: 4, self-citations: 0, citations by others: 4
Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in thus-constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination invariant face recognition for indoor, cooperative-user applications. First, we present an active near infrared (NIR) imaging system that is able to produce face images of good condition regardless of visible light in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination invariant face representation. Then, we present methods for face recognition using NIR images; statistical learning algorithms are used to extract the most discriminative features from a large pool of invariant LBP features and construct a highly accurate face matching engine. Finally, we present a system that is able to achieve accurate and fast face recognition in practice, in which a method is provided to deal with specular reflections of active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive, comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems, with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic groups.
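The claim that LBP features compensate for a monotonic gray-tone transform can be checked on a toy example. The sketch below is not the authors' implementation: it assumes a basic 8-neighbour LBP operator, and the function names, the sample image, and the particular transform are all hypothetical. Because each LBP bit only records whether a neighbour is at least as bright as the centre pixel, any strictly increasing transform leaves the codes unchanged.

```python
def lbp_code(img, r, c):
    """8-neighbour LBP code for pixel (r, c): a neighbour >= centre sets its bit."""
    center = img[r][c]
    # Neighbour offsets, clockwise from the top-left pixel (an arbitrary but fixed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_image(img):
    """LBP codes for all interior pixels (border pixels lack a full neighbourhood)."""
    rows, cols = len(img), len(img[0])
    return [[lbp_code(img, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


def monotonic_transform(img):
    """Hypothetical strictly increasing gray-tone transform (for non-negative values)."""
    return [[3 * v * v + 7 for v in row] for row in img]


# Hypothetical 4x4 gray-level patch used only for illustration.
image = [
    [52, 60, 61, 70],
    [48, 55, 66, 90],
    [40, 55, 58, 77],
    [35, 42, 50, 62],
]

codes_before = lbp_image(image)
codes_after = lbp_image(monotonic_transform(image))
print(codes_before == codes_after)
```

The comparison holds for any strictly increasing transform, since such a transform preserves both the ordering of and the ties between pixel values; a transform that is only non-decreasing could merge distinct values and flip some bits.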

Multimedia Tools and Applications - Smart imaging devices have been used at a rapid rate in the agriculture sector for the last few years. Fruit recognition and classification is noticed as one of...

Multimedia Tools and Applications - The changes in appearance of faces, usually caused by pose, expression and illumination variations, increase data uncertainty in the task of face recognition....

This paper presents a real-time and robust approach to recognize two types of gestures consisting of seven motional gestures and six finger spelling gestures. This approach utilizes stereo images captured by a stereo webcam to achieve robust recognition under realistic lighting conditions and in various backgrounds. It incorporates several existing computationally efficient techniques and introduces a rule-based approach to merge the information from a pair of stereo images, leading to improved hand detection compared to using single images. The results obtained indicate that high recognition rates under realistic conditions are obtained in real-time on PC platforms at the rate of 30 frames per second. It is shown that its outcome is comparable to two existing approaches while it is computationally more efficient than these approaches.

Although unconstrained face recognition has been widely studied over recent years, state-of-the-art algorithms still result in unsatisfactory performance for low-quality images. In this paper, we make two contributions to this field: the first one is the release of a new dataset called ‘AR-LQ’ that can be used in conjunction with the well-known ‘AR’ dataset to evaluate face recognition algorithms on blurred and low-resolution face images. The proposed dataset contains five new blurred faces (at five different levels, from low to severe blurriness) and five new low-resolution images (at five different levels, from 66 × 48 to 7 × 5 pixels) for each of the hundred subjects of the ‘AR’ dataset. The new blurred images were acquired by using a DSLR camera with manual focus that takes an out-of-focus photograph of a monitor that displays a sharp face image. In the same way, the low-resolution images were acquired from the monitor by a DSLR at different distances. Thus, an attempt is made to acquire low-quality images that have been degraded by a real degradation process. Our second contribution is an extension of a known face recognition technique based on sparse representations (ASR) that takes into account low-resolution face images. The proposed method, called blur-ASR or bASR, was designed to recognize faces using dictionaries with different levels of blurriness. These were obtained by digitally blurring the training images, and a sharpness metric was used to match blurriness between the query image and the dictionaries. These two main adjustments made the algorithm more robust with respect to low-quality images. In our experiments, bASR consistently outperforms other state-of-the-art methods including hand-crafted features, sparse representations, and seven well-known deep learning face recognition techniques with and without super resolution techniques. On average, bASR obtained 88.8% accuracy, whereas the rest obtained less than 78.4%.

Cai Linqin, Xu Hongbo, Yang Yang, Yu Jimin. Multimedia Tools and Applications, 2019, 78(20): 28591-28607
Multimedia Tools and Applications - Traditional and classical methods of facial expression recognition are mainly based on intensity image and are prone to be disturbed by illumination, poses, and...

Reconstruction and recognition of face and digit images using autoencoders   Total citations: 1, self-citations: 0, citations by others: 1
This paper presents techniques for image reconstruction and recognition using autoencoders. Experiments are conducted to compare the performance of three types of autoencoder neural networks based on their efficiency of reconstruction and recognition. Reconstruction error and recognition rate are determined in all three cases using the same architecture configuration and training algorithm. The results obtained with autoencoders are also compared with those obtained using principal component analysis. Instead of whole images, image patches are used for training, and this leads to much simpler autoencoder architectures and reduced training time.

Multimedia Tools and Applications - Sketches have been employed since the ancient era of cave paintings for simple illustrations to represent real-world entities and communication. The abstract...

Facial expression and emotion recognition from thermal infrared images has attracted increasing attention in recent years. However, the features adopted in current work are either temperature statistical parameters extracted from the facial regions of interest or several hand-crafted features that are commonly used in the visible spectrum. Until now there have been no image features specially designed for thermal infrared images. In this paper, we propose using the deep Boltzmann machine to learn thermal features for emotion recognition from thermal infrared facial images. First, the face is located and normalized from the thermal infrared images. Then, a deep Boltzmann machine model composed of two layers is trained. The parameters of the deep Boltzmann machine model are further fine-tuned for emotion recognition after pre-training of feature learning. Comparative experimental results on the NVIE database demonstrate that our approach outperforms other approaches using temperature statistic features or hand-crafted features borrowed from the visible domain. The learned features from the forehead, eye, and mouth are more effective for discriminating the valence dimension of emotion than other facial areas. In addition, our study shows that adding unlabeled data from other databases during training can also improve feature learning performance.

In this paper, we present a novel approach for recognizing hand number gestures using the recognized hand parts in a depth image. Our proposed approach is divided into two stages: (i) hand parts recognition by random forests (RFs) and (ii) rule-based hand number gesture recognition. In the first stage, we create a database (DB) of synthetic hand depth silhouettes and their corresponding hand parts-labeled maps and then train RFs with the DB. Via the trained RFs, we recognize or label the hand parts in a depth silhouette. In the second stage, based on the information of the recognized or labeled hand parts, hand number gestures are recognized according to our derived rules. In our experiments, we quantitatively and qualitatively evaluated our hand parts recognition system with synthetic and real data. Then, we tested our hand number gesture recognition system with real data. Our results show an average recognition rate of 97.80% over the ten hand number gestures from five different subjects.

We propose a method for localization and classification of brand logos in natural images. The system has to overcome multiple challenges such as perspective deformations, warping, variations of shape and color, occlusions, and background variations. To deal with perspective variation, we rely on homography matching between the SIFT keypoints of logo instances of the same class. To address the changes in color, we construct a weighted graph of logo interconnections that is further analyzed to extract potentially multiple instances of the class. The main instance is built by grouping the keypoints of the graph-connected logos onto the central image. The secondary instance is needed for color-inverted logos and is obtained by inverting the orientation of the main instance. The constructed logo recognition system is tested on two databases (FlickrLogos-32 and BelgaLogos), outperforming the state of the art by more than 10% in accuracy.

A novel approach is proposed for the recognition of moving hand gestures based on the representation of hand motions as contour-based similarity images (CBSIs). The CBSI was constructed by calculating the similarity between hand contours in different frames. The input CBSI was then matched with CBSIs in the database to recognize the hand gesture. The proposed continuous hand gesture recognition algorithm can simultaneously divide the continuous gestures into disjointed gestures and recognize them. No restrictive assumptions were considered for the motion of the hand between the disjointed gestures. The proposed algorithm was tested using hand gestures from American Sign Language and the results showed a recognition rate of 91.3% for disjointed gestures and 90.4% for continuous gestures. The experimental results illustrate the efficiency of the algorithm for noisy videos.

As the accuracy of biometrics improves, it is getting increasingly hard to push the limits using a single modality. In this paper, a unified approach that fuses three-dimensional facial and ear data is presented. An annotated deformable model is fitted to the data and a geometry image is extracted. Wavelet coefficients are computed from the geometry image and used as a biometric signature. The method is evaluated using the largest publicly available database and achieves 99.7% rank-one recognition rate. The state-of-the-art accuracy of the multimodal fusion is attributed to the low correlation between the individual differentiability of the two modalities.

Face recognition in hyperspectral images   Total citations: 3, self-citations: 0, citations by others: 3
Hyperspectral cameras provide useful discriminants for human face recognition that cannot be obtained by other imaging methods. We examine the utility of using near-infrared hyperspectral images for the recognition of faces over a database of 200 subjects. The hyperspectral images were collected using a CCD camera equipped with a liquid crystal tunable filter to provide 31 bands over the near-infrared (0.7 μm to 1.0 μm). Spectral measurements over the near-infrared allow the sensing of subsurface tissue structure which is significantly different from person to person, but relatively stable over time. The local spectral properties of human tissue are nearly invariant to face orientation and expression which allows hyperspectral discriminants to be used for recognition over a large range of poses and expressions. We describe a face recognition algorithm that exploits spectral measurements for multiple facial tissue types. We demonstrate experimentally that this algorithm can be used to recognize faces over time in the presence of changes in facial pose and expression.

Decision support in equipment condition monitoring systems with image processing is analyzed. Long-run accumulation of information about earlier decisions is used to make the proposed approach adaptive. It is shown that, unlike conventional classification problems, the recognition of abnormalities uses training samples supplemented with reward estimates of earlier decisions and can be tackled using reinforcement learning algorithms. We consider the basic stages of contextual multi-armed bandit algorithms, during which the probabilistic distributions of each state are estimated to evaluate the current knowledge of the states, and the decision space is explored to increase the decision-making efficiency. We propose a new decision-making method, which uses the probabilistic neural network to classify abnormal situations and the softmax rule to explore the decision space. A modelling experiment in image processing was carried out to show that our approach allows a higher accuracy of abnormality detection than other known methods, especially for small-size initial training samples.
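The softmax exploration rule mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how reward estimates are produced or scaled, so the `values` list (hypothetical estimated rewards per decision), the `temperature` parameter, and both function names are assumptions. The rule picks each decision with probability proportional to the exponential of its estimated reward, so better decisions are favoured while weaker ones are still tried occasionally.

```python
import math
import random


def softmax_probabilities(values, temperature=1.0):
    """Turn estimated rewards into selection probabilities (Boltzmann/softmax rule)."""
    # Subtracting the maximum keeps exp() from overflowing without changing the result.
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def choose_decision(values, temperature=1.0, rng=random):
    """Sample a decision index; a lower temperature means less exploration."""
    probs = softmax_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


# Hypothetical reward estimates for three candidate decisions.
estimated_rewards = [1.0, 2.0, 3.0]
print(softmax_probabilities(estimated_rewards))
print(choose_decision(estimated_rewards, rng=random.Random(0)))
```

With a high temperature the probabilities approach a uniform distribution (pure exploration); as the temperature approaches zero the rule approaches greedy selection of the best-estimated decision.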

We present a method for the semi-automatic recognition and mapping of recent rainfall induced shallow landslides. The method exploits VHR panchromatic and HR multispectral satellite images, and was tested in a 9.4 km² area in Sicily, Italy, where on 1 October 2009 a high intensity rainfall event caused shallow landslides, soil erosion, and inundation. Pre-event and post-event images of the study area taken by the QuickBird satellite, and information on the location and type of landslides obtained in the field and through the interpretation of post-event aerial photographs, were used to construct and validate a set of terrain classification models. The models classify each image element (pixel) based on the probability that the pixel contains (or does not contain) a new rainfall induced landslide. To construct and validate the models, a procedure in five steps was adopted. First, the pre-event and the post-event images were pan-sharpened, ortho-rectified, co-registered, and corrected for atmospheric disturbance. Next, variables describing changes between the pre-event and the post-event images attributed to landslide occurrence were selected. Next, three classification models were calibrated in a training area using different multivariate statistical techniques. The calibrated models were then applied in a validation area using the same set of independent variables, and the same statistical techniques. Lastly, combined terrain classification models were prepared for the training and the validation areas. The performances of the models were evaluated using four-fold plots and receiver operating characteristic curves. The method proved capable of detecting and mapping the new rainfall induced landslides in the study area. We expect the method to be capable of detecting analogous shallow landslides caused by similar (rainfall) or different (e.g. earthquake) triggers, provided that the event slope failures leave discernible features captured by the post-event satellite images, and that the terrain information and satellite images are of adequate quality. The proposed method can facilitate the rapid production of accurate landslide event-inventory maps, and we expect that it will improve our ability to map landslides consistently over large areas. Application of the method will advance our ability to evaluate landslide hazards, and will foster our understanding of the evolution of landscapes shaped by mass-wasting processes.

Neural Computing and Applications - This work is motivated by the tremendous achievement of deep learning models for computer vision tasks, particularly for human activity recognition. It is...

Underwater image processing is very challenging due to its environmental conditions and poor sunlight. Images captured from the ocean using autonomous vehicles are often non-uniformly illuminated and contain noise due to the underlying environment. Object recognition is a challenging task under water due to the variation in the environment, target shape and orientation. Traditional algorithms based on spatial information may not lead to accurate segmentation, as the intensity variation is often small in underwater images. Texture information representing the characteristics of the object is needed. Statistical features like autocorrelation, sum average, sum variance and sum entropy were extracted. These were fed as input to learning algorithms, and training was done to effectively classify the object of interest and the background. Chain coding was further applied for object recognition. The proposed methodology achieved a maximum classification accuracy of 96%.
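The chain coding step named in this abstract can be sketched as follows. The abstract does not say which variant was used, so this example assumes the common Freeman 8-direction chain code and an already ordered list of boundary pixels; the names `FREEMAN_CODES`, `chain_code`, and the sample `square` boundary are hypothetical. Each move between consecutive boundary pixels is replaced by a single direction digit, giving a compact shape description.

```python
# Freeman 8-direction codes keyed by (row step, column step); rows grow downward.
# Code 0 points east, and codes increase counter-clockwise in 45-degree steps.
FREEMAN_CODES = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def chain_code(boundary):
    """Encode an ordered list of (row, col) boundary pixels as direction codes."""
    codes = []
    for (r0, c0), (r1, c1) in zip(boundary, boundary[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in FREEMAN_CODES:
            # Consecutive boundary pixels must be 8-connected neighbours.
            raise ValueError(f"pixels {(r0, c0)} and {(r1, c1)} are not adjacent")
        codes.append(FREEMAN_CODES[step])
    return codes


# Hypothetical closed boundary: a 2x2 pixel square traced clockwise on screen.
square = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]
print(chain_code(square))
```

The resulting code depends on the starting pixel and on orientation; recognition pipelines often normalize it (for example by using differences between consecutive codes), which this sketch does not attempt.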
