Similar Articles
20 similar articles found (search time: 31 ms)
1.
Image segmentation is widely used in document image analysis (extraction of printed characters), map processing (finding lines, legends, and characters), topological feature extraction for geographical information, and quality inspection of materials, where defective parts must be delineated, among many other applications. In image analysis, efficient segmentation of images into meaningful objects is important for classification and object recognition. This paper presents two novel segmentation methods, based on Darwinian Particle Swarm Optimization (DPSO) and Fractional-Order Darwinian Particle Swarm Optimization (FODPSO), which determine the n-1 optimal thresholds for n-level thresholding of a given image. The efficiency of the proposed methods is compared with that of other well-known thresholding segmentation methods. Experimental results show that the proposed methods outperform the other methods on a number of different measures.
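As a minimal sketch of the objective such swarm optimizers maximize, the snippet below implements Otsu's between-class variance for a set of thresholds and a brute-force search for the two-threshold (three-class) case. The exhaustive search is only an illustration standing in for DPSO/FODPSO, not the paper's algorithm.

```python
import numpy as np

def between_class_variance(hist, thresholds):
    """Otsu's between-class variance for a normalized grey-level
    histogram `hist` split at the given thresholds; swarm-based
    n-level thresholding searches for the thresholds maximizing it."""
    levels = np.arange(len(hist), dtype=float)
    bounds = [0] + list(thresholds) + [len(hist)]
    mu_total = np.sum(levels * hist)
    sigma_b = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = hist[lo:hi].sum()          # class probability
        if w > 0:
            mu = np.sum(levels[lo:hi] * hist[lo:hi]) / w
            sigma_b += w * (mu - mu_total) ** 2
    return sigma_b

def exhaustive_two_level(hist):
    """Brute-force stand-in for the optimizer: 2 thresholds, 3 classes."""
    best, best_t = -1.0, None
    L = len(hist)
    for t1 in range(1, L - 1):
        for t2 in range(t1 + 1, L):
            v = between_class_variance(hist, (t1, t2))
            if v > best:
                best, best_t = v, (t1, t2)
    return best_t
```

For realistic 8-bit images (256 levels) and many thresholds, this exhaustive loop is exactly the combinatorial explosion that motivates the swarm-based search.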

2.
This paper presents a novel face recognition method that fuses color, local spatial and global frequency information. Specifically, the proposed method fuses multiple features derived from a hybrid color space, the Gabor image representation, local binary patterns (LBP), and the discrete cosine transform (DCT) of the input image. The novelty of this paper is threefold. First, a hybrid color space, the RCrQ color space, is constructed by combining the R component image of the RGB color space with the chromatic component images Cr and Q of the YCbCr and YIQ color spaces, respectively. The RCrQ hybrid color space, whose component images possess complementary characteristics, enhances the discriminating power for face recognition. Second, three effective image encoding methods are proposed for the component images of the RCrQ hybrid color space: (i) a patch-based Gabor image representation for the R component image, (ii) a multi-resolution LBP feature fusion scheme for the Cr component image, and (iii) a component-based DCT multiple face encoding for the Q component image. Third, at the decision level, the similarity matrices generated from the three RCrQ component images are fused using a weighted sum rule. Experiments on the Face Recognition Grand Challenge (FRGC) version 2 Experiment 4 show that the proposed method improves face recognition performance significantly. In particular, it achieves a face verification rate (ROC III curve) of 92.43% at a false accept rate of 0.1%, compared with the FRGC baseline of 11.86% at the same false accept rate.
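A rough sketch of assembling the RCrQ representation is shown below, using the standard ITU-R BT.601 definition of Cr and the NTSC YIQ definition of Q; the paper's exact scaling and offsets may differ, so treat the coefficients as assumptions.

```python
import numpy as np

def rgb_to_rcrq(rgb):
    """Build an RCrQ hybrid image from an RGB image.

    `rgb` is a float array of shape (..., 3) with values in [0, 1].
    Cr uses the BT.601 chroma definition; Q uses the NTSC YIQ
    definition (both without the usual 8-bit offset).
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b    # chroma component from YCbCr
    q = 0.2115 * r - 0.5227 * g + 0.3112 * b  # chroma component from YIQ
    return np.stack([r, cr, q], axis=-1)
```

Since both chroma definitions are zero-sum over (R, G, B), achromatic (grey) pixels map to zero in the Cr and Q channels, which is what makes these components complementary to the luminance-like R channel.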

3.
Novel image fusion approaches for spectral face images, including physics-based weighted fusion, illumination adjustment and rank-based decision-level fusion, are proposed to improve face recognition performance relative to conventional images. A new multispectral imaging system is briefly presented that, as a proof of concept, can acquire continuous spectral face images with fine spectral resolution in the visible spectrum. Several experiments are designed and validated by calculating the cumulative match characteristics of probe sets with the well-known recognition engine FaceIt®. Experimental results demonstrate that the proposed fusion methods outperform conventional images when gallery and probe images are acquired under different illuminations and with different time lapses. When probe images are acquired outdoors under different daylight conditions, the fused images outperform conventional images by up to 78%.

4.
Recently, Sparse Representation (or coding) based Classification (SRC) has achieved great success in face recognition. In SRC, the test image is expected to be best represented as a sparse linear combination of training images from the same class, and the representation fidelity is measured by the ℓ2-norm or ℓ1-norm of the coding residual. However, SRC emphasizes sparsity too much and overlooks spatial information during local feature encoding, which has been demonstrated to be critical in real-world face recognition. Other work considers spatial information but overlooks the differing discriminative ability of different face regions. In this paper, we propose to weight spatial locations based on their discriminative ability in sparse coding for robust face recognition. Specifically, we learn weights at face locations according to the information entropy of each face region, so as to highlight the locations in face images that are most important for classification. Furthermore, to construct robust weights that fully exploit the structural information of each face region, we learn the weights from external data covering a wide range of face image variations across different persons, which guarantees the robustness of the obtained weights. Finally, we consider the group structure of the training images (i.e., those from the same subject) and add an ℓ2,1-norm (group Lasso) constraint to the formulation, which enforces sparsity at the group level. Extensive experiments on three benchmark face datasets demonstrate that the proposed method is much more robust and effective than baseline methods in dealing with face occlusion, corruption, and lighting and expression changes.
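The entropy-based weighting idea can be sketched as follows: estimate the intensity entropy of each face patch over a set of (external) training images and normalize the entropies into weights. The patch size, bin count and normalization here are assumptions for illustration, not the paper's settings.

```python
import numpy as np

def region_entropy_weights(images, patch=8, bins=16):
    """One weight per non-overlapping patch, proportional to the
    information entropy of pixel intensities across the image set.

    `images`: array of shape (n, H, W), grey values in [0, 1].
    Higher-entropy (more variable, hence more discriminative under
    this heuristic) regions receive larger weights; weights sum to 1.
    """
    n, H, W = images.shape
    ent = []
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            vals = images[:, i:i + patch, j:j + patch].ravel()
            counts, _ = np.histogram(vals, bins=bins, range=(0, 1))
            p = counts / counts.sum()
            p = p[p > 0]
            ent.append(-(p * np.log2(p)).sum())  # Shannon entropy (bits)
    w = np.array(ent)
    return w / w.sum()
```

A flat region (e.g. a uniformly occluded patch) gets near-zero entropy and is down-weighted in the subsequent sparse coding.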

5.
This paper presents a real-time speech-driven talking face system with low computational complexity and smooth visual output. A novel embedded confusable-set scheme is proposed to generate an efficient phoneme-viseme mapping table: exploiting the visual ambiguity of visemes, phonemes are grouped with the Houtgast similarity approach based on viseme similarities estimated with a histogram distance. The generated mapping table simplifies the mapping problem and improves viseme classification accuracy. The implemented real-time system includes: 1) speech signal processing, with SNR-aware speech enhancement for noise reduction and ICA-based feature extraction for robust acoustic feature vectors; 2) recognition network processing, in which an HMM and a multi-class SVM (MCSVM) are combined for phoneme recognition and viseme classification: the HMM handles sequential inputs well, while the MCSVM classifies with good generalization, especially for limited samples, and uses the phoneme-viseme mapping table to decide which viseme class the HMM's observation sequence belongs to; and 3) visual processing, which arranges viseme lip-shape images in time sequence and improves realism using dynamic alpha blending with different alpha settings. In the experiments, the speech signal processing on noisy speech, compared with clean speech, gained improvements of 1.1% (16.7% to 15.6%) and 4.8% (30.4% to 35.2%) in PER and WER, respectively. For viseme classification, the error rate decreased from 19.22% to 9.37%. Finally, we simulated GSM communication between a mobile phone and a PC to rate visual quality and the speech-driven impression with mean opinion scores. Our method thus reduces the number of visemes and lip-shape images via confusable sets and enables real-time operation.

6.
This paper presents a new version of the support vector machine (SVM), named the ℓ2-ℓp SVM (0 < p < 1), which introduces the ℓp-norm (0 < p < 1) of the normal vector of the decision plane into the standard linear SVM. To solve the nonconvex optimization problem in our model, an efficient algorithm is proposed using the constrained concave-convex procedure. Experiments with artificial and real data demonstrate that our method is more effective than several popular methods at selecting relevant features and improving classification accuracy.
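The feature-selection effect of the ℓp penalty (0 < p < 1) can be illustrated with its scalar proximal problem: minimizing 0.5(w - z)² + λ|w|ᵖ thresholds small inputs to exactly zero while barely shrinking large ones. The grid search below is a crude stand-in for the paper's constrained concave-convex procedure (the nonconvex prox has no closed form for general p); it is an illustration of the penalty's behavior, not the paper's solver.

```python
import numpy as np

def lp_prox_scalar(z, lam, p, grid=2001):
    """Approximate argmin_w 0.5*(w - z)**2 + lam*|w|**p by grid search.

    For 0 < p < 1 this acts like a hard threshold: small |z| maps to 0
    (the coefficient, i.e. feature, is dropped), large |z| is kept
    nearly unshrunk.
    """
    ws = np.linspace(-abs(z) - 1.0, abs(z) + 1.0, grid)  # symmetric grid incl. 0
    obj = 0.5 * (ws - z) ** 2 + lam * np.abs(ws) ** p
    return ws[np.argmin(obj)]
```

This thresholding is exactly why an ℓp regularizer on the SVM normal vector zeroes out irrelevant feature weights more aggressively than the ℓ2 penalty of the standard SVM.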

7.
Convolutional neural networks (CNNs) have had great success on the object classification problem. For character classification, we found that training and testing CNNs on accurately segmented character regions yields higher accuracy than on roughly segmented regions. We therefore aim to extract complete character regions from scene images. Text in natural scene images contrasts clearly with its surroundings, and many methods attempt to extract characters through different segmentation techniques. However, for blurred, occluded, or complex-background cases, those methods may produce adjoined or over-segmented characters. In this paper, we propose a scene word recognition model that assembles entire words from small pieces after cluster-based segmentation. The segmented connected components are classified into four types: background, individual character proposals, adjoined characters, and stroke proposals. Individual character proposals are input directly to a CNN trained on accurately segmented character images. A sliding-window strategy is applied to adjoined character regions. Stroke proposals are treated as fragments of entire characters whose locations are estimated by a stroke spatial distribution system. The characters estimated from adjoined characters and stroke proposals are then classified by a CNN trained on roughly segmented character images. Finally, a lexicon-driven integration method produces the final word recognition results. Our method achieves performance comparable to other word recognition methods on the Street View Text, ICDAR 2003 and ICDAR 2013 benchmark databases. Moreover, it can recognize occluded text images and improperly segmented text images.

8.
The use of thermal images of a selected area of the head in screening systems, which perform fast and accurate analysis of the temperature distribution of individual areas, requires profiled image-analysis methods. Methods exist for automated face analysis, used at airports or train stations to detect people with fever, but they do not enable automatic separation of specific areas of the face. This paper presents an image-analysis algorithm that localizes characteristic areas of the face in thermograms. The algorithm is robust to inter-subject variability and to changes in the position and orientation of the head. In addition, an attempt was made to eliminate the impact of the background and of interference caused by hair and the hairline. The algorithm automatically adjusts its operating parameters to the prevailing room conditions. Compared to a previous study (Marzec et al., J Med Inform Tech 16:151-159, 2010), the set of thermal images was expanded by 34 images, so that the research material comprised 125 patient thermograms acquired in the Department of Pediatrics and Child and Adolescent Neurology in Katowice, Poland. The images were taken interchangeably with several thermal cameras: AGEMA 590 PAL (sensitivity 0.1 °C), ThermaCam S65 (sensitivity 0.08 °C), A310 (sensitivity 0.05 °C) and T335 (sensitivity 0.05 °C), each with a 320 × 240 pixel detector resolution, following the principles of medical thermography. Relative to Marzec et al. (2010), the approach presented there has been extended and modified. Comparison with other methods in the literature shows that this method is more comprehensive, as it determines the approximate areas of selected parts of the face using anthropometry, and it achieves better localization accuracy for the centers of the eye sockets and nostrils: 87% for the eyes and 93% for the nostrils.

9.
10.
Similar objects commonly appear in natural images, and locating and cutting out all of them can be tedious with classical interactive image segmentation methods. In this paper, we propose SimLocator, a robust method for locating and cutting out similar objects with minimal user interaction. After an arbitrary object template is extracted from the input image, candidate locations of similar objects are roughly detected using the shape and color features of the image. A novel optimization method is then introduced to select accurate locations from the two candidate sets. Additionally, a matting-based method is used to improve the results and to ensure that all similar objects in the image are located. Finally, a method based on alpha matting extracts the precise object contours; to ensure the performance of the matting operation, a new method for foreground extraction was developed. Experiments show that, for locating similar objects in images, SimLocator is more robust and more convenient to use than existing repetition detection and interactive image segmentation methods.
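A minimal stand-in for the candidate-detection step is normalized cross-correlation of the template against the image: correlation peaks mark locations of objects similar to the template. This sketch ignores SimLocator's shape/color feature design and its optimization stage.

```python
import numpy as np

def ncc_map(image, template):
    """Normalized cross-correlation of `template` over `image`
    (both 2-D float arrays). Values near 1 indicate candidate
    locations of similar objects; flat windows get the sentinel -1."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.linalg.norm(t)
    H, W = image.shape
    out = np.full((H - th + 1, W - tw + 1), -1.0)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            win = image[i:i + th, j:j + tw]
            wc = win - win.mean()
            denom = np.linalg.norm(wc) * tn
            if denom > 1e-12:
                out[i, j] = float((wc * t).sum() / denom)
    return out
```

Thresholding this map (and suppressing non-maxima) yields the rough candidate set that a method like SimLocator would then refine.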

11.
In this paper we propose a new approach for the dynamic selection of ensembles of classifiers. Based on the concept of multistage organizations, whose main objective is to define a multi-layer fusion function adapted to each recognition problem, we propose the dynamic multistage organization (DMO), which defines the best multistage structure for each test sample. Extending the approach of Dos Santos et al., we propose two implementations of DMO, namely DSAm and DSAc. The former uses a set of dynamic selection functions to generalize a DMO structure; the latter uses contextual information, represented by output profiles computed from the validation dataset, to conduct this task. The experimental evaluation, on both small and large datasets, demonstrated that DSAc dominated DSAm on most problems, showing that the use of contextual information can reach better performance than other existing methods. In addition, the performance of DSAc can be further enhanced by incremental learning. The most important observation, however, supported by additional experiments, is that dynamic selection is generally preferable to static approaches when the recognition problem presents a high level of uncertainty.

12.
We propose a novel appearance-based face recognition method called the marginFace approach. Using average neighborhood margin maximization (ANMM), face images are mapped into a face subspace for analysis. Unlike principal component analysis (PCA) and linear discriminant analysis (LDA), which effectively see only the global Euclidean structure of face space, ANMM discriminates face images of different people based on local information. More concretely, for each face image it pulls the neighboring images of the same person towards it as near as possible, while simultaneously pushing the neighboring images of different people away from it as far as possible. Moreover, we propose an automatic approach for determining the optimal dimensionality of the embedded subspace. The kernelized (nonlinear) and tensorized (multilinear) forms of ANMM are also derived in this paper. Finally, experimental results of applying marginFace to face recognition demonstrate the effectiveness of our method.

13.
There is a need for a new segmentation method to improve the efficiency of expert systems that rely on segmentation. Multilevel thresholding is a widely used technique that segments an image with a set of threshold values. From a computational standpoint, however, the search for optimal threshold values is challenging, especially when the number of thresholds is high, so a meta-heuristic or optimization algorithm is required. Our proposed algorithm, Rr-cr-IJADE, is an improved version of Rcr-IJADE. It uses a newly proposed mutation strategy, "DE/rand-to-rank/1", to improve the search success rate; the strategy combines F-parameter adaptation, crossover-rate repair, and the direction from a randomly selected individual toward a ranking-based leader. The complexity of the proposed algorithm does not increase compared with its ancestor. The performance of Rr-cr-IJADE, with Otsu's function as the objective, was evaluated against state-of-the-art evolutionary algorithms (EAs) and swarm intelligence (SI) algorithms on both 'low-level' and 'high-level' experimental sets. In the 'low-level' sets, the number of thresholds varied from 2 to 16 over 20 real images. In the 'high-level' sets, the threshold numbers were 24, 32, 40, 48, 56 and 64, over 2 synthetic pseudo-images, 7 satellite images, and 3 real images taken from the set of 20. Rr-cr-IJADE achieved higher success rates with lower threshold-value distortion (TVD) than the other state-of-the-art EA and SI algorithms.
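A "DE/rand-to-rank/1"-style mutation might look like the sketch below: each donor vector moves a randomly chosen individual toward a ranking-biased leader and adds a scaled random difference. The rank-proportional leader selection is an assumption about the paper's scheme, and the F adaptation and crossover-rate repair of Rr-cr-IJADE are omitted.

```python
import numpy as np

def rand_to_rank_mutation(pop, fitness, F, rng):
    """Donor generation for a differential-evolution step.

    pop:     (n, d) population, fitness: (n,) values to maximize.
    Leader is drawn with probability proportional to fitness rank;
    donor_i = x_r1 + F*(leader - x_r1) + F*(x_r2 - x_r3).
    """
    n, _ = pop.shape
    order = np.argsort(fitness)            # ascending
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)     # best individual gets rank n
    p = ranks / ranks.sum()
    donors = np.empty_like(pop)
    for i in range(n):
        leader = pop[rng.choice(n, p=p)]   # rank-biased leader pick
        r1, r2, r3 = rng.choice(n, size=3, replace=False)
        donors[i] = pop[r1] + F * (leader - pop[r1]) + F * (pop[r2] - pop[r3])
    return donors
```

In the thresholding application, each row of `pop` would hold one candidate threshold vector and `fitness` the corresponding Otsu objective values.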

14.
Recently, various feature extraction techniques and their variants have been proposed for computer vision. However, most of these techniques are sensitive to images acquired in uncontrolled environments. Illumination, expression and occlusion in face images produce random error entries in the 2-D matrix representing the face, and techniques such as Principal Component Analysis (PCA) do not handle these entries explicitly. This paper proposes a (Two-Dimensional)2 Whitening Reconstruction (T2WR) pre-processing step to be coupled with the PCA algorithm; the combined method handles illumination and expression variations better than standalone PCA. The technique is compared with the state-of-the-art Two-Dimensional Whitening Reconstruction (TWR) pre-processing method, and the results indicate why T2WR performs better: histograms plotted for both algorithms show that T2WR produces a smoother frequency distribution than TWR. The proposed method shows increasing recognition rate and accuracy with the number of training images: up to 93.82% for 2, 94.76% for 4 and 97.42% for 6 training images.
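The core whitening operation can be sketched as follows: decorrelate one dimension of the 2-D face matrix and equalize its variances, then apply the same operation to the other dimension. This is a generic eigendecomposition-based whitening sketch under my own choice of regularization; the actual TWR/T2WR formulations may differ in detail.

```python
import numpy as np

def whiten(X, eps=1e-6):
    """Whiten the column (feature) covariance of X, rows = samples.

    Centers X, eigendecomposes the covariance, and rescales each
    eigendirection to unit variance; `eps` guards rank deficiency.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    C = Xc.T @ Xc / (X.shape[0] - 1)
    vals, vecs = np.linalg.eigh(C)
    W = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T  # ZCA-style
    return Xc @ W

def two_dim_squared_whiten(X):
    """T2WR-style sketch: whiten along both dimensions of the matrix."""
    return whiten(whiten(X).T).T
```

Whitening both dimensions, rather than one as in TWR, is what suppresses correlated error entries along rows as well as columns.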

15.
Leaf area index (LAI) is one of the most important plant parameters when observing agricultural crops and a decisive factor for yield estimates. Remote-sensing data provide spectral information over large areas and allow a detailed quantitative assessment of LAI and other plant parameters. The present study compared support vector regression (SVR), random forest regression (RFR) and partial least-squares regression (PLSR), and the model qualities they achieved, for the assessment of LAI from wheat reflectance spectra. In this context, the validation technique used to verify the accuracy of an empirical-statistical regression model is very important for the spatial transferability of models to unknown data. Thus, two different validation methods, leave-one-out cross-validation (cv) and independent validation (iv), were performed to determine model accuracy. The LAI and field reflectance spectra of 124 plots were collected from four fields during two stages of plant development in 2011 and 2012. Under cross-validation, for the separate years as well as the entire data set, SVR provided the best results (2011: R2cv = 0.739; 2012: R2cv = 0.85; 2011 and 2012: R2cv = 0.944). Independent validation of the data set from both years led to completely different results: the accuracy of PLSR (R2iv = 0.912) and RFR (R2iv = 0.770) remained almost at the cross-validation level, while SVR showed a clear decline in model performance (R2iv = 0.769). The results indicate that regression model robustness largely depends on the applied validation approach and on the data range of the LAI used for model building.
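The two validation protocols compared in the study can be sketched with a simple regressor. Ridge regression stands in here for SVR/RFR/PLSR purely to keep the example self-contained; the point is the difference between leave-one-out predictions on the pooled data and evaluation on a disjoint hold-out set.

```python
import numpy as np

def ridge_fit(X, y, lam=1e-3):
    """Closed-form ridge regression weights (stand-in regressor)."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

def r2(y, yhat):
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def loo_cv_r2(X, y, lam=1e-3):
    """The 'cv' protocol: leave-one-out predictions on one data set."""
    preds = np.empty_like(y)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        w = ridge_fit(X[mask], y[mask], lam)
        preds[i] = X[i] @ w
    return r2(y, preds)

def independent_r2(Xtr, ytr, Xte, yte, lam=1e-3):
    """The 'iv' protocol: fit on one set, score on a disjoint one."""
    w = ridge_fit(Xtr, ytr, lam)
    return r2(yte, Xte @ w)
```

When training and independent sets come from different fields or years, the gap between these two scores is exactly the transferability effect the study reports for SVR.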

16.
Color image segmentation: advances and prospects
H. D., X. H., Y., Jingli. Pattern Recognition, 2001, 34(12): 2259-2281
Image segmentation is essential and critical to image processing and pattern recognition. This survey summarizes the color image segmentation techniques currently available. Color segmentation approaches are basically monochrome segmentation approaches operating in different color spaces. Therefore, we first discuss the major approaches for segmenting monochrome images: histogram thresholding, characteristic feature clustering, edge detection, region-based methods, fuzzy techniques, neural networks, etc.; we then review the major color representations and their advantages and disadvantages; finally, we summarize the color image segmentation techniques using the different color representations, discussing the use of color models for segmentation. Some novel approaches, such as fuzzy and physics-based methods, are investigated as well.

17.
Recent years have witnessed great progress in image deblurring. However, the deblurring of face images, an important application case, has not been well studied. Most existing face deblurring methods rely on exemplar set construction and candidate matching, which are not only computationally expensive but also vulnerable to complex or exaggerated face variations. To address these problems, we propose a novel face deblurring method that integrates the classical L0 deblurring approach with face landmark detection. A carefully tailored landmark detector detects the main face contours, and the detected contours are used as salient edges to guide blind image deconvolution. Extensive experimental results demonstrate that, compared with existing state-of-the-art approaches, the proposed method better handles various complex face poses, shapes and expressions while greatly reducing computation time.

18.
Reconstructing 3D face models from 2D face images is usually done using a single reference 3D face model or a few gender/ethnicity-specific 3D face models. However, different persons, even those of the same gender or ethnicity, usually have significantly different faces in overall appearance, which is what makes face-based person recognition possible. Consequently, existing 3D-reference-model based methods have limited capability to reconstruct precise 3D face models for a large variety of persons. In this paper, we propose to explore a reservoir of diverse reference models for 3D face reconstruction from forensic mugshot face images, where facial exemplars coherent with the input determine the final shape estimate. Specifically, our 3D face reconstruction is formulated as an energy minimization problem with: 1) a shading constraint from multiple input face images, 2) distortion- and self-occlusion-based color consistency between different views, and 3) a depth-uncertainty-based smoothness constraint on adjacent pixels. The proposed energy is minimized in a coarse-to-fine way, with the shape refinement step performed by a multi-label segmentation algorithm. Experimental results on challenging datasets demonstrate that the proposed algorithm recovers high-quality 3D face models, and we show that our reconstructed models successfully boost face recognition accuracy.

19.
This paper discusses approaches for the isolation of deep high-aspect-ratio through-silicon vias (TSVs) with respect to a Via Last approach for micro-electro-mechanical systems (MEMS). Selected TSV samples have depths in the range of 170…270 µm and a diameter of 50 µm. The investigations comprise the deposition of different layer stacks by means of subatmospheric and plasma-enhanced chemical vapour deposition (PECVD) of tetraethyl orthosilicate, Si(OC2H5)4 (TEOS). An etch-back approach and selective deposition on SiN were also included in the investigations. With respect to the Via Last approach, the contact opening at the TSV bottom by means of a specific spacer-etching method is also addressed in this paper. Step coverage values of up to 74% were achieved for the best of those approaches. As an alternative to the SiO2 isolation liners, a polymer coating based on the CVD of Parylene F was investigated, which yields even higher step coverage, around 80% at the lower TSV sidewall for a surface film thickness of about 1000 nm. Leakage current measurements gave values below 0.1 nA/cm2 at 10 kV/cm for the Parylene F films, a promising result for the intended application to Via Last MEMS-TSVs.

20.
Objective: Face pose variation is a major factor reducing face recognition accuracy. Using the 3D morphable model commonly employed in 3D face reconstruction together with deep convolutional neural networks, this paper proposes a face pose correction algorithm for multi-pose face recognition that improves recognition accuracy under large pose variations. Method: The traditional 3D morphable model fitting procedure is improved: the model is built from facial shape and expression parameters, and landmarks in different facial regions are assigned different weights in a weighted fitting, so that face images with different poses and expressions are fitted more accurately. The 3D face model is then pose-corrected, and deep learning is used to inpaint the resulting irregular hole regions in the face image; the latest partial-convolution technique is applied and the convolutional network is retrained on a new dataset to obtain optimal parameters. Results: The algorithm was compared with other methods on the LFW (Labeled Faces in the Wild) face database and the Stirling/ESRC (Economic and Social Research Council) 3D face database, and the experiments show an improvement in recognition accuracy. On LFW, after pose correction and inpainting of face images with arbitrary poses, the method reaches 96.57% recognition accuracy. On Stirling/ESRC, recognition accuracy improves by 5.195% and 2.265% at poses of ±22°, and by 5.875% and 11.095% at ±45°; the average recognition rates improve by 5.53% and 7.13%, respectively. Conclusion: The proposed pose correction algorithm combines the strengths of 3D morphable models and deep learning models and improves face recognition accuracy at all pose angles.
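The weighted 3D morphable model fitting step can be sketched, in linearized form, as a weighted least-squares fit of the shape/expression coefficients to detected landmarks. This sketch assumes a fixed pose/projection (the real pipeline also estimates pose and camera parameters), and the per-landmark weighting mirrors the paper's idea of weighting facial regions differently.

```python
import numpy as np

def fit_shape_params(landmarks, mean, basis, weights, reg=1e-3):
    """Weighted, regularized linear fit of morphable-model coefficients.

    Minimizes  sum_k w_k * ||x_k - (mean_k + B_k a)||^2 + reg*||a||^2,
    where `landmarks` and `mean` are flattened (3*n_landmarks,) arrays,
    `basis` is (3*n_landmarks, n_params), `weights` is (n_landmarks,).
    """
    W = np.repeat(weights, 3)                 # one weight per landmark -> x,y,z
    A = basis * W[:, None]                    # row-weighted basis
    M = basis.T @ A + reg * np.eye(basis.shape[1])
    rhs = A.T @ (landmarks - mean)            # = B^T W (x - mean)
    return np.linalg.solve(M, rhs)            # normal equations
```

Raising a region's weight (e.g. eye and mouth landmarks) makes the fitted coefficients reproduce that region more faithfully, at the cost of regions deemed less reliable under large poses.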
