Similar Documents (20 results)
1.
Image segmentation has been widely used in document image analysis to extract printed characters; in map processing to find lines, legends, and characters; in topological feature extraction to obtain geographical information; and in quality inspection of materials, where defective parts must be delineated, among many other applications. In image analysis, efficient segmentation of images into meaningful objects is important for classification and object recognition. This paper presents two novel methods for image segmentation based on Darwinian Particle Swarm Optimization (DPSO) and Fractional-Order Darwinian Particle Swarm Optimization (FODPSO), which determine the n−1 optimal thresholds for n-level thresholding of a given image. The efficiency of the proposed methods is compared with that of other well-known thresholding segmentation methods. Experimental results show that the proposed methods perform better than the other methods on a number of different measures.
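The thresholding objective behind this family of methods can be illustrated with a minimal sketch: a plain PSO searching the n−1 thresholds that maximize Otsu's between-class variance. This is only an illustration under assumed parameter values; the Darwinian selection and fractional-order velocity mechanisms of DPSO/FODPSO are not reproduced.

```python
# Illustrative sketch only: a plain PSO maximizing Otsu's between-class variance
# over n-1 thresholds. The Darwinian and fractional-order mechanisms of the
# paper's DPSO/FODPSO are not reproduced here; all constants are assumptions.
import numpy as np

def between_class_variance(hist, thresholds):
    """Otsu's between-class variance for a vector of thresholds over a gray-level histogram."""
    p = hist / hist.sum()
    levels = np.arange(len(hist))
    bounds = [0] + sorted(int(t) for t in thresholds) + [len(hist)]
    mu_total = (p * levels).sum()
    var = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        w = p[lo:hi].sum()
        if w > 0:
            mu = (p[lo:hi] * levels[lo:hi]).sum() / w
            var += w * (mu - mu_total) ** 2
    return var

def pso_thresholds(hist, n_levels=4, swarm=30, iters=100, seed=0):
    """Search the n_levels-1 thresholds that maximize the between-class variance."""
    rng = np.random.default_rng(seed)
    dim = n_levels - 1
    x = rng.uniform(1, len(hist) - 1, (swarm, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pval = np.array([between_class_variance(hist, xi) for xi in x])
    gbest = pbest[pval.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = np.clip(x + v, 1, len(hist) - 1)
        val = np.array([between_class_variance(hist, xi) for xi in x])
        improved = val > pval
        pbest[improved], pval[improved] = x[improved], val[improved]
        gbest = pbest[pval.argmax()].copy()
    return np.sort(gbest.astype(int))

# Usage: hist = np.bincount(gray_image.ravel(), minlength=256); t = pso_thresholds(hist)
```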

2.
This paper presents a novel face recognition method that fuses color, local spatial, and global frequency information. Specifically, the proposed method fuses multiple features derived from a hybrid color space, the Gabor image representation, local binary patterns (LBP), and the discrete cosine transform (DCT) of the input image. The novelty of this paper is threefold. First, a hybrid color space, the RCrQ color space, is constructed by combining the R component image of the RGB color space with the chromatic component images Cr and Q of the YCbCr and YIQ color spaces, respectively. The RCrQ hybrid color space, whose component images possess complementary characteristics, enhances the discriminating power for face recognition. Second, three effective image encoding methods are proposed for the component images in the RCrQ hybrid color space: (i) a patch-based Gabor image representation for the R component image, (ii) a multi-resolution LBP feature fusion scheme for the Cr component image, and (iii) a component-based DCT multiple face encoding for the Q component image. Finally, at the decision level, the similarity matrices generated from the three component images in the RCrQ hybrid color space are fused using a weighted sum rule. Experiments on the Face Recognition Grand Challenge (FRGC) version 2 Experiment 4 show that the proposed method improves face recognition performance significantly: it achieves a face verification rate (ROC III) of 92.43% at a false accept rate of 0.1%, compared to the FRGC baseline of 11.86% at the same false accept rate.
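As a rough illustration of the decision-level fusion step, the sketch below combines three per-component similarity matrices with a weighted sum; the z-score normalization and the example weights are assumptions, not values from the paper.

```python
# Sketch of weighted-sum fusion of similarity matrices from the R, Cr and Q components.
# The normalization and the weights below are illustrative assumptions.
import numpy as np

def zscore(s):
    return (s - s.mean()) / (s.std() + 1e-12)

def fuse_similarities(sim_r, sim_cr, sim_q, weights=(0.4, 0.3, 0.3)):
    """Each sim_* is a (num_probes x num_gallery) similarity matrix; returns fused scores."""
    mats = [zscore(m) for m in (sim_r, sim_cr, sim_q)]
    return sum(w * m for w, m in zip(weights, mats))

# predicted_gallery_index = fuse_similarities(sr, scr, sq).argmax(axis=1)
```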

3.
Novel image fusion approaches for spectral face images, including physics-based weighted fusion, illumination adjustment, and rank-based decision-level fusion, are proposed to improve face recognition performance relative to conventional images. A new multispectral imaging system is briefly presented that can acquire continuous spectral face images with fine spectral resolution in the visible spectrum as a proof of concept. Several experiments are designed and validated by calculating the cumulative match characteristics of probe sets using the well-known recognition engine FaceIt®. Experimental results demonstrate that the proposed fusion methods outperform conventional images when gallery and probe images are acquired under different illuminations and with different time lapses. When probe images are acquired outdoors under different daylight conditions, the fused images outperform conventional images by up to 78%.

4.
Recently, Sparse Representation based Classification (SRC) has achieved great success in face recognition. In SRC, a test image is expected to be best represented as a sparse linear combination of training images from the same class, and the representation fidelity is measured by the ℓ2-norm or ℓ1-norm of the coding residual. However, SRC emphasizes sparsity too much and overlooks spatial information during local feature encoding, which has been shown to be critical in real-world face recognition problems. Some work considers spatial information but overlooks the differing discriminative ability of different face regions. In this paper, we propose to weight spatial locations based on their discriminative abilities in sparse coding for robust face recognition. Specifically, we learn the weights at face locations according to the information entropy in each face region, so as to highlight locations in face images that are important for classification. Furthermore, in order to construct robust weights that fully exploit the structural information of each face region, we employ external data that cover all possible face image variations across different persons to learn the weights, so the robustness of the obtained weights can be guaranteed. Finally, we consider the group structure of training images (i.e., those from the same subject) and add an ℓ2,1-norm (group Lasso) constraint to the formulation, which enforces sparsity at the group level. Extensive experiments on three benchmark face datasets demonstrate that our proposed method is much more robust and effective than baseline methods in dealing with face occlusion, corruption, lighting and expression changes, etc.
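A minimal sketch of the entropy-based location weighting described above is shown below; the per-pixel binning, the normalization, and the name `faces` (standing for the external training images) are illustrative assumptions.

```python
# Sketch: for every spatial location, estimate the information entropy of its values
# across a set of external face images and use the normalized entropy as a weight
# on the coding residual. Binning and normalization choices are assumptions.
import numpy as np

def entropy_weights(faces, n_bins=32):
    """faces: array (num_images, H, W) with values in [0, 255]; returns an (H, W) weight map."""
    n, h, w = faces.shape
    bins = np.clip((faces.astype(np.float64) / 256.0 * n_bins).astype(int), 0, n_bins - 1)
    weights = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            counts = np.bincount(bins[:, i, j], minlength=n_bins)
            p = counts / counts.sum()
            p = p[p > 0]
            weights[i, j] = -(p * np.log(p)).sum()
    return weights / weights.max()

# Weighted coding residual for a test face y and reconstruction X @ alpha:
# residual = np.linalg.norm(entropy_weights(faces).ravel() * (y - X @ alpha))
```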

5.
Recently, various feature extraction techniques and their variations have been proposed for computer vision. However, most of these techniques are sensitive to images acquired in uncontrolled environments. Illumination, expression, and occlusion in face images result in random error entries in the 2-D matrix representing the face, and techniques such as Principal Component Analysis (PCA) do not handle these entries explicitly. This paper proposes a (Two-Dimensional)² Whitening Reconstruction (T2WR) pre-processing step to be coupled with the PCA algorithm. The combined method handles illumination and expression variations better than standalone PCA. The technique is compared with the state-of-the-art Two-Dimensional Whitening Reconstruction (TWR) pre-processing method, and the results clearly show why T2WR outperforms TWR: the histograms plotted for both algorithms show that T2WR produces a smoother frequency distribution than TWR. The proposed method shows increased recognition rate and accuracy as the number of training images grows: up to 93.82% for 2 training images, 94.76% for 4, and 97.42% for 6.

6.
This paper presents a real-time speech-driven talking face system with low computational complexity and a smooth visual appearance. A novel embedded confusable system is proposed to generate an efficient phoneme-viseme mapping table, constructed by grouping phonemes with the Houtgast similarity approach based on viseme similarity estimated with a histogram distance, following the notion of visually ambiguous visemes. The generated mapping table simplifies the mapping problem and improves viseme classification accuracy. The implemented real-time speech-driven talking face system includes: 1) speech signal processing, including SNR-aware speech enhancement for noise reduction and ICA-based feature extraction for robust acoustic feature vectors; 2) recognition network processing, in which an HMM and a multi-class SVM (MCSVM) are combined for phoneme recognition and viseme classification: the HMM handles sequential inputs well, while the MCSVM generalizes well in classification, especially with limited samples, and the phoneme-viseme mapping table is used by the MCSVM to classify the viseme class to which the HMM's observation sequence belongs; and 3) visual processing, which arranges the lip-shape images of visemes in time sequence and renders them more realistically using dynamic alpha blending with different alpha values. In the experiments, the speech signal processing applied to noisy speech, compared with clean speech, yields improvements of 1.1% (16.7% to 15.6%) in PER and 4.8% (30.4% to 35.2%) in WER, and the viseme classification error rate decreases from 19.22% to 9.37%. Finally, we simulate GSM communication between a mobile phone and a PC to rate visual quality and the speech-driven impression using the mean opinion score. In summary, our method reduces the number of visemes and lip-shape images via confusable sets and enables real-time operation.

7.
This paper presents a new version of the support vector machine (SVM), named the ℓ2-ℓp SVM (0 < p < 1), which introduces the ℓp-norm (0 < p < 1) of the normal vector of the decision plane into the standard linear SVM. To solve the nonconvex optimization problem in our model, an efficient algorithm is proposed using the constrained concave-convex procedure. Experiments with artificial and real data demonstrate that our method is more effective than some popular methods in selecting relevant features and improving classification accuracy.
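One plausible way to write the ℓ2-ℓp model described above (the relative weighting of the two regularizers is an assumption, not taken from the paper):

```latex
\min_{w,\,b,\,\xi}\ \ \frac{1}{2}\lVert w\rVert_2^2 \;+\; \lambda\,\lVert w\rVert_p^p \;+\; C\sum_{i=1}^{m}\xi_i
\qquad \text{s.t.}\quad y_i\bigl(w^{\top}x_i + b\bigr) \ge 1-\xi_i,\ \ \xi_i \ge 0,\ \ 0<p<1 .
```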

8.
Convolutional neural networks (CNNs) have had great success on the object classification problem. For character classification, we found that training and testing CNNs on accurately segmented character regions yields higher accuracy than using roughly segmented regions; we therefore aim to extract complete character regions from scene images. Text in natural scene images has an obvious contrast with its surroundings, and many methods attempt to extract characters through different segmentation techniques. However, for blurred, occluded, and complex-background cases, those methods may produce adjoined or over-segmented characters. In this paper, we propose a scene word recognition model that, after cluster-based segmentation, integrates words from small pieces into whole words. The segmented connected components are classified into four types: background, individual character proposals, adjoined characters, and stroke proposals. Individual character proposals are input directly to a CNN trained on accurately segmented character images. A sliding-window strategy is applied to adjoined character regions. Stroke proposals are treated as fragments of whole characters whose locations are estimated by a stroke spatial distribution system; the characters estimated from adjoined characters and stroke proposals are then classified by a CNN trained on roughly segmented character images. Finally, a lexicon-driven integration method produces the final word recognition results. Our method achieves performance comparable to other word recognition methods on the Street View Text, ICDAR 2003, and ICDAR 2013 benchmark databases. Moreover, it can handle occluded text images and improperly segmented text images.

9.
There is a need for a new segmentation method to improve the efficiency of expert systems that rely on segmentation. Multilevel thresholding is a widely used technique that uses threshold values for image segmentation. From a computational standpoint, however, the search for optimal threshold values is challenging, especially when the number of thresholds is high, so a meta-heuristic or optimization algorithm is required. Our proposed algorithm, referred to as Rr-cr-IJADE, is an improved version of Rcr-IJADE. Rr-cr-IJADE uses a newly proposed mutation strategy, “DE/rand-to-rank/1”, to improve the search success rate. The strategy combines parameter F adaptation, crossover-rate repairing, and the direction from a randomly selected individual to a ranking-based leader. The complexity of the proposed algorithm does not increase compared to its ancestor. The performance of Rr-cr-IJADE, using Otsu's function as the objective, was evaluated and compared with other state-of-the-art evolutionary algorithms (EAs) and swarm intelligence algorithms (SIs) under both 'low-level' and 'high-level' experimental sets. In the 'low-level' sets, the number of thresholds varied from 2 to 16 over 20 real images. In the 'high-level' sets, the threshold numbers were 24, 32, 40, 48, 56, and 64, over 2 synthetic pseudo-images, 7 satellite images, and 3 real images taken from the set of 20 real images. The proposed Rr-cr-IJADE achieved higher success rates with lower threshold value distortion (TVD) than the other state-of-the-art EAs and SIs.
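For reference, the classical differential evolution mutation strategies that “DE/rand-to-rank/1” builds on are shown below; per the abstract, the new strategy replaces the best individual of DE/rand-to-best/1 with a ranking-based leader and adapts F per individual (the exact formulation is not reproduced here).

```latex
\text{DE/rand/1:}\quad v_i = x_{r_1} + F\,(x_{r_2}-x_{r_3}),
\qquad
\text{DE/rand-to-best/1:}\quad v_i = x_{r_1} + F\,(x_{\mathrm{best}}-x_{r_1}) + F\,(x_{r_2}-x_{r_3}).
```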

10.
Using thermal images of a selected area of the head in screening systems, which perform fast and accurate analysis of the temperature distribution of individual areas, requires dedicated image analysis methods. Methods for automated face analysis exist that are used at airports or train stations to detect people with fever, but they do not enable automatic separation of specific areas of the face. This paper presents an image analysis algorithm that localizes characteristic areas of the face in thermograms. The algorithm is robust to variability between subjects as well as to changes in the position and orientation of the head. In addition, an attempt was made to eliminate the impact of background and of interference caused by hair and the hairline. The algorithm automatically adjusts its operating parameters to the prevailing room conditions. Compared to previous studies (Marzec et al., J Med Inform Tech 16:151–159, 2010), the set of thermal images was expanded by 34 images, so the research material comprised thermograms of 125 patients, acquired in the Department of Pediatrics and Child and Adolescent Neurology in Katowice, Poland. The images were taken interchangeably with several thermal cameras: AGEMA 590 PAL (sensitivity 0.1 °C), ThermaCam S65 (sensitivity 0.08 °C), A310 (sensitivity 0.05 °C), and T335 (sensitivity 0.05 °C), each with a 320 × 240 pixel detector resolution, following the principles of medical thermography. Compared to Marzec et al. (2010), the approach presented there has been extended and modified. A comparison with other methods in the literature shows that this method is more comprehensive, as it determines the approximate areas of selected parts of the face using anthropometry. As a result, better localization accuracy was obtained for the centers of the eye sockets and nostrils: 87% for the eyes and 93% for the nostrils.

11.
This paper proposes a novel automatic method for moment segmentation and peak detection in heart sound (HS) signals, paying special attention to the characteristics of HS envelopes and the properties of the Hilbert transform (HT). Moment segmentation and peak location are accomplished in two steps. First, by applying the Viola integral method in the time domain, the envelope (ET) of the HS signal is obtained, emphasizing the first heart sound (S1) and the second heart sound (S2). Then, based on the characteristics of the ET and the properties of the HT of convex and concave functions, a novel method, the short-time modified Hilbert transform (STMHT), is proposed to automatically locate the segmentation and peak points of the HS as the zero-crossing points of the STMHT. A fast algorithm for calculating the STMHT of the ET can be expressed as multiplying the ET by an equivalent window (WE). Based on the range of heart rates, numerical experiments, and the key parameters of the STMHT, a moving window width of N = 1 s is validated for locating the segmentation and peak points. The proposed procedure is validated on sounds from the Michigan HS database and on sounds from clinical heart diseases such as ventricular septal defect (VSD), atrial septal defect (ASD), tetralogy of Fallot (TOF), and rheumatic heart disease (RHD). For sounds in which S2 can be separated from S1, the average accuracies achieved for the peak of S1 (AP1), the peak of S2 (AP2), the segmentation points from S1 to S2 (AT12), and the cardiac cycle (ACC) are 98.53%, 98.31%, 98.36%, and 97.37%, respectively. For sounds in which S1 cannot be separated from S2, the average accuracies achieved for the peaks of S1 and S2 (AP12) and the cardiac cycle (ACC) are 100% and 96.69%.
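A generic sketch of the envelope-plus-zero-crossing idea is given below using the standard Hilbert transform; it is not the authors' Viola-integral envelope or STMHT, and the sampling rate and filter settings are assumptions.

```python
# Illustrative sketch only: a generic Hilbert-transform envelope and zero-crossing
# detector for a heart-sound signal; the paper's ET and STMHT are not reproduced.
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def hs_envelope(x, fs, cutoff_hz=20.0):
    """Analytic-signal envelope, low-pass smoothed so that S1/S2 lobes remain."""
    env = np.abs(hilbert(x - np.mean(x)))
    b, a = butter(2, cutoff_hz / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def zero_crossings(y):
    """Indices where the (detrended) signal crosses zero, candidate segmentation points."""
    s = np.sign(y - np.mean(y))
    return np.where(np.diff(s) != 0)[0]

# env = hs_envelope(heart_sound, fs=2000)
# candidates = zero_crossings(np.diff(env))   # extrema of env ~ S1/S2 peak candidates
```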

12.
13.
This paper discusses approaches for the isolation of deep, high-aspect-ratio through-silicon vias (TSVs) with respect to a Via Last approach for micro-electro-mechanical systems (MEMS). Selected TSV samples have depths in the range of 170–270 µm and a diameter of 50 µm. The investigations comprise the deposition of different layer stacks by means of subatmospheric and plasma-enhanced chemical vapour deposition (PECVD) of tetraethyl orthosilicate, Si(OC2H5)4 (TEOS). An etch-back approach and selective deposition on SiN were also included in the investigations. With respect to the Via Last approach, contact opening at the TSV bottom by means of a specific spacer-etching method is also addressed in this paper. Step coverage values of up to 74% were achieved for the best of these approaches. As an alternative to the SiO2 isolation liners, a polymer coating based on the CVD of Parylene F was investigated, which yields even higher step coverage, around 80% at the lower TSV sidewall, for a surface film thickness of about 1000 nm. Leakage current measurements were performed, and values below 0.1 nA/cm2 at 10 kV/cm were determined for the Parylene F films, a promising result for the intended application to Via Last MEMS-TSVs.

14.
Similar objects commonly appear in natural images, and locating and cutting out these objects can be tedious with classical interactive image segmentation methods. In this paper, we propose SimLocator, a robust method for locating and cutting out similar objects with minimal user interaction. After an arbitrary object template is extracted from the input image, candidate locations of similar objects are roughly detected by comparing shape and color features. A novel optimization method is then introduced to select accurate locations from the two sets of candidates. Additionally, a matting-based method is used to refine the results and to ensure that all similar objects in the image are located. Finally, a method based on alpha matting is used to extract precise object contours. To ensure the performance of the matting operation, a new method for foreground extraction was developed. Experiments show that SimLocator is more robust and more convenient to use than other advanced repetition-detection and interactive image segmentation methods for locating similar objects in images.

15.
There has been increasing interest in face recognition in recent years, and many recognition methods have been developed, some very encouraging. A key remaining issue is the existence of variations in the input face image. Today, methods exist that can handle specific image variations, but we are yet to see methods that can be used effectively in unconstrained situations. This paper presents a method that can handle partial translation, rotation, or scale variations in the input face image. The principle is to automatically identify objects within images using their partial self-similarities. The paper presents two recognition methods for recognising objects within images. A face recognition system is then presented that is insensitive to limited translation, rotation, or scale variations in the input face image. The performance of the system is evaluated through four experiments, and the results show that it achieves higher recognition rates than a number of existing approaches. The author would like to thank the Australian Research Council (ARC), which supports this research with a Discovery Grant.

16.
There are still many challenging problems in facial gender recognition, mainly due to the complex variations of face appearance. Although there has been tremendous research effort toward robust gender recognition over the past decade, none has explicitly exploited the domain knowledge of the difference in appearance between males and females. A moustache contributes substantially to this difference and could be a good feature to incorporate into facial gender recognition, yet little work on moustache segmentation has been reported in the literature. In this paper, a novel real-time moustache detection method is proposed that combines face feature extraction, image decolorization, and texture detection. Image decolorization, which converts a color image to grayscale, aims to enhance color contrast while preserving the grayscale appearance; since a moustache normally appears gray and is surrounded by skin-colored face tissue, decolorization offers a fast and efficient way to segment it. To make the algorithm robust to variations in illumination and head pose, an adaptive decolorization segmentation is proposed in which both the segmentation threshold selection and the moustache region following are guided by special regions defined by their geometric relationship to the salient facial features. Furthermore, a texture-based moustache classifier is developed to compensate for the decolorization-based segmentation, which could otherwise mistake darker skin or shadows around the mouth, caused by smile lines or thicker skin, for a moustache. A face is verified as containing a moustache only when (1) a sufficiently large moustache region is found by the decolorization segmentation, and (2) the segmented region is classified as moustache by the texture-based detector. Experimental results on the color FERET database show that the proposed approach achieves an 89% moustache face detection rate at a 0.1% false acceptance rate. By incorporating the moustache detector into a facial gender recognition system, gender recognition accuracy on a large database is improved from 91% to 93.5%.

17.
In this paper we propose a new approach for the dynamic selection of ensembles of classifiers. Building on the concept of multistage organizations, whose main objective is to define a multi-layer fusion function adapted to each recognition problem, we propose dynamic multistage organization (DMO), which defines the best multistage structure for each test sample. By extending Dos Santos et al.'s approach, we propose two implementations of DMO, namely DSAm and DSAc. The former considers a set of dynamic selection functions to generalize a DMO structure, while the latter uses contextual information, represented by output profiles computed from the validation dataset, to conduct this task. The experimental evaluation, on both small and large datasets, demonstrated that DSAc dominated DSAm on most problems, showing that the use of contextual information can reach better performance than other existing methods. In addition, the performance of DSAc can also be enhanced through incremental learning. The most important observation, supported by additional experiments, is that dynamic selection is generally preferred over static approaches when the recognition problem presents a high level of uncertainty.

18.
Objective: Texture feature extraction has long been a hot and difficult topic in remote sensing image analysis. Existing texture feature extraction methods focus mainly on single-band grayscale remote sensing images; how to extract texture features from multi-band color remote sensing images is a research frontier of multispectral remote sensing. Method: A fractal dimension estimation method for color remote sensing images based on manifold learning is proposed. The method uses locally linear embedding to reduce the dimensionality of the 5-D Euclidean hypersurface composed of color attributes, and the dimension-reduced color attributes are then used for fractal dimension estimation. Results: Experiments on Landsat-7 and GeoEye-1 satellite data show that, compared with other fractal dimension estimation methods such as the Peleg and Sarkar methods, the proposed method yields smaller fitting errors: the mean fitting errors E of the four comparison methods are 26.2, 5, 26.3, and 5 times that of the proposed method, respectively. In addition, the proposed method not only provides fractal dimensions with better classification properties, but also provides fractal dimensions that are more robust than those of the four comparison methods. Conclusion: For medium- and low-resolution true-color and false-color remote sensing images, as well as high-resolution color composite remote sensing images, the proposed method can exploit the color attribute information of different ground objects to extract the corresponding texture information, effectively improving the ability of the fractal dimension to distinguish different ground objects. This is of positive significance for subsequent study of the distribution of different types of ground objects in each region and for regional planning and development based on those distributions.
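A small sketch of the dimensionality-reduction step is shown below, embedding per-pixel 5-D attribute vectors with locally linear embedding; the attribute definition (x, y, R, G, B), the pixel subsampling, and the LLE settings are assumptions for illustration, and the fractal-dimension fit itself is not shown.

```python
# Sketch: reduce per-pixel 5-D color attribute vectors with locally linear embedding
# before fractal-dimension estimation. Attribute choice and parameters are assumptions.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

def reduce_color_attributes(img, n_components=1, n_neighbors=10, max_pixels=5000, seed=0):
    """img: (H, W, 3) RGB array; returns an embedded attribute value per sampled pixel."""
    h, w, _ = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.column_stack([xs.ravel(), ys.ravel(),
                             img[..., 0].ravel(), img[..., 1].ravel(), img[..., 2].ravel()])
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(feats), size=min(max_pixels, len(feats)), replace=False)
    lle = LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=n_components)
    return lle.fit_transform(feats[idx].astype(float))
```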

19.
We propose a novel appearance-based face recognition method called the marginFace approach. Using average neighborhood margin maximization (ANMM), face images are mapped into a face subspace for analysis. Unlike principal component analysis (PCA) and linear discriminant analysis (LDA), which effectively see only the global Euclidean structure of the face space, ANMM discriminates face images of different people based on local information. Concretely, for each face image, it pulls the neighboring images of the same person as close as possible while simultaneously pushing the neighboring images of different people as far away as possible. Moreover, we propose an automatic approach for determining the optimal dimensionality of the embedded subspace. The kernelized (nonlinear) and tensorized (multilinear) forms of ANMM are also derived in this paper. Finally, experimental results of applying marginFace to face recognition are presented to show the effectiveness of our method.
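The pull/push idea can be written as the average neighborhood margin that ANMM maximizes (the notation below may differ from the authors'): with projection W, homogeneous neighborhood N_i^o (same person) and heterogeneous neighborhood N_i^e (different people),

```latex
\gamma(W) \;=\; \sum_{i}\Biggl(
  \frac{1}{\lvert N_i^{e}\rvert}\sum_{k:\,x_k\in N_i^{e}} \bigl\lVert W^{\top}x_i - W^{\top}x_k \bigr\rVert^2
  \;-\;
  \frac{1}{\lvert N_i^{o}\rvert}\sum_{j:\,x_j\in N_i^{o}} \bigl\lVert W^{\top}x_i - W^{\top}x_j \bigr\rVert^2
\Biggr) \;\longrightarrow\; \max_{W}.
```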

20.
Color image segmentation: advances and prospects. H. D., X. H., Y., Jingli. Pattern Recognition, 2001, 34(12): 2259–2281.
Image segmentation is essential and critical to image processing and pattern recognition. This survey provides a summary of the color image segmentation techniques available today. Basically, color segmentation approaches are based on monochrome segmentation approaches operating in different color spaces. We therefore first discuss the major approaches for segmenting monochrome images: histogram thresholding, characteristic feature clustering, edge detection, region-based methods, fuzzy techniques, neural networks, etc.; then review some major color representations and their advantages and disadvantages; and finally summarize the color image segmentation techniques that use different color representations. The usage of color models for image segmentation is also discussed, and some novel approaches such as fuzzy and physics-based methods are investigated as well.
