Similar articles
 Found 20 similar articles (search time: 50 ms)
1.
Pattern recognition generally requires that objects be described in terms of a set of measurable features. The selection and quality of the features representing each pattern affect the success of subsequent classification. Feature extraction is the process of deriving new features from the original features to reduce the cost of feature measurement, increase classifier efficiency, and allow higher accuracy. Many feature extraction techniques involve linear transformations of the original pattern vectors to new vectors of lower dimensionality. While this is useful for data visualization and classification efficiency, it does not necessarily reduce the number of features to be measured, since each new feature may be a linear combination of all of the features in the original pattern vector. Here, we present a new approach to feature extraction in which feature selection, feature extraction, and classifier training are performed simultaneously using a genetic algorithm. The genetic algorithm optimizes a feature weight vector used to scale the individual features in the original pattern vectors. A masking vector is also employed for simultaneous selection of a feature subset. We employ this technique in combination with the k-nearest-neighbor classification rule and compare the results with classical feature selection and extraction techniques, including sequential floating forward feature selection and linear discriminant analysis. We also present results for the identification of favorable water-binding sites on protein surfaces.
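The weight-and-mask idea in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: the function name `loo_knn_accuracy`, the toy data, and the use of leave-one-out accuracy as the fitness value are all assumptions made for illustration. A genetic algorithm would call such a function to score each candidate (weights, mask) pair; the evolutionary loop itself is omitted.

```python
import math

def loo_knn_accuracy(X, y, weights, mask, k=1):
    """Leave-one-out k-NN accuracy after scaling each feature by weight * mask.

    Hypothetical fitness function: a mask entry of 0 removes a feature,
    a weight rescales the features that remain.
    """
    scale = [w * m for w, m in zip(weights, mask)]
    scaled = [[v * s for v, s in zip(row, scale)] for row in X]
    correct = 0
    for i, xi in enumerate(scaled):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted(
            (math.dist(xi, xj), y[j]) for j, xj in enumerate(scaled) if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)
        if predicted == y[i]:
            correct += 1
    return correct / len(X)

# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [0.2, 3.0], [1.0, -5.0], [1.1, 4.0], [1.2, -3.0]]
y = [0, 0, 0, 1, 1, 1]
acc_informative = loo_knn_accuracy(X, y, weights=[1.0, 1.0], mask=[1, 0])
acc_noise = loo_knn_accuracy(X, y, weights=[1.0, 1.0], mask=[0, 1])
```

On this toy data, masking out the noise feature scores higher than masking out the informative one, which is the signal a genetic search over masks and weights would exploit.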

2.
Support vector machines (SVMs) are a very powerful technique for pattern classification problems. However, their efficiency in practice depends heavily on the choice of kernel function type and the relevant parameter values. Selecting relevant features is another factor that can affect SVM performance. Identifying the best set of parameter values for a classification model such as an SVM can be treated as an optimization problem. In this paper, we therefore aim to optimize SVM parameters and the feature subset simultaneously, using different kernel functions. We cast this as a multi-objective optimization problem in which the classification accuracy, the number of support vectors, the margin, and the number of selected features define the objective functions. To solve this optimization problem, a method based on the multi-objective genetic algorithm NSGA-II is suggested. A multi-criteria selection operator for our NSGA-II is also introduced. The proposed method is tested on several benchmark data sets. The experimental results show that the proposed method reduces the number of features while improving classification accuracy.
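As a rough illustration of the multi-objective formulation in the abstract above, the sketch below shows the Pareto-dominance test on which NSGA-II ranking is built. It is not the paper's selection operator. The objective directions (maximize accuracy and margin, minimize support vectors and features), the helper names, and the candidate values are assumptions chosen for the example.

```python
def to_minimization(accuracy, n_support_vectors, margin, n_features):
    """Express all four objectives as quantities to minimize.

    Assumed directions: accuracy and margin are maximized (so negated),
    support-vector count and feature count are minimized.
    """
    return (-accuracy, n_support_vectors, -margin, n_features)

def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]

# Made-up candidate solutions: (accuracy, support vectors, margin, features).
candidates = [
    to_minimization(0.95, 40, 1.2, 10),
    to_minimization(0.90, 60, 1.0, 12),  # worse than the first on every objective
    to_minimization(0.97, 80, 0.9, 15),  # more accurate, but costlier elsewhere
]
front = pareto_front(candidates)
```

The second candidate is dominated and drops out, while the first and third remain as trade-offs; a full NSGA-II would additionally sort the population into successive fronts and apply crowding-distance selection.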

3.
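张阳, 王小宁. 《计算机应用》 (Journal of Computer Applications), 2021, 41(11): 3151-3155
Text features are a key component of natural language processing. To address the high dimensionality and sparsity of current text features, a text feature selection method based on Word2Vec word embedding and GARBO, a genetic algorithm for high-dimensional biological gene selection, is proposed to facilitate subsequent text classification. First, the input form of the data is optimized: the Word2Vec word embedding method converts text into word vectors that resemble gene representations. Next, the high-dimensional word vectors are evolved iteratively in a manner that mimics gene expression. Finally, a random forest classifier classifies the text after feature selection. Experiments on a Chinese review data set show that the optimized GARBO feature selection method is effective for text feature selection: it reduces the 300-dimensional features to 50 more valuable dimensions and reaches a classification accuracy of 88%. Compared with other filter-based text feature selection methods, it effectively reduces text feature dimensionality and improves text classification performance.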

4.
Spectro-temporal representation of speech has become one of the leading signal representation approaches in speech recognition systems in recent years. This representation suffers from the high dimensionality of the feature space, which makes the domain unsuitable for practical speech recognition systems. In this paper, a new clustering-based method is proposed for secondary feature selection/extraction in the spectro-temporal domain. In the proposed representation, Gaussian mixture model (GMM) and weighted K-means (WKM) clustering techniques are applied to the spectro-temporal domain to reduce the dimensionality of the feature space. The elements of the centroid vectors and covariance matrices of the clusters are taken as the attributes of the secondary feature vector of each frame. To evaluate the proposed approach, the new feature vectors were tested on the classification of phonemes within the main phoneme categories of the TIMIT database. The proposed secondary feature vector yielded a significant improvement in the classification rate of different sets of phonemes compared with MFCC features. The average improvement in the classification rate of voiced plosives over MFCC features is 5.9% using WKM clustering and 6.4% using GMM clustering. The greatest improvement, about 7.4%, is obtained with WKM clustering in the classification of front vowels compared with MFCC features.

5.
Classification of intrusion attacks and normal network traffic is a challenging and critical problem in pattern recognition and network security. In this paper, we present a novel intrusion detection approach that extracts both accurate and interpretable fuzzy IF-THEN rules from network traffic data for classification. The proposed fuzzy rule-based system is evolved from an agent-based evolutionary framework and multi-objective optimization. In addition, the proposed system can act as a genetic feature selection wrapper that searches for an optimal feature subset for dimensionality reduction. To evaluate the classification and feature selection performance of our approach, we compare it with several well-known classifiers as well as feature selection filters and wrappers. Extensive experimental results on the KDD-Cup99 intrusion detection benchmark data set demonstrate that the proposed approach produces interpretable fuzzy systems and outperforms other classifiers and wrappers, providing the highest detection accuracy for intrusion attacks and a low false alarm rate for normal network traffic with a minimal number of features.

6.
Facial features under varying expressions and partial occlusions can degrade overall face recognition performance. As a solution, we suggest determining the contribution of these features to the final classification. To represent the contribution of facial features according to their variations, we propose a feature selection process that describes facial features as local independent component analysis (ICA) features. These local features are acquired using a locally lateral subspace (LLS) strategy. Then, through linear discriminant analysis (LDA), we investigate the intraclass and interclass representation of each local ICA feature and express each feature's contribution via a weighting process. Using these weights, we define the contribution of each feature at the local classifier level. To recognize faces under the single-sample constraint, we implement the LLS strategy on locally linear embedding (LLE) along with the proposed feature selection. Additionally, we highlight the efficiency of the implementation of the LLS strategy. The overall accuracy achieved by our approach on data sets with different facial expressions and partial occlusions, such as AR, JAFFE, FERET and CK+, is 90.70%. We also present survey results on face recognition performance and physiological feature selection performed by human subjects.

7.
8.
Feature selection for very high-resolution (VHR) images is a key prerequisite for supervised classification. However, it is always difficult to acquire the features that correlate most strongly with the type of land cover in order to improve classification accuracy. To address this problem, this paper proposes a feature selection methodology that uses the results of multiple segmentation via a genetic algorithm (GA) and correlation feature selection (CFS), integrated with a sparse auto-encoder (SAE). First, 61 features, including spectral and spatial features, are extracted from the results of multi-scale segmentation of a WorldView-2 image of Xicheng District, Beijing. Then, 40-dimensional and 30-dimensional features are derived from selection with GA+CFS and optimization with SAE, respectively. Third, the final classification is achieved by logistic regression (LR) based on different subsets of features extracted from the WorldView-2 image. It is found that feature selection could contribute to an increase in the intra-species separation and a reduction in the inner-species variability. Adding extra lower-ranked features appeared to reduce classification accuracy. The results indicate that the overall classification accuracy with 30-dimensional features reached 87.56%, an increase of 5.61% over the results with 61-dimensional features. For the two kinds of optimized features, the Z-test values are all greater than 1.96, which implies that feature dimensionality reduction and feature space optimization can significantly improve the accuracy of image land cover classification. The texture features in the wavelet domain are the most important features for the study area in the WorldView-2 image classification. Adding wavelet and grey-level co-occurrence matrix (GLCM) information, especially GLCM features in the wavelet domain, appeared not to improve classification accuracy. The SAE-based method can produce feature subsets that improve mapping accuracy more efficiently.

9.
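Facial expression feature selection based on rough sets   Cited by: 1 (self-citations: 1, citations by others: 0)
To address the excessively high dimensionality of the extracted feature vectors, an improved rough-set attribute reduction algorithm is proposed. Local feature vectors of facial expressions are obtained with a geometric feature-point method. Rough set theory is then introduced, and the improved attribute reduction algorithm is used to optimize the selection of the extracted expression features, removing redundant features and irrelevant information that does not help expression classification. Experimental results show that the method is easy to implement, achieves a high recognition rate, and greatly reduces recognition time, demonstrating its effectiveness.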

10.
Machine learning-based classification techniques support the decision-making process in many areas of health care, including diagnosis, prognosis, and screening. Feature selection (FS) is expected to improve classification performance, particularly in situations characterized by the high data dimensionality problem, in which there are relatively few training examples compared with a large number of measured features. In this paper, a random forest classifier (RFC) approach is proposed to diagnose lymph diseases. Focusing on feature selection, the first stage of the proposed system aims at constructing diverse feature selection algorithms, such as genetic algorithm (GA), Principal Component Analysis (PCA), Relief-F, Fisher, Sequential Forward Floating Search (SFFS) and Sequential Backward Floating Search (SBFS), to reduce the dimension of the lymph diseases data set. Switching from feature selection to model construction, in the second stage the obtained feature subsets are fed into the RFC for efficient classification. GA-RFC achieved the highest classification accuracy, 92.2%. The dimension of the input feature space is reduced from eighteen to six features by using GA.

11.
Manifold learning has been successfully applied to facial expression recognition by modeling different expressions as a smooth manifold embedded in a high-dimensional space. However, the single-manifold assumption is still arguable and therefore does not necessarily guarantee the best classification accuracy. In this paper, a generalized framework for modeling and recognizing facial expressions on multiple manifolds is presented, which assumes that different expressions may reside on different manifolds of possibly different dimensionalities. The intrinsic features of each expression are first learned separately, and a genetic algorithm (GA) is then employed to obtain the nearly optimal dimensionality of each expression manifold from the classification viewpoint. Classification is performed under a newly defined criterion based on the minimum reconstruction error on the manifolds. Extensive experiments on both the Cohn-Kanade and Feedtum databases show the effectiveness of the proposed multiple-manifold-based approach.

12.
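To address the problem that a large number of features strongly affects the real-time performance and accuracy of face recognition, a face recognition method using combined ReliefF and SVM RFE feature selection is proposed. The discrete cosine transform extracts features and ReliefF performs an initial selection on the face image feature set to reduce the dimensionality of the feature space. An improved SVM RFE (Support Vector Machine Recursive Feature Elimination) then selects the optimal features, which resolves the long running time caused by the repeated training that SVM RFE requires when the number of features is large. Leave-one-out cross-validation is applied to the feature ranking list obtained from training to choose the optimal subset, which is then classified by an SVM. Experiments on the UMIST face database show that a recognition rate of 98.84% can be reached with 52 features, with a recognition time of only 0.037 s.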

13.
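Gabor feature selection method for facial expression recognition   Cited by: 3 (self-citations: 0, citations by others: 3)
To address the redundancy of high-dimensional Gabor feature vectors in facial expression recognition, a two-layer Gabor feature selection method is proposed. The method first filters the high-dimensional vectors using an improved variance ratio as the measure of each feature's discriminative power, then applies AdaBoost feature selection to the filtered feature subset to pick out the most discriminative features, thereby reducing the dimensionality of the Gabor feature representation. Experimental results confirm the effectiveness of the proposed method, which strikes a good balance between training time and recognition performance.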
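The first (filter) layer described above can be sketched as follows. The abstract does not define its improved variance ratio, so this example substitutes the plain ratio of between-class to within-class variance. The function names `variance_ratio` and `filter_features` and the toy data are assumptions, and the second-layer AdaBoost selection is not shown.

```python
from statistics import mean, pvariance

def variance_ratio(values, labels):
    """Between-class variance of the class means over mean within-class variance.

    Plain variance ratio, used here as a stand-in; not the paper's improved variant.
    """
    classes = sorted(set(labels))
    groups = [[v for v, l in zip(values, labels) if l == c] for c in classes]
    between = pvariance([mean(g) for g in groups])
    within = mean([pvariance(g) for g in groups])
    return between / within if within > 0 else float("inf")

def filter_features(X, labels, keep):
    """Return the indices of the `keep` features with the highest variance ratio."""
    n_features = len(X[0])
    scores = [variance_ratio([row[j] for row in X], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])

# Made-up data: feature 0 differs between classes, feature 1 does not.
X = [[1.0, 0.3], [1.2, 0.9], [3.0, 0.4], [3.2, 0.8]]
labels = [0, 0, 1, 1]
selected = filter_features(X, labels, keep=1)
```

In a two-layer scheme, the indices returned here would define the reduced feature subset passed on to the boosting-based selection stage.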

14.
Facial expression recognition has recently become an important research area, and many efforts have been made in facial feature extraction and classification to improve face recognition systems. Most researchers adopt a posed facial expression database in their experiments, but in real-life situations facial expressions may not be very obvious. This article describes the extraction of the minimum number of Gabor wavelet parameters for the recognition of natural facial expressions. The objective of our research was to investigate the performance of a facial expression recognition system with a minimum number of Gabor wavelet features. In this research, principal component analysis (PCA) is employed to compress the Gabor features. We also discuss the selection of the minimum number of Gabor features that performs best in a recognition task employing a multiclass support vector machine (SVM) classifier. The performance of facial expression recognition using our approach is compared with results obtained previously by other researchers using other approaches. Experimental results showed that our proposed technique successfully recognizes natural facial expressions using a small number of Gabor features, with an 81.7% recognition rate. In addition, we identify the relationship between human vision and computer vision in recognizing natural facial expressions.

15.
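To address the weakly targeted extraction of effective features and the low recognition accuracy of traditional convolutional neural networks in facial expression recognition, a facial expression recognition method based on a multi-scale feature attention mechanism is proposed. Two convolutional layers extract shallow feature information; dilated convolutions are added in parallel to the Inception structure to extract multi-scale feature information of facial expressions; a channel attention mechanism is introduced to strengthen the model's ability to represent important feature information; finally, the resulting features are fed into a Softmax layer for classification. In simulation experiments on the public FER2013 and CK+ data sets, the method achieved recognition accuracies of 68.8% and 96.04%, respectively, showing better recognition performance than many classical algorithms.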

16.
17.
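To address overfitting of recognition models on small-sample facial expression databases, this paper proposes a group sparse representation classification method based on feature selection and dictionary optimization. First, a feature selection criterion is proposed that selects complementary features with the same class-level sparsity pattern and different within-class sparsity patterns to build dictionaries. Then, the dictionaries are optimized by maximum scatter difference learning, so that they reconstruct features without distortion while retaining high discriminative power. Finally, the optimized dictionaries are combined for group sparse representation classification. On JAFFE, CK+...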

18.
In our previously developed method for recognizing the facial expression of a speaker, the positions of feature vectors in the feature vector space were generated with imperfections during image processing. These imperfect feature vectors, which caused misrecognition of the facial expression, tended to lie far from the center of gravity of the class to which they belonged. In the present study, to omit the feature vectors generated with imperfections, a method using reject criteria in the feature vector space was applied to facial expression recognition. Using the proposed method, the facial expressions of two subjects were discriminable with 86.8% accuracy for the three facial expressions of “happy”, “neutral”, and “others” when they exhibited one of the five intentional facial expressions of “angry”, “happy”, “neutral”, “sad”, and “surprised”, whereas these expressions were discriminable with 78.0% accuracy by the conventional method. Moreover, the proposed method effectively judged whether the training data were acceptable for facial expression recognition at the moment.
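One simple way to picture a reject criterion of the kind described above is a distance threshold around each class's center of gravity. The abstract does not state the actual criteria, so the sketch below is only an assumed variant: the function names, the Euclidean distance, the threshold `max_distance`, and the toy data are all hypothetical.

```python
import math

def class_centroids(X, y):
    """Compute the center of gravity (mean vector) of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def classify_with_reject(x, centroids, max_distance):
    """Return the nearest class, or None when x is too far from every centroid.

    Assumed reject rule: vectors beyond max_distance are treated as imperfect.
    """
    label, dist = min(
        ((l, math.dist(x, c)) for l, c in centroids.items()), key=lambda t: t[1]
    )
    return label if dist <= max_distance else None

# Made-up two-dimensional feature vectors for two expression classes.
X = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
y = ["happy", "happy", "neutral", "neutral"]
centroids = class_centroids(X, y)
accepted = classify_with_reject([0.5, 1.0], centroids, max_distance=2.0)
rejected = classify_with_reject([5.0, 6.0], centroids, max_distance=2.0)
```

Here the first query lies close to the "happy" centroid and is accepted, while the second sits between both classes and is rejected rather than forced into a class.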

19.
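Feature selection algorithm for multi-label classification based on neighborhood rough sets   Cited by: 4 (self-citations: 0, citations by others: 4)
Multi-label learning is a complex class of decision tasks in which a single object may belong to several categories at once. Such tasks arise widely in text classification, image recognition, gene function analysis, and other fields. Multi-label classification tasks are often described by high-dimensional features that contain a large amount of irrelevant and redundant information. Many single-label feature selection algorithms have been proposed to cope with the curse of dimensionality, but attribute reduction and feature selection for multi-label data have received little study. This paper applies rough sets to feature selection for multi-label data: for multi-label classification tasks, it redefines the lower approximation and the dependency computation of neighborhood rough sets, discusses the properties of this model, constructs a feature selection algorithm for multi-label classification based on neighborhood rough sets, and reports experimental results on public data. The experimental analysis demonstrates the effectiveness of the algorithm.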

20.
An innovative and uniform framework based on a combination of Gabor wavelets with principal component analysis (PCA) and multiple discriminant analysis (MDA) is presented in this paper. In this framework, features are extracted from the optimal random image components using a greedy approach. These feature vectors are then projected onto subspaces for dimensionality reduction, which is used for solving linear problems. The design of Gabor filters, PCA and MDA are crucial processes used for facial feature extraction. The FERET, ORL and YALE face databases are used to generate the results. Experiments show that optimal random image component selection (ORICS) plus MDA outperforms ORICS alone and subspace projection approaches such as ORICS plus PCA. Our method achieves 96.25%, 99.44% and 100% recognition accuracy on the FERET, ORL and YALE databases, respectively, for 30% training. This is a considerably improved performance compared with other standard methodologies described in the literature.
