Similar literature
20 similar articles found (search time: 31 ms)
1.
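Liu Yun, Xiao Xue, Huang Rongcheng. 《信息技术》 (Information Technology), 2020, (5): 28-31, 36
Feature selection is a preliminary step for handling high-dimensional data in machine learning and data mining: it removes redundant or irrelevant features to identify the most important and relevant features in a dataset, thereby improving classification accuracy and reducing computational complexity. This paper proposes a hybrid Monte Carlo tree search feature selection algorithm (HMCTS). First, an initial feature subset is generated iteratively by Monte Carlo tree search, and the ReliefF algorithm filters it to keep the top k features as the candidate feature subset. Next, the classification accuracy of a KNN classifier is used to evaluate the candidate features, and the simulation result is backpropagated to update every selected node on the iteration path. Finally, the candidate features with high accuracy are chosen as the best feature subset. Simulation results show that, compared with the HPSO-LS and MOTiFS algorithms, HMCTS scales well and achieves high classification accuracy.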
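
The abstract scores candidate feature subsets by the accuracy of a KNN classifier. The sketch below shows only that evaluation step, under assumptions that are not from the paper: leave-one-out evaluation, squared Euclidean distance, and the hypothetical function name `knn_loo_accuracy`. The Monte Carlo tree search and ReliefF stages are not reproduced.

```python
from collections import Counter


def knn_loo_accuracy(rows, labels, subset, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `subset`.

    rows: list of numeric feature tuples; subset: feature indices to use.
    """
    correct = 0
    for i, query in enumerate(rows):
        # Distance from the held-out sample to every other sample,
        # computed only over the features in the candidate subset.
        dists = sorted(
            (sum((query[f] - row[f]) ** 2 for f in subset), j)
            for j, row in enumerate(rows)
            if j != i
        )
        votes = Counter(labels[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(rows)
```

A search procedure would call such a function once per candidate subset and treat the returned accuracy as the reward for that subset.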

2.
The classification and detection of event-related brain potentials was investigated using signal processing and statistical pattern recognition techniques. Amplitudes at sampled time points and frequency quantities have previously been used as features. Improvements to these procedures were obtained by using features from the time-frequency plane to utilize the geometric relationship between time and frequency, capitalizing on the nonstationarity of the evoked potential signals. These features were transformed from the original data sets based upon a two-step classification/feature selection procedure which uses selected frequencies from step 1 as parameters for data filtering in step 2. Features were selected from the filtered data, classifiers were designed, and the estimated classification accuracies were computed.

3.
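A two-stage SVM-based feature selection method for intrusion detection   Total citations: 2, self-citations: 0, citations by others: 2
To address the problem of optimal feature selection in intrusion detection, a two-stage feature selection method based on support vector machines is proposed. The method uses a feature evaluation value based on the ratio of detection rate to false alarm rate as the criterion for screening features. It first applies two filter-mode measures, the Fisher score and information gain, to filter out noisy and irrelevant features respectively, reducing the feature dimensionality. Then, starting from the intersection feature subset obtained by this screening, it applies a wrapper-mode sequential backward search combined with a support vector machine to select the optimal feature subset. Simulation tests show that the selected feature subset has better classification performance and effectively reduces the system's modeling time and testing time.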
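
The filter stage mentioned above relies on the Fisher score. As a rough illustration, the sketch below computes the common form of the score (between-class scatter divided by within-class scatter) for each feature and ranks features by it. The function names are hypothetical, and the paper's exact scoring and thresholds are not given in the abstract.

```python
from collections import defaultdict


def fisher_score(values, labels):
    """Fisher score of one feature: between-class / within-class scatter."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        n = len(members)
        mean = sum(members) / n
        between += n * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in members)
    if within == 0:
        # No spread inside any class: perfect separator, or a constant feature.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows, labels):
    """Return feature indices ordered from highest to lowest Fisher score."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

A filter stage would keep only the top-ranked indices and pass them to the wrapper search.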

4.
Li ZHANG, Cong WANG. 《通信学报》 (Journal on Communications), 2018, 39(5): 111-122
Feature selection has played an important role in machine learning and artificial intelligence in the past decades. Many existing feature selection algorithms choose some redundant and irrelevant features, which leads to overestimation of some features. Moreover, more features significantly slow down machine learning and lead to classification over-fitting. Therefore, a new nonlinear feature selection algorithm based on forward search was proposed. The algorithm used mutual information theory to find the optimal subset associated with multi-task labels and reduced the computational complexity. Experiments on nine UCI datasets with four different classifiers show that the feature subsets selected by the proposed algorithm outperform both the original feature set and the subsets selected by other feature selection algorithms.
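
To make the idea of a mutual-information-driven forward search concrete, here is a minimal sketch for discrete features. It is not the paper's algorithm: the greedy criterion (joint mutual information between the selected columns and the label) and the names `mutual_information` and `forward_select` are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """I(X;Y) in bits for two equally long sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def forward_select(rows, labels, k):
    """Greedily add the feature that most increases joint MI with the labels."""
    selected = []
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < k:

        def joint_mi(j):
            # Treat the selected columns plus candidate j as one joint variable.
            cols = selected + [j]
            return mutual_information([tuple(r[c] for c in cols) for r in rows], labels)

        best = max(remaining, key=joint_mi)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Scoring the joint variable rather than each feature alone lets the search notice features that are informative only in combination, at the cost of needing more samples as the subset grows.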

5.
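Question feature combination based on diversity and importance   Total citations: 1, self-citations: 0, citations by others: 1
In question classification for question answering systems, combining question features helps build efficient question classifiers. To address the feature combination problem in current question classification, a Diversity and Importance based Feature Combination (DIFC) method is proposed. By computing the misclassification diversity and correct-classification diversity between a candidate feature and the current feature combination, together with the importance of the candidate feature itself, an optimized feature combination is obtained dynamically from the candidate feature set. Experiments combining bag-of-words bound features on the Harbin Institute of Technology Chinese question set show that, compared with other feature combination methods, DIFC is flexible and efficient and achieves higher accuracy.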

6.
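Yang Mianrong, Niu Liping. 《红外与激光工程》 (Infrared and Laser Engineering), 2022, 51(4): 20210309-1-20210309-6
Infrared sensing technology effectively solves the problem of night-time observation and has become one of the important means of modern battlefield reconnaissance. Continuously improving target recognition based on infrared images is an effective route to precision strike and situational awareness. For infrared image recognition, a Zernike feature selection algorithm based on the Light Gradient Boosting Machine (LGBM) is proposed and combined with Sparse Representation-based Classification (SRC) to determine the target class. First, multi-order Zernike moment features are extracted from the target region of the infrared image to characterize the essential properties of the target to be recognized. Second, the LGBM feature selection algorithm performs a second round of screening on the multi-order moment features, reducing redundancy while making the features more discriminative. Finally, the selected Zernike moment feature vector is classified with SRC. Through LGBM feature selection, the method improves the effectiveness of the final features and reduces the computational complexity of classification, which helps improve overall recognition performance. Validation experiments were carried out on the public mid-wave infrared target image dataset (MWIR) to distinguish 10 classes of typical military targets. The experiments were run under three conditions (original samples, noise-corrupted samples, and partially missing samples) and compared with several existing infrared target recognition methods. The results show that the proposed method achieves better performance, demonstrating its effectiveness.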

7.
We consider a particular paradigm of steganalysis, namely, highly imbalanced steganalysis with small training samples, in which the cover images always significantly outnumber the stego ones. Researchers have rigorously studied sampling and learning algorithms as well as feature selection approaches to the class imbalance problem, but the research in the steganalysis domain is rare. This study provides a systematic comparison of eight feature selection metrics and of three types of methods developed for the imbalanced data classification problem in the steganalysis domain. Each metric is compared across three different classifiers and four steganalytic features. The efficiency of the metrics is evaluated to determine which performs best with minimal features selected. The performance of the three types of methods and their combinations is examined. Moreover, we also investigate the effect of feature dimensionality, sample number and imbalance degree on the performance of feature selection in resolving imbalanced image steganalysis.

8.
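To improve the classification accuracy of the minority class in imbalanced datasets, an oversampling algorithm based on feature selection is proposed. The algorithm takes into account that different feature columns contribute differently to classification performance. It first performs feature selection on the training set to choose a group of feature columns, and then synthesizes minority-class samples according to the chosen columns. The features of each synthesized minority-class sample consist of two parts: one part is the features corresponding to the columns chosen by feature selection, and the other part is features synthesized according to the SMOTE principle. Experiments comparing the feature-selection-based oversampling algorithm with SMOTE show that it outperforms SMOTE, effectively reduces data imbalance, and improves minority-class classification accuracy.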
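
For readers unfamiliar with the SMOTE principle referred to above, the sketch below shows the basic interpolation step: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. This is plain SMOTE only; how the paper treats the feature-selected columns is not specified in the abstract, and the names `smote_sample` and `oversample` are hypothetical.

```python
import random


def smote_sample(x, neighbor, rng):
    """Interpolate a synthetic point between x and one of its neighbours."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(x, neighbor)]


def oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples (needs at least 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Other minority samples, nearest first (squared Euclidean distance).
        others = [m for m in minority if m is not x]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbor = rng.choice(others[:k])
        synthetic.append(smote_sample(x, neighbor, rng))
    return synthetic
```

Because every synthetic point is a convex combination of two real minority samples, it always stays inside the range spanned by the minority class.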

9.
Feature selection algorithm based on XGBoost   Total citations: 2, self-citations: 0, citations by others: 2
Feature selection in classification has always been an important but difficult problem. It requires that feature selection algorithms not only help classifiers improve classification accuracy but also reduce redundant features as much as possible. Therefore, to better solve feature selection in classification problems, a new wrapper feature selection algorithm, XGBSFS, was proposed. Drawing on the tree-building process in XGBoost, the importance of features was measured with three importance metrics to avoid the limitation of a single importance metric. Then the improved sequential floating forward selection (ISFFS) was applied to search for a high-quality feature subset. Experimental results on eight UCI datasets show that the proposed algorithm performs well.
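
The search stage above builds on sequential floating forward selection. The sketch below is the plain textbook variant, not the paper's improved ISFFS, and it takes an arbitrary caller-supplied `score` function instead of XGBoost importances; the function name `sffs` is hypothetical.

```python
def sffs(n_features, score, k):
    """Plain sequential floating forward selection up to k features.

    `score` maps a list of feature indices to a number (higher is better).
    Requires k <= n_features.
    """
    selected = []
    best = {}  # best score seen so far for each subset size
    while len(selected) < k:
        # Forward step: add the feature that helps the most.
        candidates = [f for f in range(n_features) if f not in selected]
        selected = selected + [max(candidates, key=lambda f: score(selected + [f]))]
        size = len(selected)
        best[size] = max(best.get(size, float("-inf")), score(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: score([s for s in selected if s != f]))
            reduced = [s for s in selected if s != drop]
            if score(reduced) <= best[len(reduced)]:
                break
            selected = reduced
            best[len(reduced)] = score(reduced)
    return selected
```

The backward step is accepted only when it strictly improves on the best subset of that size seen so far, which is what keeps the search from cycling.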

10.
In this article, we propose a novel system for feature selection, which is one of the key problems in content-based image indexing and retrieval as well as various other research fields such as pattern classification and genomic data analysis. The proposed system aims at enhancing semantic image retrieval results, decreasing retrieval process complexity, and improving the overall system usability for end-users of multimedia search engines. Three feature selection criteria and a decision method construct the feature selection system. Two novel feature selection criteria based on inner-cluster and intercluster relations are proposed in the article. A majority voting-based method is adapted for efficient selection of features and feature combinations. The performance of the proposed criteria is assessed over a large image database and a number of features, and is compared against competing techniques from the literature. Experiments show that the proposed feature selection system improves semantic performance results in image retrieval systems. This work was supported by the Academy of Finland, Project No. 213,462 (Finnish Centre of Excellence Program 2006–2011).

11.
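Hou Banghuan, Yao Minli, Jia Weimin, Shen Xiaowei, Jin Wei. 《红外与激光工程》 (Infrared and Laser Engineering), 2017, 46(12): 1228001-1228001(8)
Hyperspectral remote sensing images have a large number of features (bands) and high redundancy, so feature selection has become a research focus in hyperspectral classification. To address this, a hyperspectral data classification algorithm that preserves spatial structure and spectral structure simultaneously is proposed. Considering the physical characteristics of hyperspectral images, the image is first reconstructed by weighted spatial-spectral reconstruction, so that the spatial structure information of the image is automatically incorporated into the spectral features, forming a spatial-spectral feature set. Then, on top of a least-squares regression model that preserves the global similarity structure of the dataset, a local manifold structure regularization term is added so that the selected feature subset better preserves the intrinsic structure of the dataset. The influence of window size and the regularization parameter on classification accuracy is discussed. Experiments on the Indian Pines, PaviaU and Salinas datasets show that the overall classification accuracy of the feature subsets obtained by the algorithm reaches 93.22%, 96.01% and 95.90%, respectively. The algorithm not only makes full use of the spatial structure information of hyperspectral images but also mines the intrinsic structure of the dataset, yielding more discriminative feature subsets and clearly improving classification accuracy over traditional methods.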

12.
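Feature selection based on CHI and a genetic algorithm   Total citations: 1, self-citations: 0, citations by others: 1
In Web-based text information filtering systems, the optimal feature subset found by feature selection directly affects the speed and accuracy of classification. To address this, a feature selection method combining CHI and a genetic algorithm is proposed. The original feature set is first screened with the CHI statistic to remove redundant features and noise; the resulting feature subset then undergoes a second round of feature selection with a genetic algorithm, yielding the optimal feature subset that represents the problem space, achieving dimensionality reduction and improving classification accuracy.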
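
The first screening stage uses the CHI (chi-square) statistic. A minimal sketch of the usual term/class formulation for text is given below; the helper names, the set-of-words document representation, and the top-k cutoff are assumptions for illustration, and the genetic-algorithm stage is not shown.

```python
def chi_square(a, b, c, d):
    """CHI statistic of a term/class pair from a 2x2 contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class without the term
    d: documents outside the class without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def top_terms(docs, labels, target, k):
    """Rank terms by CHI against class `target`; docs are sets of terms."""
    vocab = sorted({t for doc in docs for t in doc})
    in_class = sum(1 for y in labels if y == target)

    def score(term):
        a = sum(1 for doc, y in zip(docs, labels) if y == target and term in doc)
        b = sum(1 for doc, y in zip(docs, labels) if y != target and term in doc)
        return chi_square(a, b, in_class - a, len(labels) - in_class - b)

    return sorted(vocab, key=score, reverse=True)[:k]
```

Note that the statistic is large for terms that are strongly associated with the class in either direction, so a term that never occurs in the class can also score highly.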

13.
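To address the class imbalance problem that arises in network traffic classification, this paper proposes a feature selection algorithm based on weighted symmetric uncertainty (WSU) and the approximate Markov blanket (AMB). First, based on the class distribution, a feature measure biased toward small classes is defined, so that features strongly correlated with small classes are more likely to be selected. Second, taking full account of feature-class and feature-feature correlations, weighted symmetric uncertainty and the approximate Markov blanket are used to remove irrelevant and redundant features. Finally, a correlation-based feature evaluation function and a sequential search algorithm further reduce the feature dimensionality and determine the optimal feature subset. Experiments show that, while maintaining the overall classification precision, the algorithm effectively improves classification performance on small classes.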
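
The measure underlying this method is symmetric uncertainty, SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)). The sketch below computes the standard, unweighted form for discrete variables; the class-dependent weighting that the paper adds is not described in the abstract and is not attempted here.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(xs, ys):
    """SU in [0, 1]: 0 means independent, 1 means each determines the other."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    # Information gain expressed through the joint entropy H(X, Y).
    info_gain = hx + hy - entropy(list(zip(xs, ys)))
    return 2.0 * info_gain / (hx + hy)
```

Normalizing by H(X) + H(Y) compensates for the bias of raw information gain toward features with many distinct values.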

14.
Feature selection (FS) is the process of selecting the more informative features, and it is one of the important steps in knowledge discovery. The problem is that not all features are important: some may be redundant, and others may be irrelevant and noisy. Conventional supervised FS methods evaluate various feature subsets using an evaluation function or metric to select only those features that are related to the decision classes of the data under consideration. For many data mining applications, however, decision class labels are often unknown or incomplete, which indicates the significance of unsupervised feature selection, where decision class labels are not provided. In this paper, we propose a new unsupervised quick reduct (QR) algorithm using rough set theory. The quality of the reduced data is measured by classification performance, evaluated using the WEKA classifier tool. The method is compared with existing supervised methods, and the results demonstrate the efficiency of the proposed algorithm.

15.
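Feature selection is an important step in target classification and directly affects classifier design and performance. Using real radiated-noise data from underwater acoustic targets, this paper analyzes two feature selection methods, a genetic algorithm and a mutual information algorithm. When the feature dimensionality is large, both methods require long computation times, so a hybrid genetic and mutual information algorithm is proposed to reduce computation time. Finally, the classifier is run with the three selected feature subsets as input, and the results are compared with those obtained from arbitrarily chosen feature subsets.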

16.
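To address the class imbalance problem that arises in network traffic classification, this paper proposes a feature selection algorithm based on weighted symmetric uncertainty (WSU) and the approximate Markov blanket (AMB). First, based on the class distribution, a feature measure biased toward small classes is defined, so that features strongly correlated with small classes are more likely to be selected. Second, taking full account of feature-class and feature-feature correlations, weighted symmetric uncertainty and the approximate Markov blanket are used to remove...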

17.
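A video genre classification method based on rough set theory   Total citations: 1, self-citations: 0, citations by others: 1
Video classification provides an effective means of managing and using video data. Existing video genre classification methods tend to use as many features as possible in order to represent video content more effectively and ensure classification quality, but extracting these features is usually costly, so feature selection in genre classification needs to be considered. A method based on rough set theory is proposed to select video features and classify genres: a variety of effective features are extracted as the basis for classification, after analyzing the features used in the related literature; a heuristic search is used to find the optimal reduct, which performs feature selection; and the classification rules derived from the reduct determine the genre label. Comparison with the classification results of existing methods and an experimental comparison with the decision tree method demonstrate the effectiveness of the proposed method.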
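
In rough set theory, a reduct is a subset of attributes that preserves how well the full attribute set determines the decision. The sketch below illustrates one well-known heuristic for finding one, a greedy QuickReduct-style search driven by the dependency degree. The abstract does not say which heuristic the paper uses, so this choice and the function names are assumptions.

```python
from collections import defaultdict


def dependency(rows, decisions, attrs):
    """Dependency degree: share of objects in the positive region of `attrs`."""
    seen = defaultdict(set)
    counts = defaultdict(int)
    for row, d in zip(rows, decisions):
        key = tuple(row[a] for a in attrs)  # equivalence class of this object
        seen[key].add(d)
        counts[key] += 1
    # An equivalence class is in the positive region if it has one decision.
    positive = sum(counts[key] for key, ds in seen.items() if len(ds) == 1)
    return positive / len(rows)


def quick_reduct(rows, decisions):
    """Greedily add attributes until the full-set dependency is reached."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, decisions, all_attrs)
    reduct = []
    while dependency(rows, decisions, reduct) < target:
        best = max(
            (a for a in all_attrs if a not in reduct),
            key=lambda a: dependency(rows, decisions, reduct + [a]),
        )
        reduct.append(best)
    return reduct
```

The greedy search is fast but may return a reduct that is not minimal; exhaustive search finds minimal reducts at exponential cost.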

18.
Automated melanoma recognition   Total citations: 3, self-citations: 0, citations by others: 3
A system for the computerized analysis of images obtained from epiluminescence microscopy (ELM) has been developed to enhance the early recognition of malignant melanoma. As an initial step, the binary mask of the skin lesion is determined by several basic segmentation algorithms together with a fusion strategy. A set of features containing shape and radiometric features as well as local and global parameters is calculated to describe the malignancy of a lesion. Significant features are then selected from this set by application of statistical feature subset selection methods. The final kNN classification delivers a sensitivity of 87% with a specificity of 92%.
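
The two figures quoted at the end are standard confusion-matrix ratios: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The short sketch below computes them from label lists; the function name is hypothetical, and the code assumes both classes are present in the true labels.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary classification result."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Sensitivity: share of true positives found; specificity: share of
    # true negatives correctly rejected.
    return tp / (tp + fn), tn / (tn + fp)
```

In a screening setting such as melanoma recognition, sensitivity measures how many malignant lesions are caught, while specificity measures how many benign lesions are correctly left alone.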

19.
Identification of the short transient waveform, called a spike, in the cortical electroencephalogram (EEG) plays an important role during diagnosis of neurological disorders such as epilepsy. It has been suggested that artificial neural networks (ANN) can be employed for spike detection in the EEG, if suitable features are provided as input to an ANN. In this paper, we explore the performance of neural network-based classifiers using features selected by algorithms suggested by four previous investigators. Of these, three algorithms model the spike by mathematical parameters and use them as features for classification while the fourth algorithm uses raw EEG to train the classifier. The objective of this paper is to examine if there is any inherent advantage to any particular set of features, subject to the condition that the same data are used for all feature selection algorithms. Our results suggest that artificial neural networks trained with features selected using any one of the above three algorithms as well as raw EEG directly fed to the ANN will yield similar results.

20.
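Facial expression recognition combining Gabor features and Adaboost   Total citations: 15, self-citations: 7, citations by others: 15
Facial expression recognition (FER) is performed by extracting Gabor features from face images and combining them with Adaboost. Because Gabor features are high-dimensional and highly redundant, the Adaboost algorithm is introduced for feature selection to reduce the dimensionality of the feature vector. A classifier combining a support vector machine (SVM) with nearest-neighbor classification then performs the classification. The method combines the strong ability of Gabor features to represent facial expressions, the powerful feature selection ability of Adaboost, and the advantages of SVM on small-sample, high-dimensional problems. Tests on the JAFFE database verify the effectiveness of the method. The feature set selected by Adaboost shows that features extracted from the eye and mouth regions are the most important for FER.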
