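The traditional support vector machine (SVM) has several shortcomings in wrapper feature selection: poor classification performance, redundant selected subsets, and computational performance that is sensitive to the kernel function parameters. To address them, a metaheuristic optimization algorithm is used to optimize the kernel parameters and the feature subset simultaneously. First, the Lévy flight strategy and a simulated annealing mechanism are used to improve the local search ability of the bald eagle search algorithm and its ability to explore and exploit the solution space; results on benchmark test functions verify that the improvements are effective. Second, with the SVM kernel function parameters as the optimization target, the improved algorithm searches for the optimal kernel parameters within the wrapper feature selection model and obtains the corresponding optimal feature subset at the same time. Finally, feature selection simulation experiments are run on 12 standard datasets from the UCI repository and evaluated on average classification accuracy, number of selected features, and fitness value. The experimental results show that the proposed algorithm effectively reduces feature dimensionality, classifies data more accurately, outperforms the original algorithm and other nonlinear optimization algorithms in searching the solution space and in solution accuracy, and has some engineering application value.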
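
The simultaneous optimization described above can be sketched as a single fitness function over one candidate vector. This is only an illustration under stated assumptions, not the paper's formulation: the position layout `[C, gamma, f_1, ..., f_n]` (which presumes an RBF kernel), the 0.5 threshold, the weight `alpha`, and the names `decode_candidate` and `wrapper_fitness` are all hypothetical. The error rate would come from training an SVM on the selected features, which is left out here.

```python
def decode_candidate(position, n_features):
    """Split a search-agent position into kernel parameters and a feature mask.

    Assumed layout (not from the paper): [C, gamma, f_1, ..., f_n],
    where feature j is selected when f_j > 0.5.
    """
    if len(position) != n_features + 2:
        raise ValueError("position must hold C, gamma and one entry per feature")
    c, gamma = position[0], position[1]
    mask = [value > 0.5 for value in position[2:]]
    return c, gamma, mask


def wrapper_fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and feature ratio (lower is better).

    The weighted-sum form and alpha=0.99 are assumptions for illustration.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return float("inf")  # an empty subset cannot be classified
    return alpha * error_rate + (1.0 - alpha) * n_selected / len(mask)


# Example: one candidate with four features, two of them selected.
c, gamma, mask = decode_candidate([10.0, 0.1, 0.9, 0.2, 0.7, 0.4], 4)
fitness = wrapper_fitness(0.1, mask)
```

Folding both goals into one scalar lets any metaheuristic, including the improved bald eagle search in the abstract, rank candidates without knowing which entries are kernel parameters and which are feature flags.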

A new feature extraction method and its application in pattern recognition   Total citations: 2, self-citations: 0, citations by others: 2
刘宗礼, 曹洁, 郝元宏. 《计算机应用》 (Journal of Computer Applications), 2009, 29(4): 1032-1035
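Kernel canonical correlation analysis (KCCA) is a supervised machine learning method that can effectively extract nonlinear features. However, the computational complexity of standard KCCA grows with the number of training samples. To address this drawback, an improved KCCA method is proposed. First, a geometric feature selection method selects a subset of the training samples and maps it into a reproducing kernel Hilbert space (RKHS). Then an algorithm is designed to make feature extraction more efficient: it selects the eigenvalues of the samples according to how much they contribute to classification and computes the corresponding eigenvectors. Finally, the improved KCCA is combined with a support vector data description (SVDD) multi-class classifier for classification and recognition. Experiments on the ORL face image database show that, compared with traditional KCCA, the improved method increases face recognition speed and reduces storage requirements without affecting the recognition rate.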

张乐园, 李佳烨, 李鹏清. 《计算机应用》 (Journal of Computer Applications), 2018, 38(12): 3444-3449
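High-dimensional data often exhibit nonlinearity, low-rank structure, and attribute redundancy. To address these problems, a kernel-based unsupervised attribute selection algorithm with attribute self-representation is proposed, called Low-Rank constrained Nonlinear Feature Selection (LRNFS). First, each attribute dimension is mapped into a high-dimensional kernel space, so that linear attribute selection in the kernel space achieves nonlinear attribute selection in the low-dimensional space. Then a bias term is introduced into the self-representation form, and low-rank and sparsity constraints are imposed on the coefficient matrix. Finally, a sparse regularization term on the coefficient vector of the kernel matrices is introduced to perform attribute selection. In the proposed algorithm, the kernel matrices capture the nonlinear relationships, the low-rank constraint takes the global information of the data into account for subspace learning, and the self-representation form determines the importance of each attribute. Experimental results show that, compared with the semi-supervised feature selection algorithm based on rescaled linear square regression (RLSR), classification accuracy after attribute selection with the proposed algorithm improves by 2.34%. The proposed algorithm solves the problem that data are not linearly separable in the low-dimensional feature space and improves the accuracy of attribute selection.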

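Image retrieval algorithm based on sparse coding on Riemannian manifolds   Total citations: 1, self-citations: 0, citations by others: 1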
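To address the large histogram quantization error of the bag-of-visual-words (BOVW) model, an image retrieval algorithm based on sparse coding is proposed. Because most image features lie on nonlinear manifolds, measuring them in a vector space, as traditional sparse coding does, inevitably yields inaccurate sparse representations. Taking the manifold structure of the image feature space into account, symmetric positive definite matrices are chosen as feature descriptors to construct a Riemannian manifold. A kernel technique maps the Riemannian manifold structure into a reproducing kernel Hilbert space, which turns the nonlinear manifold problem into linear sparse coding and yields a more accurate sparse representation of images. Experiments on the Corel1000 and Caltech101 datasets show that, compared with existing image retrieval algorithms, the proposed algorithm not only improves retrieval accuracy but also achieves better retrieval performance.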

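A feature selection algorithm for big data based on the genetic algorithm is proposed. The algorithm first evaluates the features in each dimension and adjusts the weight of each feature according to how much it differs on the nearest neighbor of the same class and on the nearest neighbor of a different class. The feature weights guide the genetic algorithm's search, which improves its search ability and the accuracy of the selected features. The algorithm then computes feature fitness from the feature weights and, with fitness as the evaluation criterion, runs the genetic algorithm to obtain the optimal feature subset, ultimately achieving efficient and accurate feature selection for big data. Experimental analysis shows that the algorithm effectively reduces the number of features used for classification and improves classification accuracy.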
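
The weighting step above, which compares each sample with its nearest same-class and nearest different-class neighbor, resembles the classic Relief scheme. The sketch below is a plain Relief-style weighting in pure Python, offered as an assumption about what such a step could look like rather than as the paper's procedure; the function name `relief_weights` and the toy data are hypothetical, and the genetic search itself is not shown.

```python
def relief_weights(X, y):
    """Relief-style weights: reward features that differ on the nearest
    different-class neighbor and agree on the nearest same-class neighbor."""
    n, d = len(X), len(X[0])
    # Per-feature value range, used to put all features on the same scale.
    spans = []
    for j in range(d):
        column = [row[j] for row in X]
        span = max(column) - min(column)
        spans.append(span if span > 0 else 1.0)

    def distance(a, b):
        return sum(abs(a[j] - b[j]) / spans[j] for j in range(d))

    weights = [0.0] * d
    for i in range(n):
        same = [k for k in range(n) if k != i and y[k] == y[i]]
        other = [k for k in range(n) if y[k] != y[i]]
        if not same or not other:
            continue  # this sample has no same-class or no other-class neighbor
        hit = min(same, key=lambda k: distance(X[i], X[k]))
        miss = min(other, key=lambda k: distance(X[i], X[k]))
        for j in range(d):
            diff_miss = abs(X[i][j] - X[miss][j]) / spans[j]
            diff_hit = abs(X[i][j] - X[hit][j]) / spans[j]
            weights[j] += (diff_miss - diff_hit) / n
    return weights


# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 1, 1]
weights = relief_weights(X, y)
```

On the toy data the class-separating feature receives the larger weight. A genetic algorithm could then use such weights, for example, to bias which features appear in the initial population.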

Kernel-adjustable SOM method based on the genetic algorithm   Total citations: 1, self-citations: 0, citations by others: 1
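The self-organizing map (SOM) algorithm is an unsupervised learning method. When the training samples follow a multimodal, highly nonlinear distribution, the algorithm shows poor robustness and reliability. Kernel-based learning uses a kernel function to realize a mapping from a low-dimensional input space to a high-dimensional feature space, so that complex sample structures in the input space become simple in the feature space. However, different kernel functions classify different datasets with different success, so kernel selection is problem-dependent. An adjustable-kernel method is adopted: based on the SOM network structure, the coefficients are adjusted through learning with a genetic algorithm (GA), which gives better classification results than a single kernel function.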

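A standard support vector machine combined with wrapper feature selection suffers from many redundant features and low classification accuracy. A simultaneous optimization strategy for feature selection based on an improved Harris hawks algorithm is therefore proposed. To improve feature subset selection and SVM classification accuracy, the Harris hawks algorithm is improved with chaotic mapping, nonlinear adjustment of the energy factor, and pinhole-imaging opposition-based learning, and the improved algorithm is applied to the simultaneous optimization of SVM parameter tuning and feature subset selection. Experimental results show that the improved algorithm achieves higher classification accuracy while reducing feature dimensionality, which is the intended simultaneous optimization effect.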

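Dimensionality reduction is a main step of data preprocessing for overcoming the curse of dimensionality that high-dimensional data cause in classification, and feature selection based on sparse learning is a current research focus. For the many nonlinearly separable problems encountered in practice, the kernel trick is used to map nonlinearly separable samples into a kernel space, which addresses nonlinear similarity among features. The samples in the kernel space are then sparsely reconstructed to obtain a concise sparse representation of the original data in the kernel space, and a corresponding scoring mechanism is built to select the optimal subset. Benefiting from the natural discriminative power of sparse learning, the algorithm selects "good" features that preserve the structure of the original data, which reduces the computational complexity of the learning model and improves classification accuracy. Experiments on standard UCI datasets show that its performance is about 5% better on average than that of comparable algorithms.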

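To solve the "curse of dimensionality" that high-dimensional data cause in classification, a new attribute selection algorithm that combines kernel functions with sparse learning is proposed. Specifically, each attribute dimension is first mapped into a kernel space by a kernel function, and linear attribute selection is performed in this high-dimensional kernel space, which achieves nonlinear attribute selection in the low-dimensional space. Second, the attributes mapped into the kernel space are sparsely reconstructed to obtain a sparse representation of the original dataset. Next, an attribute scoring and selection mechanism is built with the L1 norm to select the optimal attribute subset. Finally, the data after attribute selection are used in classification experiments. Experiments on public datasets show that the algorithm performs attribute selection well and improves classification accuracy by about 3% over the comparison algorithms.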

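For the classification of imbalanced data, a density-based nearest neighbor classification algorithm (DNN) is proposed. It uses kernel density estimation to capture the local distribution characteristics of imbalanced data, which yields better classification results. Kernel density estimation is used to estimate the density of each class at the query instance, which locates the instance in terms of density. Points in the original data space are then mapped into a space formed by class density and distance information, and in this mapped space nearest neighbors are selected dynamically and the query instance is classified. Experimental results show that DNN classifies well on 15 imbalanced datasets.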
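
The first step described above, estimating each class's density at the query instance, can be sketched with a Gaussian kernel density estimate. This is not the DNN algorithm itself: the Gaussian kernel, the fixed `bandwidth`, and the name `class_densities` are assumptions for illustration, and the mapping into the density-distance space and the dynamic neighbor selection are not shown.

```python
import math


def class_densities(query, X, y, bandwidth=1.0):
    """Per-class Gaussian kernel density estimate at `query`.

    The normalising constant of the Gaussian is omitted because it is the
    same for every class and does not change their relative order.
    """
    sums, counts = {}, {}
    for row, label in zip(X, y):
        squared = sum((q - v) ** 2 for q, v in zip(query, row))
        kernel_value = math.exp(-squared / (2.0 * bandwidth ** 2))
        sums[label] = sums.get(label, 0.0) + kernel_value
        counts[label] = counts.get(label, 0) + 1
    # Averaging within each class keeps a large class from winning by size alone.
    return {label: sums[label] / counts[label] for label in sums}


# Imbalanced toy data: four points of class 0, one point of class 1.
X = [[0.0], [0.1], [0.2], [0.3], [5.0]]
y = [0, 0, 0, 0, 1]
densities = class_densities([4.8], X, y)
```

Because each estimate is averaged over its own class, a query close to the single minority point still gets a higher minority density, which a plain majority vote over neighbors would tend to miss.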

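Feature selection method based on maximum mutual information and maximum correlation entropy   Total citations: 5, self-citations: 1, citations by others: 4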
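Feature selection algorithms fall mainly into two categories, filter and wrapper, and algorithmic models based on various theories have been proposed, but problems remain, such as limited processing capability and low classification accuracy of the selected subsets. Based on the information entropy model of fuzzy rough sets, a maximum mutual information and maximum correlation entropy criterion is proposed, and a new feature selection method is designed from it that can handle mixed information, including discrete, continuous, and fuzzy data, at the same time. Experiments on UCI datasets show that, compared with other algorithms, the method achieves higher accuracy and higher stability, and is effective.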

Most of the widely used pattern classification algorithms, such as Support Vector Machines (SVM), are sensitive to the presence of irrelevant or redundant features in the training data. Automatic feature selection algorithms aim to select a subset of the features present in a given dataset so that the accuracy of the subsequent classifier is maximized. Feature selection algorithms are generally divided into two broad categories: algorithms that do not take the subsequent classifier into account (the filter approaches), and algorithms that evaluate the subsequent classifier for each considered feature subset (the wrapper approaches). Filter approaches are typically faster, but wrapper approaches deliver higher performance. In this paper, we present the Predictive Forward Selection algorithm, which is based on the widely used wrapper approach forward selection. Using ideas from meta-learning, it reduces the number of required evaluations of the target classifier by using experience gained during past feature selection runs on other datasets. We have evaluated our approach on 59 real-world datasets with a focus on SVM as the target classifier. We present comparisons with state-of-the-art wrapper and filter approaches, as well as one embedded method for SVM, in terms of accuracy and run-time. The results show that the presented method reaches the accuracy of traditional wrapper approaches while requiring significantly fewer evaluations of the target algorithm. Moreover, our method achieves statistically significantly better results than the filter approaches and the embedded method.
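
For orientation, the baseline that the abstract builds on, plain wrapper forward selection, can be sketched as follows. This is the standard greedy procedure, not Predictive Forward Selection: the meta-learning step that skips evaluations is not reproduced. The `score` callable and `toy_score` are hypothetical stand-ins for cross-validated accuracy of the target classifier.

```python
def forward_selection(n_features, score):
    """Greedy sequential forward selection.

    `score(subset)` is a caller-supplied callable returning a value to
    maximise, for example cross-validated accuracy of the target classifier.
    """
    selected = []
    best_score = float("-inf")
    evaluations = 0
    while len(selected) < n_features:
        remaining = [f for f in range(n_features) if f not in selected]
        # One evaluation of the target classifier per candidate extension.
        scored = [(score(selected + [f]), f) for f in remaining]
        evaluations += len(scored)
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no candidate improves on the current subset
        selected.append(top_feature)
        best_score = top_score
    return selected, best_score, evaluations


def toy_score(subset):
    # Toy stand-in for accuracy: features 0 and 2 help, and every
    # selected feature costs a small penalty.
    return len(set(subset) & {0, 2}) - 0.1 * len(subset)


subset, value, n_evals = forward_selection(4, toy_score)
```

The evaluation counter makes the cost visible: even with four features the greedy search evaluates the scorer nine times here, and that count is what the abstract's method aims to reduce.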

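Based on the conditions that a support vector machine kernel function must satisfy, the reproducing kernel of the Sobolev Hilbert space and the polynomial kernel are combined linearly to give a new combined kernel for the SVM, and a pattern analysis method based on a combined-kernel SVM with reproducing kernels is proposed. The method has the advantages of both global and local kernels, and the complexity of the algorithm is reduced. Simulation results show that using the reproducing-kernel-based combined kernel as the SVM kernel is feasible. The kernel not only has the nonlinear mapping property of kernel functions but also inherits their ability to approximate nonlinearity with progressively finer precision, and it gives finer pattern analysis than a single kernel.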
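
A linear combination of two kernels with non-negative weights is itself a valid kernel, which is what makes such a construction admissible for an SVM. The sketch below illustrates the idea with a stand-in: the abstract does not give the paper's exact Sobolev reproducing kernel, so an exponential kernel on the L1 distance is assumed as the local component, and the `weight`, `scale`, `degree`, and `coef0` values are hypothetical.

```python
import math


def exponential_kernel(x, y, scale=1.0):
    # Assumed stand-in for the local (reproducing-kernel) component:
    # decays with the L1 distance between x and y.
    return math.exp(-sum(abs(a - b) for a, b in zip(x, y)) / scale)


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    # Polynomial component: depends on the inner product of x and y.
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def combined_kernel(x, y, weight=0.5):
    """Convex combination of the two kernels; weight must lie in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * exponential_kernel(x, y) + (1.0 - weight) * polynomial_kernel(x, y)


# Example: the kernel of a point with itself, and a symmetric pair.
k_self = combined_kernel([1.0, 2.0], [1.0, 2.0])
k_pair = combined_kernel([1.0, 2.0], [0.0, 1.0])
```

Keeping the weight in [0, 1] preserves positive semidefiniteness, so the combined function can be passed to any SVM implementation that accepts a user-defined kernel.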

Feature selection is a process aimed at filtering out unrepresentative features from a given dataset, usually allowing the later data mining and analysis steps to produce better results. However, different feature selection algorithms use different criteria to select representative features, making it difficult to find the best algorithm for different domain datasets. The limitations of single feature selection methods can be overcome by applying ensemble methods that combine multiple feature selection results. In the literature, feature selection algorithms are classified as filter, wrapper, or embedded techniques. However, to the best of our knowledge, no study has focused on combining these three types of techniques to produce ensemble feature selection. The aim here is therefore to determine which combination of different types of feature selection algorithms offers the best performance for different types of medical data, including categorical, numerical, and mixed data types. The experimental results show that combining filter (i.e., principal component analysis) and wrapper (i.e., genetic algorithms) techniques by the union method is a better choice, providing relatively high classification accuracy and a reasonably good feature reduction rate.

This correspondence presents a novel hybrid wrapper and filter feature selection algorithm for a classification problem using a memetic framework. It incorporates a filter ranking method into the traditional genetic algorithm to improve classification performance and accelerate the search for the core feature subsets. In particular, the method adds or deletes a feature from a candidate feature subset based on univariate feature ranking information. This empirical study on commonly used data sets from the University of California, Irvine repository and on microarray data sets shows that the proposed method outperforms existing methods in terms of classification accuracy, number of selected features, and computational efficiency. Furthermore, we investigate several major issues of the memetic algorithm (MA) to identify a good balance between local search and genetic search so as to maximize search quality and efficiency in the hybrid filter and wrapper MA.

The kernel function method in the support vector machine (SVM) is an excellent tool for nonlinear classification. Designing a kernel function for an SVM nonlinear classification problem is difficult, even for the polynomial kernel function. In this paper, we propose a new kind of polynomial kernel function, called the semi-tensor product kernel (STP-kernel), for an SVM nonlinear classification problem, using the theory of the semi-tensor product of matrices (STP). We show the existence of the STP-kernel function and verify that it is a polynomial kernel. In addition, we show the existence of the reproducing kernel Hilbert space (RKHS) associated with the STP-kernel function. Compared to existing methods, it is much easier to construct the nonlinear feature mapping for an SVM nonlinear classification problem via an STP operator.
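
The link between a polynomial kernel and an explicit feature mapping can be checked on the simplest case. The sketch below uses the ordinary Kronecker product of a vector with itself, a standard construction, and not the semi-tensor product operator of the paper: for the homogeneous degree-2 kernel, the kernel value equals the inner product of the mapped vectors. The function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kron(u, v):
    # Kronecker product of two vectors, flattened in row-major order.
    return [a * b for a in u for b in v]


def poly2_kernel(x, y):
    # Homogeneous polynomial kernel of degree 2.
    return dot(x, y) ** 2


def poly2_feature_map(x):
    # Explicit feature map: all pairwise products x_i * x_j.
    return kron(x, x)


x, y = [1, 2], [3, 4]
kernel_value = poly2_kernel(x, y)
mapped_value = dot(poly2_feature_map(x), poly2_feature_map(y))
```

Both routes give the same number for these inputs, but the kernel route never builds the mapped vectors, whose length grows as the square of the input dimension.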

Clustering Incomplete Data Using Kernel-Based Fuzzy C-means Algorithm   Total citations: 3, self-citations: 0, citations by others: 3
