Similar documents
Found 19 similar documents; search took 171 ms
1.
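Reducing large-scale SVM training sets by kernel-space distance clustering   Total citations: 1, self-citations: 0, citations by others: 1
To address the slow training of support vector machines (SVM) on large-scale data sets, this paper proposes an SVM sample-reduction method based on kernel-space distance clustering. A distance formula in the kernel space is first introduced so that the high-dimensional data can be clustered in that space; clustering then removes a large number of non-support vectors from the training set, which reduces the sample count and the training time. Experimental results show that training on the reduced data set is faster and yields higher classification accuracy.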
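
The abstract above does not reproduce its distance formula. As a minimal sketch, the standard identity ||phi(x) - phi(y)||^2 = K(x, x) - 2K(x, y) + K(y, y) lets a distance in kernel space be computed from kernel values alone. The function names, the Gaussian kernel, and the `gamma` value below are illustrative assumptions, not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative value, not from the paper."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared)


def linear_kernel(x, y):
    """Plain dot product; with this kernel the kernel distance is Euclidean."""
    return sum(a * b for a, b in zip(x, y))


def kernel_distance(x, y, kernel=rbf_kernel):
    """Distance between phi(x) and phi(y), using kernel evaluations only."""
    squared = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors
```

Any clustering routine that needs only pairwise distances could call `kernel_distance` in place of a Euclidean distance; how the paper itself clusters and selects samples is not specified here.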

2.
3.
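To address the high computational complexity of the support vector data description (SVDD) one-class classification method, a heuristic reduced support vector data description (HR-SVDD) method is proposed. Some samples are heuristically selected from the original training set to form a reduced training set; quadratic programming is then solved on the reduced training set to obtain the support vectors and the decision boundary. The effectiveness of HR-SVDD is demonstrated through a discussion of the properties of Gaussian-kernel SVDD under different width coefficients. Experimental results on synthetic and real data sets show that HR-SVDD achieves classification accuracy comparable to conventional SVDD, with faster computation and a smaller memory footprint.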

4.
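To address the inefficiency of support vector machines on large-scale data sets, a relevance-feedback image retrieval algorithm based on the reduced support vector machine is proposed. A reduced SVM is first used to train an initial classifier, which serves as the retrieval model; relevance feedback is then performed on the retrieval results, followed by re-retrieval. Experimental results show that the number of relevant images retrieved increases with the number of feedback rounds; moreover, compared with the conventional SVM-based method, the time advantage of the reduced-SVM algorithm becomes more pronounced as the data set grows.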

5.
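Learning strategy for support vector machines on large-scale training sets   Total citations: 29, self-citations: 0, citations by others: 29
When the training set is very large, and especially when there are many support vectors, SVM learning consumes a large amount of memory and the optimization is very slow, which causes considerable difficulty in practical applications. This paper proposes a learning strategy for large-scale sample sets: first, an initial classifier is trained on a small sample set; this classifier is then used to prune the large training set, yielding a much smaller reduced set; finally, the reduced set is used to train the final classifier. Experiments show that this strategy not only greatly reduces the cost of learning, but also yields a classifier whose accuracy is comparable to, or even better than, that of a classifier trained directly on the large sample set, while classification speed is also greatly improved.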
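
The train, prune, retrain loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a nearest-centroid linear classifier stands in for the SVM to keep the example self-contained, and the pruning criterion (keep the samples of each class closest to the initial decision boundary) is an assumption, since the abstract does not say how the pruning is done. All function and parameter names are hypothetical.

```python
import random


def train_centroid_classifier(samples):
    """Stand-in linear classifier (NOT an SVM): returns (w, b) for f(x) = w.x + b."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == -1]
    dim = len(samples[0][0])
    mu_pos = [sum(x[i] for x in pos) / len(pos) for i in range(dim)]
    mu_neg = [sum(x[i] for x in neg) / len(neg) for i in range(dim)]
    w = [p - n for p, n in zip(mu_pos, mu_neg)]
    midpoint = [(p + n) / 2.0 for p, n in zip(mu_pos, mu_neg)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, b


def decision(model, x):
    """Signed score: positive means class +1, negative means class -1."""
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_prune_retrain(samples, per_class_subset, per_class_keep, seed=0):
    """Train on a small subset, prune the full set, retrain on the reduced set."""
    rng = random.Random(seed)
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == -1]
    # Step 1: initial classifier from a small stratified random subset.
    subset = rng.sample(pos, per_class_subset) + rng.sample(neg, per_class_subset)
    initial = train_centroid_classifier(subset)

    # Step 2 (assumed criterion): keep the samples nearest the boundary.
    def closeness(sample):
        return abs(decision(initial, sample[0]))

    reduced = (sorted(pos, key=closeness)[:per_class_keep]
               + sorted(neg, key=closeness)[:per_class_keep])
    # Step 3: final classifier trained on the reduced set only.
    return train_centroid_classifier(reduced), reduced
```

With a real SVM, the two `train_centroid_classifier` calls would be replaced by SVM training; the three-step structure stays the same.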

6.
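Kernel K-means clustering is used to extract a set of feature vectors from the sample set for training the SVM, thereby reducing the size of the SVM. Because the choice of kernel function affects the classification performance of the SVM model, several kernel functions with different nonlinear mapping capabilities are combined linearly; a semidefinite programming model of the combined SVM is constructed on the feature training set, and the optimal combination coefficients are solved with an interior-point method, giving a semidefinite-programming SVM with stronger nonlinear mapping capability, which is applied to spam tag detection. In a comparison with the two-layer sample-reduction SVM method on UCI data sets, experimental results show that the new spam tag detection method improves the recognition rate and substantially reduces training time.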

7.
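To address the excessive problem size when an SVM classifies large-scale data, a method is proposed that reduces the data set to speed up training. In the first step, a density-based method roughly locates representative points, each standing for a local region, and an SVM is trained on the reduced data to obtain a set of support vectors. In the second step, the training data consist of those support vectors together with the sample points they represent. Simulation experiments show that the algorithm effectively improves classification speed while maintaining classification accuracy.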

8.
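Computer Engineering (《计算机工程》), 2017, (7): 175-181
To distinguish intrusion attack categories in a targeted way and to improve the overall classification accuracy of intrusion detection systems (IDS), a hierarchical attribute reduction model is proposed. The model adopts the two-layer evolution idea of the cultural algorithm and combines rough sets with a genetic algorithm for attribute reduction. The data are preprocessed and partitioned into subspaces layer by layer, forming the decision sub-table rule set f_D. The cultural algorithm updates knowledge in the belief space and passes the evolution data of the hierarchical evaluation knowledge base to the population space. In the population space, rough sets and the genetic algorithm perform evolution and reduction to obtain the preferred attribute set f_opt of each layer, and a hierarchical Bayes classifier is designed to verify the model's performance. Experimental results show that the model raises the Bayes classification accuracy, relative to its level before attribute reduction, to 98.21%, and that it identifies R2L and U2R intrusion attacks, whose traffic features are not pronounced, reasonably well.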

9.
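For imbalanced classification, a fuzzy support vector machine model based on membership weighting is proposed. A conventional SVM is trained on the samples, and fuzzy memberships are constructed from the distance between each sample point and the resulting separating hyperplane; this not only eliminates the influence of noise points and outliers but also reduces the sample set to some extent. A balance adjustment factor is computed from the average membership and the number of samples of the positive and negative classes, eliminating the shift of the separating hyperplane caused by data imbalance. Experimental results verify the feasibility and effectiveness of the algorithm: it effectively improves classification accuracy, especially on imbalanced data, and further improves on the conventional SVM and the fuzzy SVM in training speed and classification performance.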

10.
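The support vector machine is a highly effective classification method developed on the basis of statistical learning theory. However, when the sizes of the two classes differ greatly, its classification ability degrades. To improve SVM classification of imbalanced data, this paper analyzes the essential characteristics of the least squares support vector machine and proposes a classification algorithm for imbalanced data. Experiments on UCI benchmark data sets show that the algorithm effectively improves SVM accuracy on imbalanced data; in particular, for large-scale training sets, it considerably increases training speed without loss of training accuracy.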

11.
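Incremental learning algorithm based on the KNN model   Total citations: 4, self-citations: 0, citations by others: 4
The KNN model is a non-incremental learning algorithm, which limits its adoption in some application areas. This paper proposes an incremental learning algorithm based on the KNN model. By introducing the concept of "layers" for model clusters and building model clusters of different "layers" for newly added data, it optimizes the existing model and achieves incremental learning. Experimental results verify the effectiveness of the method.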

12.
XML (eXtensible Markup Language) is a standard that is widely applied in data representation and data exchange. However, DTD (Document Type Definition), an important concept of XML, is not taken full advantage of in current applications. In this paper, a new method for clustering DTDs is presented, which can be used in XML document clustering. The two-level method clusters the elements in DTDs and clusters DTDs separately. Element clustering forms the first level and provides element clusters, which are the generalization of relevant elements. DTD clustering utilizes the generalized information and forms the second level in the whole clustering process. The two-level method has the following advantages: 1) it takes into consideration both the content and the structure within DTDs; 2) the generalized information about elements is more useful than the separated words in the vector model; 3) the two-level method facilitates the searching of outliers. The experiments show that this method is able to categorize the relevant DTDs effectively.

13.
A hybrid clustering procedure for concentric and chain-like clusters   Total citations: 1, self-citations: 0, citations by others: 1
The K-means algorithm is a well-known nonhierarchical method for clustering data. Its most important limitations are that (1) its final clusters depend on the cluster centroids or seed points chosen initially, and (2) it is appropriate only for data sets having fairly isotropic clusters. It does, however, have the advantage of low computation and storage requirements. The hierarchical agglomerative clustering algorithm, on the other hand, can cluster nonisotropic (chain-like and concentric) clusters but has high storage and computation requirements. This paper suggests a new method for selecting the initial seed points, so that the K-means algorithm gives the same results for any input data order. This paper also describes a hybrid clustering algorithm, based on the concepts of multilevel theory, which is nonhierarchical at the first level and hierarchical from the second level onwards, to cluster data sets having (i) chain-like clusters and (ii) concentric clusters. It is observed that this hybrid clustering algorithm gives the same results as the hierarchical clustering algorithm, with lower computation and storage requirements.

14.
A new method of partitive clustering is developed in the framework of shadowed sets. The core and exclusion regions of the generated shadowed partitions result in a reduction in computations as compared to conventional fuzzy clustering. Unlike rough clustering, here the choice of threshold parameter is fully automated. The number of clusters is optimized in terms of various validity indices. It is observed that shadowed clustering can efficiently handle overlapping among clusters as well as model uncertainty in class boundaries. The algorithm is robust in the presence of outliers. A comparative study is made with related partitive approaches. Experimental results on synthetic as well as real data sets demonstrate the superiority of the proposed approach.

15.
To address the large number of experts and the low consensus in large-group decision making, a novel clustering-based method integrating the correlation and consensus of hesitant fuzzy linguistic information is proposed. First, a new hesitant degree function is developed for the hesitant fuzzy linguistic element, taking its scale into account. Second, correlation measure and consensus measure models that incorporate the hesitant degree are put forward. A clustering method integrating correlation and consensus is then presented to divide the experts into several clusters; it simultaneously ensures the cohesion of the clusters and a gradual increase in the collective consensus level. After clustering, a selection process updates the cluster weights by combining the number of experts in each cluster with the cluster's consensus level, and a score function that considers the hesitant degree ranks the alternatives. Finally, a case study and several comparisons are analyzed to verify the rationality and effectiveness of the method.

16.
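Based on latent semantic analysis and the self-organizing feature map neural network (LSA-SOM), this paper proposes a text clustering method. Latent semantic analysis is used to represent text feature vectors, capturing the semantic relations among feature terms and reducing the dimensionality of the feature vectors. The SOM algorithm performs unsupervised self-organizing learning, and text clustering is achieved by continually adjusting the weight vectors between network nodes. The method does not require the number of clusters to be given in advance and can generate a new class at any suitable position, overcoming the drawback of conventional methods that the number of text categories must be specified beforehand.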

17.
A new region filtering and region weighting method, which filters out unnecessary regions from images and learns region importance from the region size and the spatial location of regions in an image, is proposed based on region representations. It weights the regions optimally and improves the performance of the region-based retrieval system based on relevance feedback. Due to the semantic gap between the low level feature representation and the high level concept in a query image, semantically relevant images may exhibit very different visual characteristics, and may be scattered in several clusters in the feature space. Our main goal is finding semantically related clusters and their weights to reduce this semantic gap. Experimental results demonstrate the efficiency and effectiveness of the proposed region filtering and weighting method in comparison with the area percentage method and region frequency weighted by inverse image frequency method, respectively.

18.
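For single-layer weave recognition based on color images, the weave recognition problem for double-layer fabrics is decomposed into single-weave recognition followed by integration. To improve the accuracy of automatic single-layer weave recognition, color clustering and related methods are used to segment fabric sample images, and a warp yarn segmentation algorithm is proposed that achieves accurate segmentation of warp and weft yarns. Finally, a semi-automatic interactive recognition scheme for weft-backed weaves based on fabric surface images is proposed. Experimental results show that the algorithm is effective.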

19.
This paper is concerned with a stepwise mode of objective function-based fuzzy clustering. A revealed structure in data becomes refined in a successive manner by starting with the most dominant relationships and proceeding with its more detailed characterization. Technically, the proposed process develops a so-called hierarchy of clusters. Given the underlying clustering mechanism of the fuzzy C means (FCM), the produced architecture is referred to as a hierarchical FCM or hierarchical FCM tree (HFCM tree). We discuss the design of the tree, demonstrating how its growth is guided by a certain mapping criterion. It is also shown how a structure at the higher level is effectively used to build clusters at the consecutive level by making use of the conditional FCM. Detailed investigations of computational complexity contrast a stepwise development of clusters with a single-step clustering completed for the equivalent number of clusters occurring in total at all final nodes of the HFCM tree. The analysis quantifies the significant reduction in computation achieved by the stepwise refinement of the clusters. Experimental studies include synthetic data as well as those coming from the machine learning repository.
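
For orientation, the plain fuzzy C-means iteration that this abstract builds on can be sketched as below. It shows only the standard FCM membership and center updates, not the paper's HFCM tree or its conditional FCM; the function names, the fuzzifier default `m=2.0`, and the fixed iteration count are illustrative assumptions.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return u where u[k][i] is the membership of point k in cluster i."""
    power = 2.0 / (m - 1.0)
    u = []
    for x in points:
        d = [math.dist(x, c) for c in centers]
        if 0.0 in d:
            # The point coincides with a center: assign crisp membership.
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            u.append([h / sum(hits) for h in hits])
        else:
            # Standard update: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
            u.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return u


def fcm_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by membership^m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        weights = [row[i] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[j] for w, x in zip(weights, points)) / total
            for j in range(dim)))
    return centers


def fcm(points, centers, m=2.0, iterations=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iterations):
        u = fcm_memberships(points, centers, m)
        centers = fcm_centers(points, u, m)
    return centers, fcm_memberships(points, centers, m)
```

Each row of the returned membership matrix sums to 1, which is the constraint that a conditional variant of FCM relaxes.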
