Similar literature
 Found 20 similar documents (search time: 125 ms)
1.
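The K-means algorithm selects its initial cluster centers at random, and different initial centers lead to different clustering results. To address this problem, a clustering algorithm (K-CBFSAFODP) is proposed that combines K-means with the clustering by fast search and find of density peaks algorithm. The idea is that a cluster center is surrounded by neighbors of lower local density and lies at a relatively large distance from any point of higher density; this property is used to characterize the cluster centers. K-means is then run iteratively from these centers, which compensates for the tendency of randomly initialized K-means to become trapped in local optima. The entropy method is also introduced into the distance computation to optimize the clustering. Experiments on UCI data sets and synthetic data sets show that the combined algorithm not only yields good clustering results but is also stable and converges quickly, confirming its feasibility.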
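
The density-peak criterion described in this abstract can be illustrated with a minimal sketch. It is not the authors' K-CBFSAFODP implementation: the function name `density_peak_centers`, the cutoff parameter `dc`, the neighbor-count density, and the ranking by the product of density and distance are assumptions borrowed from the usual presentation of density-peaks clustering, and the entropy-weighted distance is left out.

```python
import math

def density_peak_centers(points, k, dc):
    """Pick k initial centers by a density-peaks style score (illustrative only)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho: number of other points closer than the cutoff dc (assumed form).
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc) for i in range(n)]
    delta = []
    for i in range(n):
        # delta: distance to the nearest point of strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbor gets its largest distance to any point.
        delta.append(min(higher) if higher else max(dist[i]))
    # Centers combine high density with a large distance to denser points.
    ranked = sorted(range(n), key=lambda i: rho[i] * delta[i], reverse=True)
    return [points[i] for i in ranked[:k]]

pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
       (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0)]
initial_centers = density_peak_centers(pts, k=2, dc=0.12)
```

The returned points would then seed an ordinary K-means run. Note that points tied for the highest density all receive a large `delta` in this sketch, so ties may need extra handling on real data.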

2.
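To address the strong influence of the initial cluster centers on the results of the traditional K-means algorithm, a method for selecting initial cluster centers is proposed that dynamically adjusts the between-cluster distance of the centers according to the within-cluster distance of the sample points. The resulting initial centers are as dispersed and representative as possible, which effectively keeps K-means from falling into local optima. Experiments on UCI data sets show that the improved algorithm increases clustering accuracy.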

3.
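To address the sensitivity of the traditional K-means algorithm to its initial cluster centers, an improved K-means algorithm is proposed that selects the initial centers dynamically according to the distribution of the data samples. The algorithm builds a minimum spanning tree from the distances between data points and prunes it to obtain K initial data sets, from which the initial cluster centers are derived. These initial centers lie very close to the centers to which the iterative clustering converges. Theoretical analysis and experiments show that the improved K-means algorithm improves clustering performance, reduces the number of iterations, increases efficiency, yields stable clustering results, and achieves higher classification accuracy.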
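
The minimum-spanning-tree initialization can be sketched roughly as follows. The abstract does not state the pruning rule, so this example assumes the simplest one (cut the K-1 longest tree edges) and takes the mean of each resulting component as an initial center; the function name `mst_initial_centers` is hypothetical, and 1 <= k <= number of points is assumed.

```python
import math

def mst_initial_centers(points, k):
    """Initial centers from an MST cut into k components (illustrative only)."""
    n = len(points)
    # Prim's algorithm on the complete Euclidean graph.
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []              # MST edges as (weight, u, v)
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # Assumed pruning rule: drop the k-1 longest edges of the tree.
    edges.sort()
    kept = edges[: len(edges) - (k - 1)]
    # Union-find to label the connected components that remain.
    root = list(range(n))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in kept:
        root[find(u)] = find(v)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(points[i])
    # The mean of each component serves as one initial center.
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups.values()]

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
mst_centers = sorted(mst_initial_centers(data, k=2))
```

On these two well-separated groups, the single long edge between them is cut and each center is the mean of one group.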

4.
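Wang Juan, 《微型机与应用》 (Microcomputer & Its Applications), 2011, 30(20): 71-73, 76
The traditional K-means algorithm is highly sensitive to the choice of initial cluster centers and to the input order of the samples, and it easily falls into local optima. To address these problems, a genetic-algorithm-based K-means clustering algorithm, GKA, is proposed. It combines the local search capability of K-means with the global search capability of the genetic algorithm and, through repeated selection, crossover, and mutation operations, arrives at the optimal number of clusters and initial centroid set, overcoming the locality of traditional K-means and its sensitivity to the initial cluster centers.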

5.
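Optimization of K-means based on data segmentation   Total citations: 1 (self-citations: 0, citations by others: 1)
K-means clustering is a mainstream iterative descent clustering algorithm that converges to a local optimum. Because K-means selects its k initial cluster centers at random, the quality of the result fluctuates with the initial input. This paper therefore selects the initial centers through a preprocessing step. First, the distance between the two data points farthest apart is determined under a chosen norm. The data set is then divided into k segments by data segmentation, and one center is selected in each segment, which reduces the fluctuation of the clustering result with the initial input. Experiments show that the optimized K-means effectively eliminates the influence of the initial input and significantly reduces the number of iterations and the clustering error.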

6.
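Among current clustering methods, k-means and the potential function method are the most commonly used. Both have many advantages, but each also has its own limitations. K-means cannot determine the number of clusters, which must be estimated in advance; it is also sensitive to the initial cluster centers and easily disturbed by outliers. The potential function method has a limited clustering range and is inefficient on multidimensional data. To address these shortcomings, an improved clustering algorithm based on K-means and the potential function method is proposed. It first uses the potential function method to determine the number of clusters and the initial centers, then clusters with K-means, combining the "blind" property of the potential function method with the efficiency of K-means. Experiments verify the effectiveness of the improved algorithm; the results show considerable improvements in clustering accuracy and convergence speed.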

7.
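K-means clustering algorithm with optimized initial cluster centers   Total citations: 1 (self-citations: 0, citations by others: 1)
To address the sensitivity of traditional K-means to its initial centers and the resulting instability of its clustering results, an improved K-means clustering algorithm is proposed. The algorithm first computes the distances between samples and forms a set from the two closest points. Using a point-to-set distance formula, it then repeatedly adds the point nearest to the set until the set contains at least α points (α being the ratio of the number of data points to the number of clusters), and removes the set from the sample set. Repeating these steps yields K sets (K being the number of clusters); the mean of each set serves as an initial center, and K-means then produces the final clustering. On the Wine, Hayes-Roth, Iris, Tae, Heart-stalog, Ionosphere, and Haberman data sets, the improved algorithm gives more stable results than traditional K-means and K-means++. On Wine, Iris, and Tae, it achieves higher clustering accuracy than the K-means algorithm that optimizes initial centers by minimum variance, and across the 7 data sets it obtains the largest silhouette coefficient and F1 value. On data sets with large density differences, its results are more stable and accurate than those of traditional K-means and K-means++, and it is more efficient than the minimum-variance K-means algorithm.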
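
The set-growing procedure above can be sketched as follows. The abstract does not give the point-to-set distance formula, so this example assumes the distance from a point to the current mean of the set; the names `mean_of` and `grouped_initial_centers` are hypothetical.

```python
import math

def mean_of(group):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(group) for c in zip(*group))

def grouped_initial_centers(points, k):
    """Grow k groups of about n/k points each and return their means (illustrative)."""
    remaining = list(points)
    alpha = len(points) / k   # target group size: points per cluster
    centers = []
    for _ in range(k):
        if len(remaining) < 2:
            # Too few points left to form a pair: use whatever remains.
            group, remaining = remaining, []
        else:
            # Seed the group with the closest pair among the remaining points.
            pairs = ((a, b) for a in range(len(remaining))
                     for b in range(a + 1, len(remaining)))
            i, j = min(pairs, key=lambda ab: math.dist(remaining[ab[0]], remaining[ab[1]]))
            group = [remaining[i], remaining[j]]
            remaining = [p for idx, p in enumerate(remaining) if idx not in (i, j)]
            # Add the nearest remaining point until the group holds at least alpha points.
            while len(group) < alpha and remaining:
                center = mean_of(group)   # assumed point-to-set distance: to the set mean
                nearest = min(remaining, key=lambda p: math.dist(p, center))
                group.append(nearest)
                remaining.remove(nearest)
        if group:
            centers.append(mean_of(group))
    return centers

samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
group_centers = grouped_initial_centers(samples, k=2)
```

Because each finished group is removed before the next one is seeded, the K means come from disjoint, locally compact subsets, which is what keeps the initial centers apart.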

8.
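The traditional K-means algorithm selects its initial cluster centers at random, which tends to slow the convergence of the criterion function and trap the clustering in local optima. To address these problems, an algorithm for determining the initial cluster centers based on grids and graph theory is proposed. The algorithm partitions the data space into a grid and selects the initial centers by forming the connected components of a tree over the grid cells. The initial centers chosen by the algorithm are tested on synthetic and real data sets; the results show that the improved K-means algorithm performs well in lowering time complexity, reducing the number of iterations, and improving clustering accuracy.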

9.
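The traditional K-means algorithm selects its initial cluster centers at random, which easily makes the clustering results unstable, while K-means variants that optimize the initial centers require parameter choices that make the results less objective. A K-means algorithm that optimizes the initial cluster centers by minimum variance is therefore proposed, based on how tightly the samples are distributed in space. The algorithm computes the variance of the spatial distribution of the samples to measure their compactness and selects as initial centers the samples that have the smallest variance (that is, the highest compactness) and lie a certain distance apart. Experiments on data sets from the UCI machine learning repository and on noisy synthetic data sets show that the algorithm not only yields good, stable clustering results but is also highly robust to noise.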

10.
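Quantizing color images with the K-means algorithm gives very good visual results, but because the initial cluster centers are chosen arbitrarily, the algorithm needs too many iterations and runs too long. The color quantization algorithm proposed here keeps the iterative idea of K-means and, drawing on statistical principles, selects as initial cluster centers a set of colors that occur most frequently and whose mutual distances in color space exceed a threshold. This preserves the dominant colors of the input image while representing as rich a range of colors as possible. Experiments show that the algorithm markedly improves running efficiency while maintaining the quality of the quantized image.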
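
The center-selection rule described here (most frequent colors, mutually farther apart than a threshold) can be sketched as a greedy pass over the color histogram. The function name `frequent_distinct_colors`, the Euclidean distance in RGB space, and the greedy order are assumptions; the abstract does not specify the color space or the threshold.

```python
from collections import Counter
import math

def frequent_distinct_colors(pixels, k, min_dist):
    """Greedily pick up to k frequent colors that are pairwise farther than min_dist."""
    counts = Counter(pixels)
    centers = []
    # Visit colors from most to least frequent.
    for color, _ in counts.most_common():
        # Accept a color only if it is far enough from every center chosen so far.
        if all(math.dist(color, c) > min_dist for c in centers):
            centers.append(color)
            if len(centers) == k:
                break
    return centers

pixels = [(255, 0, 0)] * 5 + [(250, 0, 0)] * 4 + [(0, 0, 255)] * 3 + [(0, 255, 0)]
palette_seed = frequent_distinct_colors(pixels, k=2, min_dist=30)
```

In this toy input the second most frequent color is skipped because it is nearly identical to the first, so the seed keeps the dominant red and still leaves room for blue.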

11.
While reducing the dimensionality of a corpus, concept decomposition (CD) based on fuzzy K-means (FKM) clustering provides better approximation than CD based on spherical k-means clustering. However, the performance of the FKM algorithm is limited by its distance metric, and it has been proved that assigning feature weights can improve the performance of FKM. Our work builds upon this analysis and proposes two approaches to feature weight selection. Using four testing document collections, we demonstrate that the CD based on the proposed feature-weighted FKM provides better approximation than the CD based on FKM while maintaining the quality of retrieval.

12.
Factorial K-means analysis (FKM) and Reduced K-means analysis (RKM) are clustering methods that aim at simultaneously achieving a clustering of the objects and a dimension reduction of the variables. Because a comprehensive comparison between FKM and RKM is lacking in the literature so far, a theoretical and simulation-based comparison between FKM and RKM is provided. It is shown theoretically how FKM’s versus RKM’s performances are affected by the presence of residuals within the clustering subspace and/or within its orthocomplement in the observed data. The simulation study confirmed that for both FKM and RKM, the cluster membership recovery generally deteriorates with increasing amount of overlap between clusters. Furthermore, the conjectures were confirmed that for FKM the subspace recovery deteriorates with increasing relative sizes of subspace residuals compared to the complement residuals, and that the reverse holds for RKM. As such, FKM and RKM complement each other. When the majority of the variables reflect the clustering structure, and/or standardized variables are being analyzed, RKM can be expected to perform reasonably well. However, because both RKM and FKM may suffer from subspace and membership recovery problems, it is essential to critically evaluate their solutions on the basis of the content of the clustering problem at hand.

13.
Multimodal decision-level fusion for person authentication   Total citations: 1 (self-citations: 0, citations by others: 1)
The use of clustering algorithms for decision-level data fusion is proposed. Person authentication results coming from several modalities (e.g., still image, speech) are combined by using fuzzy k-means (FKM) and fuzzy vector quantization (FVQ) algorithms, and a median radial basis function (MRBF) network. The quality measure of the modality data is used for fuzzification. Two modifications of the FKM and FVQ algorithms, based on a fuzzy vector distance definition, are proposed to handle the fuzzy data and utilize the quality measure. Simulations show that fuzzy clustering algorithms perform better than the classical clustering algorithms and other known fusion algorithms. MRBF performs better especially when two modalities are combined. Moreover, the use of the quality measure via the proposed modified algorithms increases the performance of the fusion system.

14.
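Fatigue life prediction is one of the core technologies in rubber component design. Using Abaqus, the S-N technique, and the FKM standard, the fatigue life of an elastic rubber suspension joint is successfully predicted. The prediction method also offers an approach and a design idea for evaluating the fatigue life of similar elastic rubber components.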

15.
Due to the continuous release of new products, manufacturers are paying attention to customer-oriented design of products that meet user needs, to minimize the risk of their products being rejected by the market. Because user cognition is ambiguous, it is difficult to accurately obtain the user's preference for individual products. To respond to this challenge, we propose an engineering research method that combines an interactive genetic algorithm using hesitation-based interval arithmetic with the fuzzy Kano model (FKM) to explore the emotional needs of users for product forms and to drive product modeling evolution design. Through expert interviews, the morphological characteristics and perceptual image factors of products that attract users are investigated. To identify the relationship between user satisfaction and the perceptual images, we use FKM to analyze the product image style that accurately meets the user's kansei needs and select 5 factors that are attractive attributes. We then transform these 5 factors into evaluation carriers that guide the evolution direction of product styling in HIIF-IGA, and optimize four electric bikes with scores over 8.8, realizing user-demand-driven product evolution design. To handle users' ambiguity, the FAHP method is used to quantify the user's emotional imagery criterion and to create a product evolution design system platform, which can automatically generate product styling design schemes in line with user preferences. The experimental results show that the proposed method can help enterprises effectively improve customer satisfaction and reduce the cost and time of product development.

16.
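To address the low skin color detection rate caused by complex image backgrounds and lighting, an elliptical-model skin color detection method based on splitting K-means clustering is proposed. The method applies light compensation to the image, balances its colors with the Gray World method, and builds an elliptical skin color model, chosen for its high detection efficiency, to detect skin color. Splitting K-means clustering (FKM) is then applied to the detected skin regions for a second skin color decision, which locates the skin regions more accurately. Experiments show that the proposed algorithm detects skin regions accurately and efficiently, with high accuracy and strong robustness.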

17.
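Image segmentation method based on splitting K-means clustering   Total citations: 1 (self-citations: 0, citations by others: 1)
Zhang Jian, Song Gang, 《计算机应用》 (Journal of Computer Applications), 2011, 31(2): 372-374
The fuzzy C-means clustering (FCM) algorithm is an effective unsupervised image segmentation method. It works for any number of classes and needs no prior knowledge of image features, but its clustering quality is directly affected by noise in the samples and by the initial classification conditions. A splitting K-means clustering (FKM) algorithm suited to color image segmentation is therefore proposed. The algorithm first denoises the samples with a median filter, then pre-classifies the image samples with a splitting clustering method to obtain an initial partition of the sample set, and finally, starting from this partition, iteratively optimizes the segmentation with K-means clustering based on probability distance. Experimental results show that the algorithm avoids the misclassifications of FCM, such as getting stuck in center dead zones, overlapping centers, and local minima, and also increases segmentation speed.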

18.
Malignant and benign types of tumor infiltrated in human brain are diagnosed with the help of an MRI scanner. With the slice images obtained using an MRI scanner, certain image processing techniques are utilized to have a clear anatomy of brain tissues. One such image processing technique is hybrid self-organizing map (SOM) with fuzzy K means (FKM) algorithm, which offers successful identification of tumor and good segmentation of tissue regions present inside the tissues of brain. The proposed algorithm is efficient in terms of Jaccard Index, Dice Overlap Index (DOI), sensitivity, specificity, peak signal to noise ratio (PSNR), mean square error (MSE), computational time and memory requirement. The algorithm proposed through this paper has better data handling capacities and it also performs efficient processing upon the input magnetic resonance (MR) brain images. Automatic detection of tumor region in MR (magnetic resonance) brain images has a high impact in helping the radio surgeons assess the size of the tumor present inside the tissues of brain and it also supports in identifying the exact topographical location of tumor region. The proposed hybrid SOM-FKM algorithm assists the radio surgeon by providing an automated tissue segmentation and tumor identification, thus enhancing radio therapeutic procedures. The efficiency of the proposed technique is verified using the clinical images obtained from four patients, along with the images taken from Harvard Brain Repository.

19.
In the irrigated regions of New South Wales and Victoria, Australia, secondary soil salinisation is becoming of increasing concern. However, natural resource data are not available to elucidate the threat. What is required is information on stratigraphy and on the spatial location and quality (i.e. low, intermediate and high salinity) of groundwater. Electromagnetic (EM) induction instruments have been used successfully to obtain this information because they measure bulk electrical conductivity (ECa), which is a function of clay content, mineralogy, salinity and moisture. In this paper, we show how data collected from an EM survey can be used to infer this information in the cotton growing area of the lower Namoi valley, Australia. The survey involved taking EM34 measurements at three coil spacings in the horizontal mode of operation (i.e. 10, 20 and 40 m). In all, 1869 locations were visited on an approximate 1-km grid. In order to objectively classify the ECa data into natural resource management units, we used fuzzy k-means (FKM). The classes obtained were subsequently mapped using a method that ensured summation of class membership values to unity and using local ordinary kriging. The use of a confusion index highlighted areas where the collection of additional information may be appropriate. Using fuzzy linear discriminant analysis we found that measurements obtained at the 10 m coil spacing reflect the shallow stratigraphy and physiography, whilst the 40 m coil spacing clearly differentiated parts of the clay plain underlain by saline aquifers. We conclude that the use of EM34 data and fuzzy k-means provide a good and non-destructive approach to representing the lower Namoi valley landscape.
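
The abstract does not give the formula for its confusion index. As a hedged illustration, the sketch below uses a definition common in fuzzy classification work, one minus the gap between the two largest class memberships at a location; whether the paper uses exactly this form is an assumption, and the function name `confusion_index` is hypothetical.

```python
def confusion_index(memberships):
    """1 - (largest membership - second largest); near 1 means an ambiguous class."""
    top, second = sorted(memberships, reverse=True)[:2]
    return 1.0 - (top - second)

# Memberships at each location are assumed to sum to one, as the abstract requires.
ambiguous = confusion_index([0.5, 0.5, 0.0])   # two classes equally likely
clear_cut = confusion_index([0.9, 0.05, 0.05])  # one class clearly dominant
```

Locations with values close to 1 are those where two classes are nearly tied, which matches the abstract's use of the index to flag areas that may need additional sampling.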

20.
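Spatio-temporal consistency is a key issue in distributed interactive simulation. Time synchronization among the networked simulation computers must reach a certain precision, and the resolution of the computer clock is an important factor limiting the synchronization precision of the simulation system. Under the Windows 2000/NT operating system, the PC system clock has low resolution and a high drift rate. In a distributed interactive simulation system built on PCs running Windows 2000/NT, the synchronization precision therefore cannot be very high if the system clock is used, while fitting every PC with a high-performance external clock would raise system cost and hinder system expansion. This paper proposes a design for a high-resolution, low-drift clock based on the PC performance counter, which solves the problem of obtaining a high-performance clock for distributed interactive simulation from the PC's own resources.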
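
The general idea of deriving a clock from a performance counter can be sketched as follows. This is only an analogy in Python, not the paper's Windows 2000/NT design: the class name `PerfCounterClock` is hypothetical, and the sketch reads the wall clock once and then advances it with `time.perf_counter_ns()`, the high-resolution monotonic counter exposed by the Python standard library.

```python
import time

class PerfCounterClock:
    """Wall-clock estimate anchored once, then advanced by a high-resolution counter."""

    def __init__(self):
        # Read the coarse system clock a single time as the anchor.
        self._wall_at_start = time.time()
        # Counter reading taken at (approximately) the same instant.
        self._ticks_at_start = time.perf_counter_ns()

    def now(self):
        # Elapsed time comes only from the monotonic performance counter.
        elapsed_ns = time.perf_counter_ns() - self._ticks_at_start
        return self._wall_at_start + elapsed_ns / 1e9

clock = PerfCounterClock()
first = clock.now()
second = clock.now()
```

Because later readings never consult the system clock again, they inherit the counter's resolution and cannot jump backwards. Synchronizing such clocks across machines, which is the paper's actual concern, is outside this sketch.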
