Similar literature
1.
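A new validity index for fuzzy clustering   (Total citations: 3; self-citations: 1; citations by others: 2)
孔攀, 邓辉文, 黄艳艳, 江欢. 《计算机工程》 (Computer Engineering), 2009, 35(12): 143-144
A new validity index for fuzzy clustering is proposed. The index identifies the optimal partition and the optimal number of clusters among the fuzzy partitions produced by the fuzzy C-means (FCM) algorithm. It combines compactness and separation information: compactness is computed from the within-class weighted sum of squared errors, and separation from the between-class similarity. Comparative experiments on 3 artificial and 3 real data sets show that the index outperforms other validity indices.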

2.
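Traditional clustering often gives unsatisfactory image segmentation results because it lacks effective guidance. To address this, a semi-supervised method is introduced into a multiobjective evolutionary fuzzy clustering algorithm, and a semi-supervised multiobjective evolutionary fuzzy clustering algorithm for image segmentation is proposed. The algorithm constructs a semi-supervised within-class compactness function and a between-class separation function, and uses the supervision information to guide the clustering process toward a set of non-dominated solutions. To select one optimal solution from the non-dominated set, a validity index based on a similarity measure is constructed from the supervision information. Experimental results show that the proposed method clearly outperforms unsupervised clustering methods in segmentation accuracy and visual quality.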

3.
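Fuzzy clustering is an important research topic in pattern recognition, machine learning, image processing and related fields. The fuzzy C-means clustering algorithm is the most widely used implementation of fuzzy clustering, but it requires the number of clusters to be specified in advance. A new cluster validity index is proposed for validating clustering results. From the perspectives of partition entropy, membership degree and geometric structure, the index defines three key measures: compactness, separation and overlap. On this basis, a method for determining the optimal number of clusters is proposed. The new index and traditional validity indices are compared experimentally on 6 artificial and 3 real data sets. The results show that the proposed index and method evaluate clustering results effectively and are suitable for determining the optimal number of clusters for a sample.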
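
The dependence of fuzzy C-means on a preset number of clusters is easiest to see in code. What follows is a minimal sketch of the standard FCM iteration in plain Python, not the implementation used in the paper; the function name `fcm`, the parameter names, the random initialization and the stopping rule are all assumptions made for illustration.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Minimal fuzzy C-means sketch. Returns (centers, u) with u[i][k] the
    membership of point k in cluster i. The cluster count c must be given."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Assumed initialization: random memberships, each column normalized to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col_sum = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= col_sum
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u^m.
        centers = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update from squared Euclidean distances.
        shift = 0.0
        for k in range(n):
            sq = [
                max(sum((points[k][d] - centers[i][d]) ** 2
                        for d in range(dim)), 1e-12)
                for i in range(c)
            ]
            inv = [s ** (-1.0 / (m - 1.0)) for s in sq]
            inv_sum = sum(inv)
            for i in range(c):
                new_u = inv[i] / inv_sum
                shift = max(shift, abs(new_u - u[i][k]))
                u[i][k] = new_u
        if shift < tol:  # stop once memberships no longer change
            break
    return centers, u
```

A validity index is then typically evaluated on the `(centers, u)` pair returned for each candidate value of `c`, and the value of `c` with the best index score is kept.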

4.
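Using compactness and separation measured through a kernel function, a new cluster validity index, KKW, is presented. The optimal number of clusters obtained from the KKW index is used in a modified kernel fuzzy clustering algorithm (MKFCM); the mapping through the modified kernel function brings out features that were not previously apparent. Clustering the Wine and glass data sets with MKFCM gives a clustering accuracy above 90% for every class; on the Wisconsin Breast Cancer data, which has missing values, the misclassification rate is 4.72%. Compared with classical clustering algorithms, the method converges faster and is more accurate. Simulation results confirm the feasibility and effectiveness of the modified kernel clustering method.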

5.
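Combining the within-class compactness and between-class separation information of fuzzy clustering, a new fuzzy clustering validity index is proposed. The index can determine the optimal partition and the best number of clusters among the fuzzy partitions produced by the fuzzy C-means (FCM) algorithm. Comparative experiments on 1 artificial and 4 real data sets demonstrate the superior performance of the index.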

6.
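A new cluster validity function   (Total citations: 2; self-citations: 1; citations by others: 2)
A cluster validity function is an index for evaluating the quality of clustering results; specifying the initial number of clusters accurately makes the clustering result more reasonable. Based on fuzzy uncertainty theory and the basic properties of the clustering problem, a new compactness measure Di(U; c) is introduced, and on this basis a validity function aimed at finding the optimal number of clusters is proposed. The function is based on the compactness and separation characteristics of the data set, and takes into account both the membership degrees of the data points and the geometric structure of the data set. Experimental results show that the validity function finds the optimal number of clusters, performs well on data sets with a reasonably clear class structure, and is robust to the weighting coefficient.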

7.
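To address the need to preset the number of clusters in the fuzzy C-means (FCM) algorithm, a new fuzzy clustering validity index is proposed. First, the variance of each attribute within a cluster is computed; attributes with smaller variance receive larger weights and attributes with larger variance receive smaller weights, giving an attribute-weighted FCM algorithm. Then, within-class compactness and between-class separation are computed from the membership matrix produced by the improved FCM algorithm. Finally, a new cluster validity index is defined from the within-class compactness and between-class separation. Experimental results show that the index finds the number of classes that matches the natural distribution of the data. The attribute-weighted FCM algorithm can identify the importance of different attributes and improve clustering accuracy, and the validity index defined from its membership matrix finds the correct number of clusters, making the clustering process unsupervised.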
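
The abstract does not give the weighting formula, so the following Python sketch only illustrates the stated idea that smaller variance means larger weight. Normalized inverse-variance weights are an assumption, as are the helper names `attribute_weights` and `weighted_sq_distance`; the paper's actual scheme may differ.

```python
def attribute_weights(cluster_points, eps=1e-9):
    """Per-attribute weights for one cluster: inverse variance, normalized
    to sum to 1 (an assumed scheme, not the paper's formula)."""
    n = len(cluster_points)
    dim = len(cluster_points[0])
    raw = []
    for d in range(dim):
        column = [p[d] for p in cluster_points]
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / n
        raw.append(1.0 / (variance + eps))  # eps guards against zero variance
    total = sum(raw)
    return [w / total for w in raw]


def weighted_sq_distance(x, v, weights):
    """Attribute-weighted squared Euclidean distance between x and center v."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, v))
```

In an attribute-weighted FCM of this kind, `weighted_sq_distance` would take the place of the plain squared distance in the membership update, so tightly clustered attributes dominate the assignment.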

8.
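The quality of the clusters finally produced by the fuzzy C-means (FCM) clustering algorithm is affected by many factors, including the initial values, the chosen number of clusters and the parameter settings. This paper compares and analyzes how 5 recently published, representative cluster validity indices evaluate FCM clustering under different data dimensionalities, numbers of clusters and parameters. The experimental results show, among other findings, that validity indices based on the ratio of within-class compactness to between-class dispersion are fairly robust to data dimensionality and noise, whereas membership-based validity indices are unsuitable for high-dimensional data. These results can help researchers choose a suitable fuzzy clustering validity function for different application settings.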

9.
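The traditional fuzzy C-means clustering algorithm is sensitive to the initial cluster centers and easily trapped in local optima. To address this, particle swarm optimization is combined with the FCM algorithm to give an improved fuzzy clustering algorithm. The algorithm uses the global search ability of particle swarm optimization, in place of FCM, to find the initial cluster centers, so that the clustering escapes local optima. The fitness function is redesigned mainly from the perspective of within-class compactness and between-class separation, which reflect how the data set is classified. Experimental results show that the proposed algorithm performs better in clustering accuracy and on validity indices.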

10.
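A fuzzy texture image segmentation algorithm based on the dual-tree complex wavelet transform is proposed. The method has two stages, texture feature extraction and texture classification. Feature extraction is carried out on the basis of the dual-tree complex wavelet transform. Texture classification could be done directly by clustering with the fuzzy C-means algorithm to segment the textures, but the membership function in that algorithm is designed from the distance between a sample and the class centers, which is poorly suited to non-spherically distributed data. To address this, a sample-to-sample compactness is introduced to measure the relationships among the samples within a class and thereby correct the membership function, which is then used for texture classification. Experimental results show that, with a running time close to that of the fuzzy C-means algorithm, the improved method gives clear improvements in segmentation accuracy, edge accuracy and region consistency.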

11.
Cluster validity indices are used for estimating the quality of partitions produced by clustering algorithms and for determining the number of clusters in data. Cluster validation is a difficult task, because the same data set admits several partitions, at different levels of detail, that fit its natural groupings. Even though several cluster validity indices exist, they are inefficient when clusters differ widely in density or size. We propose a cluster validity index that addresses these issues. It is based on compactness and overlap measures. The overlap measure, which indicates the degree of overlap between fuzzy clusters, is obtained by calculating the overlap rate of all data objects that belong strongly enough to two or more clusters. The compactness measure, which indicates the degree of similarity of data objects in a cluster, is calculated from the membership values of data objects that are strongly enough associated with one cluster. We propose ratio-type and summation-type indices using the same compactness and overlap measures. The maximal value of the index denotes the optimal fuzzy partition, which is expected to have high compactness and a low degree of overlap among clusters. Tests of many well-known existing indices and the proposed index on well-known data sets showed the superior reliability and effectiveness of the proposed index, especially when evaluating partitions with clusters that differ widely in size or density.

12.
This article describes a multiobjective spatial fuzzy clustering algorithm for image segmentation. To obtain satisfactory segmentation performance for noisy images, the proposed method introduces non-local spatial information derived from the image into fitness functions that consider, respectively, the global fuzzy compactness and the fuzzy separation among the clusters. After producing the set of non-dominated solutions, the final clustering solution is chosen by a cluster validity index that uses the non-local spatial information. Moreover, to automatically evolve the number of clusters, a real-coded variable string length technique is used to encode the cluster centers in the chromosomes. The proposed method is applied to synthetic and real images contaminated by noise and compared with k-means, fuzzy c-means, two fuzzy c-means clustering algorithms with spatial information, and a multiobjective variable string length genetic fuzzy clustering algorithm. The experimental results show that the proposed method behaves well in evolving the number of clusters and obtains satisfactory performance on noisy image segmentation.

13.
Cluster validation, the process of evaluating the performance of clustering algorithms under varying input conditions, is a major issue in cluster analysis for data mining. Many existing validity indices address clustering results for low-dimensional data. In high-dimensional data, many of the dimensions are irrelevant, and the clusters usually exist only in some projected subspaces spanned by different combinations of dimensions. This paper presents a solution to the problem of cluster validation for projective clustering. We propose two new measurements for the intracluster compactness and intercluster separation of projected clusters. Based on these measurements and the conventional indices, three new cluster validity indices are presented. Combined with a fuzzy projective clustering algorithm, the new indices are used to determine the number of projected clusters in high-dimensional data. The suitability of our proposal has been demonstrated through an empirical study using synthetic and real-world datasets.

14.
Cluster validity indexes are important tools designed for two purposes: comparing the performance of clustering algorithms and determining the number of clusters that best fits the data. These indexes are generally constructed by combining a measure of compactness and a measure of separation. A classical measure of compactness is the variance. For separation, the distance between cluster centers is used. However, such a distance does not always reflect the quality of the partition between clusters and sometimes gives misleading results. In this paper, we propose a new cluster validity index in which Jeffrey divergence is used to measure the separation between clusters. Experiments are conducted on different types of data, and comparison with widely used cluster validity indexes shows that the proposed index outperforms them.

15.
Cluster validity indices are used to validate the results of clustering and to find a set of clusters that best fits the natural partitions of a given data set. Most previous validity indices depend considerably on the number of data objects in clusters, on cluster centroids and on average values. They tend to ignore small clusters and clusters with low density. Two cluster validity indices are proposed for efficient validation of partitions containing clusters that differ widely in size and density. The first proposed index exploits a compactness measure and a separation measure, and the second index is based on an overlap measure and a separation measure. The compactness and overlap measures are calculated from a few data objects of a cluster, while the separation measure uses all data objects. The compactness measure is calculated only from data objects of a cluster that are far enough away from the cluster centroids, while the overlap measure is calculated from data objects that are near enough to one or more other clusters. A good partition is expected to have a low degree of overlap and a larger separation distance and compactness. The maximum value of the ratio of compactness to separation and the minimum value of the ratio of overlap to separation indicate the optimal partition. Tests of both proposed indices on some artificial data sets and three well-known real data sets showed their effectiveness and reliability.

16.
In this paper, an approach is proposed for automatically clustering a data set into a number of fuzzy partitions using simulated annealing with a reversible jump Markov chain Monte Carlo algorithm. This is in contrast to the widely used fuzzy clustering scheme, the fuzzy c-means (FCM) algorithm, which requires a priori knowledge of the number of clusters. The approach performs the clustering by optimizing a cluster validity index, the Xie-Beni index. It uses the homogeneous reversible jump Markov chain Monte Carlo (RJMCMC) kernel as the proposal, so that the algorithm is able to jump between different dimensions, i.e., numbers of clusters, until the correct value is obtained. Different moves, such as birth, death, split, merge, and update, are used for sampling a candidate state given the current state. The effectiveness of the proposed technique in optimizing the Xie-Beni index, and thereby determining the appropriate clustering, is demonstrated for both artificial and real-life data sets. In one part of the investigation, the utility of the fuzzy clustering scheme for classifying pixels in an IRS satellite image of Kolkata is studied. A technique for reducing the computational effort in the case of satellite image data is incorporated.
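
The Xie-Beni index named above is the ratio of fuzzy within-cluster scatter to the number of points times the smallest squared distance between two cluster centers; lower values indicate a better partition. A minimal Python sketch follows, assuming Euclidean distance and a membership matrix laid out as `u[i][k]` (cluster i, point k). The function name and layout are illustrative and not taken from the paper, and the sampler itself is not shown.

```python
def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index for a fuzzy partition; needs at least two centers.

    points  : list of coordinate tuples
    centers : list of cluster-center coordinate tuples
    u       : u[i][k] = membership of point k in cluster i
    m       : fuzzifier applied to the memberships
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n, c = len(points), len(centers)
    # Compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        (u[i][k] ** m) * sq_dist(points[k], centers[i])
        for i in range(c) for k in range(n)
    )
    # Separation: smallest squared distance between two distinct centers.
    separation = min(
        sq_dist(centers[i], centers[j])
        for i in range(c) for j in range(i + 1, c)
    )
    return compactness / (n * separation)
```

For example, with points at 0, 2, 10 and 12 on a line, centers at 1 and 11, and crisp memberships, the compactness is 4 and the separation is 100, so the index is 4 / (4 × 100) = 0.01.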

17.
Novel Cluster Validity Index for FCM Algorithm   (Total citations: 5; self-citations: 0; citations by others: 5)
Determining an appropriate number of clusters is very important when applying a specific clustering algorithm such as c-means or fuzzy c-means (FCM). In the literature, most cluster validity indices originate from the partition or the geometrical properties of the data set. In this paper, the authors develop a novel cluster validity index for FCM, based on the optimality test of FCM. Unlike previous cluster validity indices, this novel index is inherent in FCM itself. Comparison experiments show that the stability index can be used as a cluster validity index for fuzzy c-means.

18.
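A cluster validity index based on a fuzzy partition measure   (Total citations: 1; self-citations: 0; citations by others: 1)
Cluster validity indices are used to evaluate the validity of clustering results. Based on the basic properties of clustering, a new cluster validity index for finding the optimal fuzzy partition is proposed. The index evaluates the validity of a fuzzy clustering using two key factors, a fuzzy partition measure and information entropy. The fuzzy partition measure evaluates the within-class compactness and between-class separation of the clustering, while the information entropy reflects the degree of uncertainty of the fuzzy partition. Experimental results show that the index evaluates the validity of fuzzy clustering results correctly, especially for spatial data; compared with other validity indices, it not only finds the optimal fuzzy partition but is also insensitive to the weighting coefficient.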
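
The abstract does not define its entropy term. As a point of reference only, the sketch below computes Bezdek's classical partition entropy and partition coefficient, which quantify the uncertainty of a fuzzy partition from the membership matrix alone. These are swapped-in textbook measures, not the index proposed in the paper, and the `u[i][k]` layout is an assumption.

```python
import math


def partition_entropy(u):
    """Bezdek's partition entropy: 0 for a crisp partition and log(c) when
    every point belongs equally to all c clusters."""
    n = len(u[0])
    # Terms with zero membership contribute nothing (x * log x tends to 0).
    return -sum(x * math.log(x) for row in u for x in row if x > 0.0) / n


def partition_coefficient(u):
    """Bezdek's partition coefficient: 1 for a crisp partition and 1/c for
    a uniformly fuzzy one."""
    n = len(u[0])
    return sum(x * x for row in u for x in row) / n
```

Lower entropy, or a higher coefficient, indicates a less ambiguous partition. Neither measure uses the geometry of the data, which may be one reason an entropy term is combined with compactness and separation in indices like the one described above.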

19.
Clustering algorithms tend to generate clusters even when applied to random data. This paper provides a semi-tutorial review of the state of the art in cluster validity, that is, the verification of results from clustering algorithms. The paper covers ways of measuring clustering tendency, the fit of hierarchical and partitional structures, and indices of compactness and isolation for individual clusters. Included are structural criteria for validating clusters and the factors involved in choosing criteria, according to which the literature on cluster validity is classified. An application to speaker identification demonstrates several indices. The development of new clustering techniques and the wide availability of clustering programs necessitate vigorous research in cluster validity.
