1.

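Tian Shengwen (田生文), Huang Mingming (黄明明) 《Computer Engineering and Design (计算机工程与设计)》2007,28(2):436-439

To address the fuzzy C-means clustering algorithm's bias toward spherical clusters and its high sensitivity to outliers, a two-stage fuzzy clustering algorithm based on dense cluster centers is proposed. The algorithm introduces a cluster validity measure function and handles outliers explicitly, and each final fuzzy cluster is represented jointly by multiple representative points. As a result, it can effectively discover the natural number of clusters in a data set, has no preference for cluster size or shape, and is robust to outliers. In addition, a random sampling step makes it straightforward to extend the algorithm to large data sets. An experimental comparison with the fuzzy C-means clustering algorithm also shows the advantages of the proposed algorithm.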

2.

Determining number of clusters and prototype locations via multi-scale clustering

Eiji Nakamura, Nasser Kehtarnavaz 《Pattern recognition letters》1998,19(14):1265-1283

In clustering algorithms, it is usually assumed that the number of clusters is known or given. In the absence of such a priori information, a procedure is needed to find an appropriate number of clusters. This paper presents a clustering algorithm that incorporates a mechanism for finding the appropriate number of clusters as well as the locations of cluster prototypes. This algorithm, called multi-scale clustering, is based on scale-space theory by considering that any prominent data structure ought to survive over many scales. The number of clusters as well as the locations of cluster prototypes are found in an objective manner by defining and using lifetime and drift speed clustering criteria. The outcome of this algorithm does not depend on the initial prototype locations that affect the outcome of many clustering algorithms. As an application of this algorithm, it is used to enhance the Hough transform technique.

3.

On cluster validity index for estimation of the optimal number of fuzzy clusters

Dae-Won Kim, Kwang H. Lee, Doheon Lee 《Pattern recognition》2004,37(10):2009-2025

A new cluster validity index is proposed that determines the optimal partition and optimal number of clusters for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index exploits an overlap measure and a separation measure between clusters. The overlap measure, which indicates the degree of overlap between fuzzy clusters, is obtained by computing an inter-cluster overlap. The separation measure, which indicates the isolation distance between fuzzy clusters, is obtained by computing a distance between fuzzy clusters. A good fuzzy partition is expected to have a low degree of overlap and a larger separation distance. Testing of the proposed index and nine previously formulated indexes on well-known data sets showed the superior effectiveness and reliability of the proposed index in comparison to other indexes.
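The abstract above does not give the formula of the proposed index, so the following is only a minimal sketch of the general workflow it describes: run fuzzy c-means for several candidate cluster counts and score each partition with a validity index. The index used here is the classical Xie-Beni ratio of compactness to separation (lower is better), chosen as a stand-in and not the index from the paper. The function names, the toy data, and the parameter defaults are all assumptions made for illustration.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def fuzzy_c_means(points, c, m=2.0, iters=200, tol=1e-6, seed=0):
    """Plain fuzzy c-means with fuzzifier m; returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Memberships are proportional to d_ij ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            inv = [max(sq_dist(points[i], v), 1e-12) ** (-1.0 / (m - 1.0))
                   for v in centers]
            total = sum(inv)
            new_row = [x / total for x in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:
            break
    return centers, u

def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy compactness divided by n times the smallest
    squared distance between centers. This is a stand-in index only."""
    n, c = len(points), len(centers)
    compact = sum(u[i][j] ** m * sq_dist(points[i], centers[j])
                  for i in range(n) for j in range(c))
    sep = min(sq_dist(centers[j], centers[k])
              for j in range(c) for k in range(j + 1, c))
    return compact / (n * max(sep, 1e-12))

# Hypothetical toy data: three small, well-separated groups of 9 points each.
offsets = (-0.3, 0.0, 0.3)
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx in offsets for dy in offsets]

# Score each candidate number of clusters; the smallest value is preferred.
scores = {}
for c in range(2, 6):
    centers, u = fuzzy_c_means(points, c)
    scores[c] = xie_beni(points, centers, u)
```

Because fuzzy c-means depends on its random initialisation, a more careful comparison would repeat each run with several seeds and keep the best score per cluster count.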

4.

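An optimal selection model for fuzzy classifications based on the analytic hierarchy process (cited by: 1 total, 0 self-citations, 1 by others)

Li Chunsheng (李春生), Wang Yaonan (王耀南), Chen Guanghui (陈光辉), Jiang Hongfeng (蒋宏锋) 《Control and Decision (控制与决策)》2009,24(12)

Different fuzzy classification algorithms often produce different fuzzy classifications on the same data set. To decide which method best reveals the true structure of the data, fuzzy classification validity indices are used as evaluation criteria and the analytic hierarchy process is applied to evaluate the candidate fuzzy classifications comprehensively, yielding a model for selecting the best fuzzy classification. Extensive experiments show that the fuzzy classification selected by the model achieves a high pattern recognition rate and reveals the true structure of the data.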

5.

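Research progress on fuzzy clustering validity (cited by: 1 total, 1 self-citation, 1 by others)

Tang Minghui (唐明会), Yang Yan (杨燕) 《Computer Engineering & Science (计算机工程与科学)》2009,31(9)

Cluster validity evaluation is of great importance to cluster analysis and is one of its bottlenecks. This paper surveys commonly used fuzzy clustering validity functions from two perspectives, methods based on the fuzzy partition of the data set and methods based on the geometric structure of the data set, and discusses the automatic determination of the optimal number of fuzzy clusters.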

6.

Daoqiang Zhang, Songcan Chen, Zhi-Hua Zhou 《International Journal of Software and Informatics》2007,1(1):67-84

In this paper, the well-known competitive clustering algorithm (CA) is revisited and reformulated from the point of view of entropy minimization. That is, the second term of the objective function in CA can be seen as quadratic or second-order entropy. Following this novel explanation, two generalized competitive clustering algorithms inspired by Renyi entropy and Shannon entropy, i.e. RECA and SECA, are respectively proposed in this paper. Simulation results show that CA requires a large number of initial clusters to obtain the right number of clusters, while RECA and SECA require small and moderate numbers of initial clusters respectively. Also, RECA and SECA need fewer iteration steps than CA. Further, CA and RECA are generalized to CA-p and RECA-p by using the p-order entropy and Renyi's p-order entropy in CA and RECA respectively. Simulation results show that the value of p has a great impact on the performance of CA-p, whereas it has little influence on that of RECA-p.
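As background for the entropies named in this abstract, here is a small sketch using their standard textbook definitions: Shannon entropy and Renyi entropy of order p, where p = 2 gives the quadratic (second-order) case. Applying them to the share of total membership held by each cluster is an assumption made for illustration; the abstract does not spell out the exact objective functions of RECA or SECA, and the function names below are hypothetical.

```python
import math

def shannon_entropy(p):
    """Shannon entropy -sum(p_j * log p_j) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def renyi_entropy(p, order=2.0):
    """Renyi entropy log(sum(p_j ** order)) / (1 - order).
    order = 2 is the quadratic case; order = 1 falls back to Shannon."""
    if order == 1.0:
        return shannon_entropy(p)
    return math.log(sum(x ** order for x in p if x > 0)) / (1.0 - order)

def cluster_proportions(u):
    """Share of total membership per cluster for an n x c membership
    matrix whose rows each sum to 1 (assumed)."""
    n, c = len(u), len(u[0])
    return [sum(row[j] for row in u) / n for j in range(c)]

# Memberships spread evenly over four clusters versus mostly in one.
spread = cluster_proportions([[0.25, 0.25, 0.25, 0.25]] * 8)
concentrated = cluster_proportions([[0.97, 0.01, 0.01, 0.01]] * 8)

h_spread = renyi_entropy(spread, 2.0)
h_concentrated = renyi_entropy(concentrated, 2.0)
```

Both entropies are largest when membership is spread evenly across clusters and shrink as membership concentrates in a few clusters, which is why minimizing such a term tends to empty out and discard superfluous clusters.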

7.

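Research on the optimal number of clusters for semantics-based Chinese text clustering

Liu Jinling (刘金岭) 《Computer Engineering and Design (计算机工程与设计)》2010,31(9)

The effect of the chosen number of clusters on the clustering of large-sample data is analyzed, and several mainstream views on cluster quality measures are examined. Using the concept of text similarity, the problem of the optimal number of clusters for text semantics is studied, and CTBP, an algorithm that finds the optimal number of text clusters based on the clustering process, is proposed. Its main idea is to extract one term from each text vector in the text vector set, sort the terms by similarity, and partition them incrementally layer by layer to obtain the number of clusters corresponding to the optimal partition. In this way, multiple statistics can be gathered in a single scan of the data, and the optimal solution is then computed. Experimental results demonstrate the high quality and efficiency of the algorithm.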

8.

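A new fuzzy clustering validity index

Geng Jiayi (耿嘉艺), Qian Xuezhong (钱雪忠), Zhou Shibing (周世兵) 《Application Research of Computers (计算机应用研究)》2019,36(4)

Fuzzy clustering is an important research topic in pattern recognition, machine learning, image processing, and related fields. The fuzzy C-means clustering algorithm is the most commonly used fuzzy clustering algorithm, but it requires the number of clusters to be specified in advance. A new cluster validity index is proposed to validate clustering results. From the perspectives of partition entropy, membership degree, and geometric structure, the index defines three key measures: compactness, separation, and overlap. On this basis, a method for determining the optimal number of clusters is proposed. The new index and traditional validity indices are tested on 6 artificial data sets and 3 real data sets. The experimental results show that the proposed index and method evaluate clustering results effectively and are suitable for determining the optimal number of clusters for a sample.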
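The new index itself is not specified in this abstract. As background for the partition-based ingredients it mentions, the sketch below computes two classical measures of a fuzzy membership matrix, Bezdek's partition coefficient and partition entropy. They are shown only as well-known reference points, not as the proposed index, and the function names are chosen here for illustration.

```python
import math

def partition_coefficient(u):
    """Mean of squared memberships over an n x c membership matrix.
    Equals 1 for a crisp partition and 1/c for a fully fuzzy one."""
    n = len(u)
    return sum(x * x for row in u for x in row) / n

def partition_entropy(u):
    """Mean of -u_ij * log(u_ij). Equals 0 for a crisp partition and
    log(c) for a fully fuzzy one."""
    n = len(u)
    return -sum(x * math.log(x) for row in u for x in row if x > 0) / n

crisp = [[1.0, 0.0], [0.0, 1.0]]   # every point belongs to one cluster
fuzzy = [[0.5, 0.5], [0.5, 0.5]]   # every point is split evenly

pc_crisp, pe_crisp = partition_coefficient(crisp), partition_entropy(crisp)
pc_fuzzy, pe_fuzzy = partition_coefficient(fuzzy), partition_entropy(fuzzy)
```

Both measures use the memberships alone and ignore the geometry of the data, which is one reason indices of the kind described above combine them with compactness and separation terms.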

9.

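A fuzzy clustering validity function based on the overlap increment

Ou Weihua (欧卫华) 《Computing Technology and Automation (计算技术与自动化)》2009,28(4):115-118

Overlap degree is proposed to characterize the distance between fuzzy clusters. Because the total overlap of a fuzzy partition tends to increase monotonically with the number of clusters, a cluster validity function based on the increment of the overlap degree is proposed. The method determines the optimal number of clusters from the maximum overlap increment, which overcomes the monotonicity problem of traditional validity functions and is simple to compute. Based on the fuzzy C-means clustering algorithm (FCM), its performance is analyzed on several test data sets and compared in depth with widely used, representative validity functions. Simulation results demonstrate the effectiveness and superiority of the function.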

10.

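A method for determining the optimal number of clusters for the K-means algorithm (cited by: 10 total, 0 self-citations, 10 by others)

Zhou Shibing (周世兵), Xu Zhenyuan (徐振源), Tang Xuqing (唐旭清) 《Journal of Computer Applications (计算机应用)》2010,30(8):1995-1998

The K-means clustering algorithm clusters a data set on the premise of a fixed number of clusters k, but the number of clusters usually cannot be determined in advance. A new cluster validity index is designed from the perspective of the geometric structure of the samples, and on this basis a new method for determining the optimal number of clusters for K-means is proposed. Theoretical analysis and experimental results verify the effectiveness and good performance of the proposed approach.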
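The abstract does not define its validity index, so the sketch below only illustrates the general procedure it describes: run K-means for a range of k and keep the k with the best geometric validity score. The mean silhouette width is used as a stand-in index, and a deterministic farthest-first initialisation replaces the usual random start; both choices, along with the function names and toy data, are assumptions for illustration rather than the paper's method.

```python
import math

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; returns (labels, centers)."""
    centers = farthest_first(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(x) / len(members) for x in zip(*members)]
    return labels, centers

def mean_silhouette(points, labels, k):
    """Average silhouette width; points in singleton clusters score 0."""
    total = 0.0
    for i, p in enumerate(points):
        sums, counts = [0.0] * k, [0] * k
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] += dist(p, q)
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:
            continue
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min((sums[c] / counts[c] for c in range(k)
                 if c != own and counts[c] > 0), default=a)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(points, kmeans(points, k)[0], k)
              for k in k_values}
    return max(scores, key=scores.get), scores

# Hypothetical toy data: three small, well-separated groups of 9 points each.
offsets = (-0.3, 0.0, 0.3)
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx in offsets for dy in offsets]
k_best, scores = best_k(points, range(2, 6))
```

On real data the silhouette computation is quadratic in the number of samples, so sampling or a cheaper index may be preferable for large data sets.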