1.
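Technical Methods of Cluster Analysis in Data Mining (total citations: 1; self-citations: 0; citations by others: 1)
Data mining has been a very active research direction in the information industry in recent years, and cluster analysis is one of its core techniques. The paper classifies the various clustering algorithms, analyzes representative algorithms in detail, and compares them from several perspectives, providing a reference for studying these algorithms and applying them in different fields. It also describes applications of cluster analysis in data mining.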
2.
3.
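Technical Methods of Cluster Analysis in Data Mining (total citations: 31; self-citations: 21; citations by others: 31)
Data mining has been a very active research direction in the information industry in recent years, and cluster analysis is one of its core techniques. The paper analyzes cluster analysis methods and representative algorithms in the data mining field, compares the performance of these algorithms from several perspectives, and describes several applications of cluster analysis in data mining.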
4.
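Research on Cluster Analysis Methods and Tool Applications (total citations: 2; self-citations: 0; citations by others: 2)
Clustering is an important research direction in the data mining field. The paper introduces the basic concepts and main methods of clustering, uses concrete examples to compare the clustering performance of two internationally advanced data mining tools (SPSS and DBiner), and then draws conclusions.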
5.
6.
7.
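Research and Application of the K-means Algorithm Based on Cluster Analysis (total citations: 12; self-citations: 0; citations by others: 12)
The paper discusses cluster analysis and its algorithms and compares the performance of these algorithms from several perspectives. Using data on children's growth and development as an example, it applies cluster analysis software and an improved K-means algorithm to further illustrate the practical use of cluster analysis in data mining.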
8.
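Spatial data mining is a branch of data mining, and spatial cluster analysis is an important research area within it. The traditional K-means method converges quickly and is simple to implement, but it easily falls into local optima and is sensitive to the initial solution. Genetic algorithms are global search algorithms, but they converge slowly. The paper proposes an improved genetic algorithm for clustering that combines global search with local search and achieves good results. Experiments show that the proposed algorithm is better than the classical K-means clustering algorithm at finding the global optimum (or a near-global optimum) in cluster analysis, and that its local convergence speed and global convergence performance are good.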
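The sensitivity of K-means to its initial solution, mentioned in entry 8 above, can be seen in a small sketch. The data set, the starting centers, and the function name `kmeans_1d` are illustrative assumptions, and the restart loop at the end is a plain "try several starts, keep the best" strategy, not the genetic algorithm proposed in the paper.

```python
def kmeans_1d(points, init_centers, max_iter=100):
    """Plain Lloyd-style K-means on 1-D data; returns (centers, sse)."""
    centers = [float(c) for c in init_centers]
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest center (ties go to the lower index).
        new_assignment = [
            min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing: converged
        assignment = new_assignment
        # Move each center to the mean of its members; empty clusters stay put.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    sse = sum((p - centers[a]) ** 2 for p, a in zip(points, assignment))
    return centers, sse


# Hypothetical data: three well-separated groups on a line.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]

# A start near each group recovers the three groups.
good_centers, good_sse = kmeans_1d(points, [1.0, 11.0, 21.0])

# A start with all centers inside the first group gets stuck in a worse solution.
bad_centers, bad_sse = kmeans_1d(points, [0.0, 1.0, 2.0])

# Simple mitigation (an assumption, not the paper's method): keep the best of several starts.
candidate_inits = [[0.0, 1.0, 2.0], [1.0, 11.0, 21.0], [10.0, 11.0, 12.0]]
best_sse = min(kmeans_1d(points, init)[1] for init in candidate_inits)

print(good_sse, bad_sse, best_sse)
```

Both runs converge, but the second one stops at a local optimum with a much larger sum of squared errors, which is the behavior that motivates combining K-means with a global search.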
9.
10.
11.
Youji Fukada 《Pattern recognition》1980,12(6):395-403
This paper describes two clustering procedures for region analysis of image data and discusses the security of these algorithms theoretically. First our algorithms find kernels of regions and then classify pixels into regions using these kernels. The first algorithm distinguishes the regions that have far more distances than the given distance and the second algorithm distinguishes C regions that are great distances from each other in the feature space. These parameters are criteria which decide whether regions are similar or dissimilar. Examples are presented in order to show how these algorithms work for real image data.
12.
Gabriela Moise Arthur Zimek Peer Kröger Hans-Peter Kriegel Jörg Sander 《Knowledge and Information Systems》2009,21(3):299-326
Subspace and projected clustering have emerged as a possible solution to the challenges associated with clustering in high-dimensional
data. Numerous subspace and projected clustering techniques have been proposed in the literature. A comprehensive evaluation
of their advantages and disadvantages is urgently needed. In this paper, we evaluate systematically state-of-the-art subspace
and projected clustering techniques under a wide range of experimental settings. We discuss the observed performance of the
compared techniques, and we make recommendations regarding what type of techniques are suitable for what kind of problems.
13.
14.
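To address the tendency of the K-prototypes clustering algorithm to fall into local optima and its sensitivity to initial values when processing mixed-type intrusion detection data, the paper proposes an intrusion detection method that combines K-prototypes with fuzzy evaluation. K-prototypes is used to group the data statistically, a fuzzy evaluation model is built within the clusters, and each record is judged twice, from both the statistical and the feature perspective. Experimental results show that combining the two algorithms improves on the detection performance of either algorithm used alone, raising the detection rate and lowering the false detection rate.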
15.
Bulent Tutmez 《Applied Soft Computing》2012,12(1):1-13
Fuzzy clustering based regression analysis is a novel hybrid approach to capture the linear structure while considering the classification structure of the measurement. Using the concept that weights provided via the fuzzy degree of clustering, some regression models have been proposed in literature. In these models, membership values derived from clustering or some weights obtained from geometrical functions are employed as the weights of regression system. This paper addresses a weighted fuzzy regression analysis based on spatial dependence measure of the memberships. By the methodology presented in this paper, the relative weights are used in fuzzy regression models instead of direct membership values or their geometrical transforms. The experimental studies indicate that the spatial dependence based analyses yield more reliable results to show the correlation of the independent variables into the dependent variable. In addition, it has been observed that spatial dependence based models have high estimation and generalization capacities.
16.
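Research and Application of Fuzzy Clustering Based on Genetic Algorithms (total citations: 4; self-citations: 0; citations by others: 4)
To overcome the sensitivity of traditional clustering algorithms to initialization, the paper proposes a fuzzy clustering method based on an enhanced genetic algorithm. It combines the genetic termination criterion with the stopping criterion of the traditional algorithm, which improves both the clustering performance and the convergence speed of the algorithm. The method is more efficient than blind search and more general than algorithms designed for specific problems. Its successful use in the HRM system of a large Chinese dairy group demonstrates the effectiveness and generality of the algorithm.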
17.
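Cluster analysis is a very important problem in the research and development of intelligent systems. The paper proposes a similarity measure between Vague sets based on the degree of unknownness and the core. Taking the autonomy and computational complexity of the algorithm into account, and drawing on related cluster analysis methods for Fuzzy sets, it gives a direct clustering algorithm that uses the Vague-set similarity measure as its evaluation criterion. Using the similarity measure, computations are carried out with both the Vague transitive closure method and the Vague direct clustering method. Experimental results show that direct clustering based on the Vague similarity measure is simple to compute, does not distort the original information, places no particular requirement on the volume of data, and is more effective than the Vague transitive closure method.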
18.
Neural Computing and Applications - Movie box-office research is an important work for the rapid development of the film industry, and it is also a challenging task. Our study focuses on finding...
19.
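To address the spatio-temporal analysis of traffic states in urban regional road networks, the paper proposes a road traffic state discrimination model and analysis method based on fuzzy C-means (FCM) clustering. Through quantitative analysis of the traffic state of the spatial units of the road network and FCM analysis of a large volume of historical data, it mines the cluster centers of each class of traffic state for each spatial unit. Traffic data collected in real time are then matched against the cluster centers to judge the real-time traffic state. Finally, based on the spatial distribution of the units in the road network, layered spatial analysis results at the point, line, and area levels are obtained for each state. A case study shows that the method accurately discriminates the spatio-temporal traffic state of a regional road network and provides decision-support information for fine-grained traffic management.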
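The matching step described in entry 19 above, comparing a real-time sample against cluster centers learned by FCM, might look roughly like the sketch below. It uses the standard FCM membership formula with fuzzifier `m`. The feature choice (speed, occupancy), the center values, the state labels, and the function name `fcm_memberships` are all assumptions made for illustration and do not come from the paper.

```python
import math


def fcm_memberships(sample, centers, m=2.0):
    """Standard FCM membership of one sample in each cluster, given fixed centers."""
    dists = [math.dist(sample, c) for c in centers]
    zero_hits = [d == 0.0 for d in dists]
    if any(zero_hits):
        # The sample coincides with one or more centers: share full membership among them.
        share = 1.0 / sum(zero_hits)
        return [share if hit else 0.0 for hit in zero_hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in dists) for di in dists]


# Hypothetical cluster centers mined from historical data: (speed km/h, occupancy %).
state_labels = ["free flow", "slow", "congested"]
state_centers = [(60.0, 10.0), (30.0, 40.0), (10.0, 80.0)]

# Hypothetical real-time measurement for one spatial unit.
realtime_sample = (28.0, 45.0)

memberships = fcm_memberships(realtime_sample, state_centers)
best_index = max(range(len(memberships)), key=lambda i: memberships[i])
current_state = state_labels[best_index]

print(current_state, [round(u, 3) for u in memberships])
```

Taking the state with the largest membership is one simple decision rule; keeping the full membership vector instead would preserve the fuzzy information for later spatial aggregation.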
20.
A conceptual problem that appears in different contexts of clustering analysis is that of measuring the degree of compatibility between two sequences of numbers. This problem is usually addressed by means of numerical indexes referred to as sequence correlation indexes. This paper elaborates on why some specific sequence correlation indexes may not be good choices depending on the application scenario in hand. A variant of the Product-Moment correlation coefficient and a weighted formulation for the Goodman-Kruskal and Kendall’s indexes are derived that may be more appropriate for some particular application scenarios. The proposed and existing indexes are analyzed from different perspectives, such as their sensitivity to the ranks and magnitudes of the sequences under evaluation, among other relevant aspects of the problem. The results help suggesting scenarios within the context of clustering analysis that are possibly more appropriate for the application of each index.