Similar Documents
20 similar documents retrieved (search time: 156 ms).
1.
Adaptive Affinity Propagation Clustering    Cited: 42 (self-citations: 4, citations by others: 42)
王开军  张军英  李丹  张新娜  郭涛 《自动化学报》2007,33(12):1242-1246
Affinity propagation clustering, which is well suited to problems with a large number of clusters, has two unresolved issues: it is difficult to determine which value of the preference parameter will make the algorithm produce the optimal clustering result, and when oscillations occur the algorithm cannot eliminate them automatically and converge. To solve these two problems, an adaptive affinity propagation clustering method is proposed. Its techniques include adaptively scanning the preference parameter space so as to search the space of cluster numbers for the optimal clustering result, adaptively adjusting the damping factor to eliminate oscillations, and an adaptive oscillation-escape technique for the cases in which adjusting the damping factor fails. Compared with the original algorithm, adaptive affinity propagation clustering performs better: it can eliminate oscillations automatically and find the optimal clustering result. Experimental results on synthetic and real data sets show that the adaptive method is very effective and that its clustering quality is better than, or no worse than, that of the original algorithm.
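For readers who want to experiment with the core idea of scanning the preference parameter, a minimal sketch follows; it uses scikit-learn's AffinityPropagation with a fixed high damping factor and a silhouette score to pick the best partition. The scan range, step count, and damping value are illustrative assumptions rather than the authors' adaptive scheme, and the oscillation-escape technique is not reproduced.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import silhouette_score
from sklearn.metrics.pairwise import euclidean_distances

def scan_preference_ap(X, n_steps=10):
    S = -euclidean_distances(X, squared=True)            # similarities used by AP
    p_med = np.median(S)                                 # common default preference
    best_labels, best_score = None, -1.0
    for p in np.linspace(2.0 * p_med, 0.5 * p_med, n_steps):    # sweep the preference space
        ap = AffinityPropagation(preference=p, damping=0.9,     # high damping to limit oscillation
                                 max_iter=500, random_state=0).fit(X)
        k = len(set(ap.labels_))
        if k < 2 or k >= len(X):                         # silhouette needs 2..n-1 clusters
            continue
        score = silhouette_score(X, ap.labels_)
        if score > best_score:
            best_labels, best_score = ap.labels_, score
    return best_labels, best_score
```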

2.
Application of Clustering Algorithms in Bank Customer Segmentation    Cited: 2 (self-citations: 0, citations by others: 2)
Given the wide practical use of clustering algorithms in the financial sector, three clustering algorithms, DBSCAN, K-means, and X-means, are compared on a bank customer data set in terms of execution efficiency, scalability, and outlier detection capability, and X-means is proposed for customer segmentation in banking. A bank customer segmentation model is built with the X-means algorithm, providing scientific decision support for bank decision makers.
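As a rough illustration of this kind of comparison, the sketch below times K-means and DBSCAN on synthetic data with scikit-learn (X-means is not part of scikit-learn and is omitted); the data set and parameters are stand-ins, not those used in the paper.

```python
import time
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import DBSCAN, KMeans

X, _ = make_blobs(n_samples=5000, centers=5, cluster_std=1.0, random_state=0)

for name, model in [("K-means", KMeans(n_clusters=5, n_init=10, random_state=0)),
                    ("DBSCAN",  DBSCAN(eps=0.8, min_samples=10))]:
    t0 = time.time()
    labels = model.fit_predict(X)
    elapsed = time.time() - t0
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)   # DBSCAN marks noise as -1
    print(f"{name}: {elapsed:.2f}s, {n_clusters} clusters, "
          f"{np.sum(labels == -1)} points flagged as noise/outliers")
```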

3.
Research on the Application of Clustering Algorithms in Telecom Customer Segmentation    Cited: 2 (self-citations: 0, citations by others: 2)
陈治平  胡宇舟  顾学道 《计算机应用》2007,27(10):2566-2569
Building on an analysis of clustering algorithms, an application model for telecom customer segmentation is proposed; the model has been applied in practice with good results and provides a basis for planning and designing telecom service products. In addition, by introducing a definition of indicator discrimination, a method for evaluating the effect of a clustering approach is given. Combining a telecom case study with an analysis of the results of the K-Means, SOM, and BIRCH clustering methods, the study concludes that K-Means has clear advantages for telecom customer market segmentation.

4.
Application of an Information-Entropy-Based Ant Colony Clustering Algorithm in Customer Segmentation    Cited: 1 (self-citations: 0, citations by others: 1)
Traditional ant colony clustering algorithms require setting many parameters and take a long time to cluster. The information-entropy-based ant colony clustering algorithm uses information entropy to change the rules by which ants pick up and drop data items, reducing the number of parameters to set and shortening the clustering time. The algorithm is applied to customer segmentation, and the resulting segments are compared with those obtained by the traditional ant colony clustering algorithm; experiments show that the information-entropy-based algorithm speeds up the clustering process of customer segmentation.

5.
Traditional ant colony clustering algorithms require setting many parameters and take a long time to cluster. The information-entropy-based ant colony clustering algorithm uses information entropy to change the rules by which ants pick up and drop data items, reducing the number of parameters to set and shortening the clustering time. The algorithm is applied to customer segmentation, and the resulting segments are compared with those obtained by the traditional ant colony clustering algorithm; experiments show that the information-entropy-based algorithm speeds up the clustering process of customer segmentation.

6.
The traditional k-means algorithm selects its initial cluster centers at random, which makes the clustering results unstable, and many of its optimized variants have high time complexity. To improve clustering stability while keeping the time complexity low, an improved k-means algorithm is proposed that adaptively selects excellent samples, based on individual silhouette coefficients, to determine the initial cluster centers. The algorithm first runs the traditional k-means algorithm several times; then, according to the individual silhouette coefficients of the k cluster centers and the distances between samples and cluster centers, it adaptively selects excellent samples and takes their mean as the initial cluster centers. Experiments on several UCI data sets show that the algorithm has short clustering time and achieves high silhouette coefficients and accuracy.
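A minimal sketch of this initialization idea, assuming scikit-learn, is given below: a single preliminary k-means run is made, the highest-silhouette samples of each cluster stand in for the paper's "excellent samples", and the retained fraction top_frac is an illustrative choice rather than the paper's adaptive selection rule.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples

def silhouette_refined_kmeans(X, k, top_frac=0.3, seed=0):
    # First pass: ordinary k-means.
    base = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    sil = silhouette_samples(X, base.labels_)            # per-sample silhouette coefficients
    centers = np.empty((k, X.shape[1]))
    for c in range(k):
        idx = np.where(base.labels_ == c)[0]
        n_keep = max(1, int(top_frac * len(idx)))
        keep = idx[np.argsort(sil[idx])[::-1][:n_keep]]  # best-separated samples of the cluster
        centers[c] = X[keep].mean(axis=0)                # their mean becomes the initial center
    # Second pass: k-means seeded with the refined centers.
    return KMeans(n_clusters=k, init=centers, n_init=1, random_state=seed).fit(X)
```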

7.
Because traditional clustering algorithms converge prematurely and have limited accuracy, they cannot handle the diversity, dynamics, and complexity of customers in mobile e-commerce. Building on a study of clustering algorithms typical of the customer segmentation field, a hybrid clustering algorithm, M-Cluster, is proposed that combines the strengths of different clustering algorithms. For the consumption patterns and clustering behaviour of student groups in mobile e-commerce, a CPM model based on the M-Cluster algorithm and fusing the advantages of the LTV and RFM models is constructed to evaluate and segment the customer base.

8.
This paper proposes a customer segmentation model based on fuzzy clustering with hybrid iteration of information entropy and the K-means algorithm, which solves the problem of initializing the prototypes in fuzzy clustering. Information entropy and the K-means algorithm are introduced into the fuzzy clustering analysis, and the approach is applied to a large sample of China Unicom customer data; compared with traditional methods, it achieves better results.

9.
This paper first analyses clustering algorithms and then, taking small and medium-sized wholesale businesses as an example, designs a customer segmentation model that reflects customer value and the quality of the customer relationship, and performs the actual mining with the K-Means clustering method. It shows that even when small and medium-sized enterprises cannot provide complete data, effective customer segmentation can still be achieved as long as a reasonable segmentation model is designed and a suitable algorithm is chosen.

10.
How to mine common customer demand information from massive sample data, help businesses identify potential customer groups, and improve the response efficiency of marketing activities is a focus and a difficulty of current market research at home and abroad. Taking new-product market positioning within new-product development planning as the research object, a mesh-based clustering algorithm (MCA) is proposed. The algorithm can dig deeply into market survey data for customers' latent needs, analyse a product's competitiveness from customer evaluations, support proactive marketing strategies, and continually discover new customer groups, thereby providing a scientific basis for enterprises to formulate new-product development strategies and position new products in the market.

11.
Clustering provides a knowledge acquisition method for intelligent systems. This paper proposes a novel data-clustering algorithm, combining a new initialization technique, the K-means algorithm, and a new gradual data transformation approach to provide more accurate clustering results than the K-means algorithm and its variants by increasing the clusters' coherence. The proposed data transformation approach solves the problem of generating empty clusters, which frequently occurs with other clustering algorithms. An efficient method based on the principal component transformation and a modified silhouette algorithm is also proposed to determine the number of clusters. Several different data sets are used to evaluate the efficacy of the proposed method in dealing with the empty cluster generation problem, and its accuracy and computational performance are compared with other K-means based initialization techniques and clustering methods. The developed estimation method for determining the number of clusters is also evaluated and compared with other estimation algorithms. The significance of the proposed method lies in addressing the limitations of K-means based clustering and improving clustering accuracy, an important concern in the field of data mining and expert systems. Applying the proposed method to knowledge acquisition in time series data such as wind, solar, electric load and stock market data provides a pre-processing tool to select the most appropriate data to feed into neural networks or other estimators used for forecasting such time series. In addition, utilization of the knowledge discovered by the proposed K-means clustering to develop rule-based expert systems is one of the main impacts of the proposed method.
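The cluster-number estimation part of such a pipeline can be sketched roughly as follows, assuming scikit-learn: a PCA projection followed by a silhouette criterion over candidate values of k. The plain silhouette score stands in for the paper's modified silhouette algorithm, and the variance threshold is an arbitrary choice.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def estimate_n_clusters(X, k_max=10, var_kept=0.95, seed=0):
    # Project onto principal components first, then score candidate cluster counts.
    Z = PCA(n_components=var_kept).fit_transform(X)
    scores = {}
    for k in range(2, k_max + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)
        scores[k] = silhouette_score(Z, labels)          # plain silhouette as a stand-in
    return max(scores, key=scores.get), scores
```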

12.
Iterative refinement clustering algorithms are widely used in the data mining area, but they are sensitive to initialization. In the past decades, many modified initialization methods have been proposed to reduce the influence of the initialization sensitivity problem. At their core, iterative refinement clustering algorithms are local search methods. The large number of local minimum points embedded in the search space makes the local search problem hard and sensitive to initialization; the fewer the local minimum points, the more robust a local search algorithm is to initialization. In this paper, we propose a Top-Down Clustering algorithm with Smoothing Search Space (TDCS3) to reduce the influence of initialization. The main steps of TDCS3 are to: (1) dynamically reconstruct a series of smoothed search spaces into a hierarchical structure by 'filling' the local minimum points; (2) at the top level of the hierarchical structure, run an existing iterative refinement clustering algorithm with random initialization to generate a clustering result; (3) from the second level down to the bottom level of the hierarchical structure, run the same clustering algorithm with the initialization derived from the previous level's clustering result. Experimental results on 3 synthetic and 10 real-world data sets show that TDCS3 is effective at finding better, more robust clustering results and reducing the impact of initialization.

13.
An Initialization-Independent Spectral Clustering Algorithm    Cited: 2 (self-citations: 0, citations by others: 2)
As a novel clustering algorithm, spectral clustering has received wide attention in the pattern recognition community in recent years. To address the sensitivity of traditional spectral clustering to the initial centers, an initialization-independent spectral clustering algorithm is proposed by introducing the k-harmonic means algorithm, which is insensitive to initial values. Experiments on synthetic and real data show that, compared with the traditional k-means, FCM, and EM algorithms, the improved algorithm significantly improves stability and clustering performance.
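scikit-learn's SpectralClustering has no k-harmonic means step, but its 'discretize' label-assignment option is likewise insensitive to the random seeding that the default 'kmeans' assignment depends on; the short comparison below only illustrates where initialization sensitivity enters spectral clustering and does not reproduce the paper's algorithm.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import SpectralClustering

X, _ = make_moons(n_samples=400, noise=0.06, random_state=0)

for assign in ("kmeans", "discretize"):   # 'kmeans' depends on its initialization, 'discretize' does not
    runs = [SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                               assign_labels=assign, random_state=s).fit_predict(X)
            for s in range(3)]
    stable = all(np.array_equal(runs[0], r) or np.array_equal(runs[0], 1 - r)
                 for r in runs)           # allow for label permutation in the 2-cluster case
    print(assign, "stable across seeds:", stable)
```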

14.
An Algorithm for Finding the Optimal Number of Clusters Using FCM    Cited: 2 (self-citations: 0, citations by others: 2)
In algorithms that use FCM to find the optimal number of clusters, the cluster centers must be re-initialized on every invocation of FCM, and because FCM is sensitive to the initial centers such algorithms are very unstable. This paper improves the algorithm by proposing a merge function so that the centers of the (c-1)-cluster partition are derived from the centers of the c-cluster partition. Simulation experiments show that the new algorithm is stable and runs noticeably faster than the original one.

15.
Clustering is a useful tool for finding structure in a data set. The mixture likelihood approach to clustering is a popular clustering method, for which the EM algorithm is the most commonly used method. However, the EM algorithm for Gaussian mixture models is quite sensitive to initial values, and the number of its components needs to be given a priori. To resolve these drawbacks of EM, we develop a robust EM clustering algorithm for Gaussian mixture models, first creating a new way to solve the initialization problems. We then construct a schema to automatically obtain an optimal number of clusters. The proposed robust EM algorithm is therefore robust to initialization and to differing cluster volumes, and it automatically obtains an optimal number of clusters. Several experimental examples are used to compare our robust EM algorithm with existing clustering methods. The results demonstrate the superiority and usefulness of the proposed method.
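A much simpler and commonly used stand-in for choosing the number of Gaussian components automatically is information-criterion model selection over EM restarts; the sketch below, using scikit-learn's GaussianMixture and BIC, shows that baseline only and is not the robust EM algorithm proposed in the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_with_bic(X, k_max=10, n_restarts=5, seed=0):
    best_model, best_bic = None, np.inf
    for k in range(1, k_max + 1):
        gm = GaussianMixture(n_components=k, n_init=n_restarts,   # restarts soften init sensitivity
                             covariance_type="full", random_state=seed).fit(X)
        bic = gm.bic(X)                                           # lower BIC is better
        if bic < best_bic:
            best_model, best_bic = gm, bic
    return best_model
```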

16.
Partitional clustering of categorical data is normally performed with the K-modes clustering algorithm, which works well for large datasets. Even though the design and implementation of the K-modes algorithm is simple and efficient, it has the pitfall of choosing the initial cluster centers at random on every new execution, which may lead to non-repeatable clustering results. This paper addresses the randomized center initialization problem of the K-modes algorithm by proposing a cluster center initialization algorithm. The proposed algorithm performs multiple clusterings of the data based on attribute values in different attributes and yields deterministic modes that are used as initial cluster centers. We propose a new method for selecting the most relevant attributes, namely Prominent attributes, compare it with an existing method for finding Significant attributes for unsupervised learning, and perform multiple clusterings of the data to find initial cluster centers. The proposed algorithm ensures fixed initial cluster centers and thus repeatable clustering results. The worst-case time complexity of the proposed algorithm is log-linear in the number of data objects. We evaluate the proposed algorithm on several categorical datasets, compare it against random initialization and two other initialization methods, and show that the proposed method performs better in terms of accuracy and time complexity. The initial cluster centers computed by the proposed approach are close to the actual cluster centers of the different data we tested, which leads to faster convergence of the K-modes clustering algorithm in conjunction with better clustering results.

17.
K-means type clustering algorithms for mixed data consisting of numeric and categorical attributes suffer from the cluster center initialization problem: the final clustering results depend upon the initial cluster centers. Random cluster center initialization is a popular initialization technique, but clustering results are not consistent across different cluster center initializations. The K-Harmonic means clustering algorithm tries to overcome this problem for pure numeric data. In this paper, we extend the K-Harmonic means clustering algorithm to mixed datasets. We propose a definition for a cluster center and a distance measure, and these are used with the cost function of K-Harmonic means clustering in the proposed algorithm. Experiments were carried out with pure categorical datasets and mixed datasets. Results suggest that the proposed clustering algorithm is quite insensitive to the cluster center initialization problem. Comparative studies with other clustering algorithms show that the proposed algorithm produces better clustering results.

18.
The leading partitional clustering technique, k-modes, is one of the most computationally efficient clustering methods for categorical data. However, the performance of the k-modes clustering algorithm, which converges to numerous local minima, strongly depends on the initial cluster centers. Currently, most cluster center initialization methods are designed for numerical data. Because categorical data lack geometry, these numerical initialization methods are not applicable to categorical data. This paper proposes a novel initialization method for categorical data, which is applied to the k-modes algorithm. The method integrates distance and density to select initial cluster centers and overcomes the shortcomings of existing initialization methods for categorical data. Experimental results illustrate that the proposed initialization method is effective and can be applied to large data sets, owing to its linear time complexity with respect to the number of data objects.
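For experimentation, the third-party Python package kmodes (assumed installed) ships a deterministic k-modes initialization of this density-and-distance flavour under init='Cao'; the toy data below are made up and the snippet illustrates deterministic seeding only, not the paper's exact method.

```python
import numpy as np
from kmodes.kmodes import KModes   # third-party package, assumed installed

# Toy categorical data, purely illustrative.
X = np.array([["a", "x", "low"],
              ["a", "y", "low"],
              ["b", "y", "high"],
              ["b", "x", "high"]])

km = KModes(n_clusters=2, init="Cao", n_init=1, verbose=0)   # deterministic, density-based seeding
labels = km.fit_predict(X)
print(labels)
print(km.cluster_centroids_)
```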

19.
This paper describes a new soft clustering algorithm in which each cluster is modelled by a one-class support vector machine (OC-SVM). The proposed algorithm extends a previously proposed hard clustering algorithm, also based on an OC-SVM representation of clusters. The key building block of our method is the weighted OC-SVM (WOC-SVM), a novel tool introduced in this paper, based on which an expectation-maximization-type soft clustering algorithm is defined. A deterministic annealing version of the algorithm is also introduced and shown to improve robustness with respect to initialization. Experimental results show that the proposed soft clustering algorithm outperforms its hard clustering counterpart, notably in terms of robustness with respect to initialization, as well as several other state-of-the-art methods.

20.
To address the sensitivity of K-hubs, a hub-based clustering algorithm, to the initial cluster centers, a hub-based strategy for selecting the initial centers is proposed. The strategy exploits the hubness phenomenon that is common in high-dimensional data and selects the K hub points that are farthest apart from each other as the initial cluster centers. Experiments show that, compared with the original K-hubs algorithm using random initial centers, the K-hubs algorithm with this strategy starts from a better distribution of initial centers and achieves higher clustering accuracy; moreover, the selected initial centers tend to lie close to the final cluster centers, which helps the algorithm converge faster.
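A rough sketch of the selection strategy follows: hubness is estimated from k-nearest-neighbour occurrence counts and K mutually distant hub points are taken as initial centers. Standard k-means from scikit-learn is used here in place of the K-hubs algorithm, and the neighbourhood size and hub-pool size are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors

def hub_seeded_kmeans(X, k, n_neighbors=10, hub_pool=50, seed=0):
    # Hubness: how often each point appears in other points' k-nearest-neighbour lists.
    _, idx = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X).kneighbors(X)
    counts = np.bincount(idx[:, 1:].ravel(), minlength=len(X))   # drop self-neighbours
    hubs = np.argsort(counts)[::-1][:min(hub_pool, len(X))]      # strongest hub points
    # Greedily pick k hubs that are as far apart from each other as possible.
    chosen = [hubs[0]]
    while len(chosen) < k:
        d = np.linalg.norm(X[hubs][:, None, :] - X[chosen][None, :, :], axis=2).min(axis=1)
        chosen.append(hubs[int(np.argmax(d))])
    centers = X[np.array(chosen)]
    return KMeans(n_clusters=k, init=centers, n_init=1, random_state=seed).fit(X)
```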
