
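Research on the Optimization and Parallelization of the BIRCH Clustering Algorithm
Cite this article: ZHU Ying-hui (朱映辉), JIANG Yu-zhen (江玉珍). Research on the Optimization and Parallelization of the BIRCH Clustering Algorithm [J]. Computer Engineering and Design (计算机工程与设计), 2007, 28(18): 4345-4346, 4369
Authors: ZHU Ying-hui (朱映辉)  JIANG Yu-zhen (江玉珍)
Affiliation: College of Mathematics and Information Technology, Hanshan Normal University, Chaozhou, Guangdong 521041, China (both authors)
Abstract:
To improve clustering quality, and to address the limited clustering precision of the BIRCH algorithm, this paper proposes that different clusters in the clustering feature tree (CF-tree) should use different thresholds. This noticeably improves results on clusters whose sizes differ greatly, which the original algorithm clusters poorly. The paper also studies and analyzes in depth how to cluster quickly on a computer cluster system, and proposes several improvements: custom data types, data parallelism, and a non-uniform data partitioning strategy. Finally, experimental results show that the improvements achieve good running time and speedup.

Keywords: computer cluster  data mining  clustering  clustering quality  parallelization
Article ID: 1000-7024(2007)18-4345-02
Revision received: 2006-10-10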
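
The per-cluster threshold idea in the abstract can be illustrated with a minimal Python sketch. It assumes the standard BIRCH clustering feature CF = (N, LS, SS) and the usual radius test for absorbing a point. The class `ClusteringFeature`, its `threshold` field, and the method names are hypothetical, and the abstract does not say how the paper chooses each cluster's threshold, so the sketch simply takes it as an input.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClusteringFeature:
    """BIRCH clustering feature CF = (N, LS, SS) carrying its own threshold.

    Assumption: each cluster stores a private radius limit instead of
    sharing one global threshold. How that limit is chosen is not
    specified in the abstract.
    """
    threshold: float                          # hypothetical per-cluster radius limit
    n: int = 0                                # N: number of points absorbed
    ls: list = field(default_factory=list)    # LS: per-dimension linear sum
    ss: float = 0.0                           # SS: sum of squared norms

    def _merged(self, point):
        """Return (N, LS, SS) as they would be after adding `point`."""
        ls = [a + b for a, b in zip(self.ls, point)] if self.ls else list(point)
        ss = self.ss + sum(x * x for x in point)
        return self.n + 1, ls, ss

    def radius_if_added(self, point):
        """Cluster radius R after a tentative insert of `point`."""
        n, ls, ss = self._merged(point)
        # R^2 = SS/N - ||LS/N||^2; clamp tiny negatives caused by rounding.
        r2 = ss / n - sum((v / n) ** 2 for v in ls)
        return math.sqrt(max(r2, 0.0))

    def try_absorb(self, point):
        """Absorb `point` only if the radius stays within this cluster's threshold."""
        if self.radius_if_added(point) > self.threshold:
            return False
        self.n, self.ls, self.ss = self._merged(point)
        return True
```

With these definitions, a cluster created with threshold 2.0 absorbs the points (0, 0) and (3, 0), whose combined radius is 1.5, while a cluster created with threshold 0.5 absorbs (0, 0) and (0.2, 0) but rejects (3, 0). A single global threshold cannot express both behaviors at once, which is the limitation the abstract describes for clusters of very different size.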

Research of BIRCH clustering algorithm optimization and parallelism
ZHU Ying-hui,JIANG Yu-zhen. Research of BIRCH clustering algorithm optimization and parallelism[J]. Computer Engineering and Design, 2007, 28(18): 4345-4346,4369
Authors:ZHU Ying-hui  JIANG Yu-zhen
Affiliation:College of Mathematics and Information Technology, Hanshan Normal University, Chaozhou 521041, China
Abstract:
To improve clustering quality and address the limited clustering precision of the BIRCH algorithm, this paper proposes using a different threshold for each cluster in the CF-tree, which improves the handling of clusters that differ greatly in size. It then studies in depth how to accelerate clustering on a cluster system and proposes several improvements: custom datatypes, data parallelism, and an asymmetric (non-uniform) data-partition strategy. Finally, experimental results show that the improved algorithm achieves good running time and speedup.
Keywords:cluster  data mining  clustering  quality of clustering  parallelism
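
As a rough illustration of the asymmetric data partition mentioned in the abstract, the following Python sketch splits a dataset across nodes in proportion to a per-node weight. The abstract does not state the paper's actual partitioning rule, so proportional splitting is an assumption, and the names `partition_sizes`, `weights`, and `speedup` are hypothetical. Speedup is computed as the usual ratio of serial to parallel running time, the metric the Chinese abstract reports alongside running time.

```python
def partition_sizes(n_items, weights):
    """Split n_items across nodes in proportion to their weights.

    Assumption: `weights` are relative node capacities (for example,
    measured throughput). Uses largest-remainder rounding so that the
    chunk sizes always sum to exactly n_items.
    """
    total = sum(weights)
    exact = [n_items * w / total for w in weights]
    sizes = [int(e) for e in exact]            # floor of each exact share
    leftover = n_items - sum(sizes)
    # Hand the remaining items to the nodes with the largest fractional parts.
    order = sorted(range(len(weights)),
                   key=lambda i: exact[i] - sizes[i],
                   reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes


def speedup(t_serial, t_parallel):
    """Speedup = serial running time / parallel running time."""
    return t_serial / t_parallel
```

For example, `partition_sizes(1000, [1, 1, 2])` gives the node with twice the weight half of the data, so faster nodes are not left idle waiting for slower ones.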
This article is indexed in databases including CNKI, VIP (维普), and Wanfang Data (万方数据).