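The WEKA Data Mining Platform and Its Secondary Development
Cite this article: CHEN Hui-ping, LIN Li-li, WANG Jian-dong, MIAO Xin-rui. The WEKA data mining platform and its secondary development[J]. Computer Engineering and Applications, 2008, 44(19): 76-79. DOI: 10.3778/j.issn.1002-8331.2008.19.022
Authors: CHEN Hui-ping  LIN Li-li  WANG Jian-dong  MIAO Xin-rui
Affiliations: 1. College of Computer and Information Engineering, Hohai University, Changzhou, Jiangsu 213022, China 2. College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Funding: National Basic Research Program of China (973 Program)
Abstract: Mining tests and analysis were carried out on the open-source data mining platform WEKA, and the platform's main problems were analyzed. To overcome the weakness of the WEKA system in clustering, secondary development was carried out in WEKA's open-source environment to extend its clustering algorithms. The process of embedding the k-medoids substitution algorithm into the WEKA platform is described; the process makes full use of the classes and visualization functions of open-source WEKA, and the embedded algorithm is compared with the existing clustering algorithms. The algorithm improves on the traditional k-medoids algorithm by avoiding local optima; it is also not very sensitive to the initial points and can achieve better clustering results.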

Keywords: data mining  WEKA platform  clustering  k-medoids substitution algorithm
Received: 2007-09-03
Revised: 2008-02-22

Data mining platform-WEKA and secondary development on WEKA
CHEN Hui-ping, LIN Li-li, WANG Jian-dong, MIAO Xin-rui. Data mining platform-WEKA and secondary development on WEKA[J]. Computer Engineering and Applications, 2008, 44(19): 76-79. DOI: 10.3778/j.issn.1002-8331.2008.19.022
Authors:CHEN Hui-ping  LIN Li-li  WANG Jian-dong  MIAO Xin-rui
Affiliation: 1. Computer & Information Engineering College, Hohai University, Changzhou, Jiangsu 213022, China 2. College of Information Science & Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Abstract: The paper runs data mining tests on WEKA, an open-source data mining tool, analyzes the test results, and identifies the main problems of the WEKA system. To overcome WEKA's weakness in clustering, the paper carries out secondary development on the WEKA platform to extend its clustering algorithms. It describes the process of embedding the k-medoids substitution method into WEKA, making full use of the classes and visualization functions of open-source WEKA, and compares the embedded algorithm with WEKA's original clustering algorithms. The k-medoids substitution method improves on the traditional k-medoids method by keeping it from getting trapped in a local optimum. Moreover, the method is not very sensitive to the initial points and obtains better clustering results.
Keywords:data mining  WEKA platform  clustering  k-medoids substitution algorithm
This article is indexed in Wanfang Data and other databases.
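
For readers unfamiliar with k-medoids, the following is a minimal sketch of a generic swap-based (PAM-style) k-medoids procedure. It is not the paper's k-medoids substitution algorithm, whose details the abstract does not give, and it does not use the WEKA API. The function names `total_cost`, `k_medoids_swap`, and `euclidean`, the `seed` parameter, and the sample points are all assumptions made for illustration.

```python
import random


def total_cost(points, medoids, dist):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(dist(p, points[m]) for m in medoids) for p in points)


def k_medoids_swap(points, k, dist, seed=0):
    """Generic PAM-style k-medoids (illustrative, not the paper's method).

    Starts from k random medoids, then repeatedly applies the single
    medoid/non-medoid swap that lowers the total cost the most, and
    stops when no swap lowers the cost.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    cost = total_cost(points, medoids, dist)

    while True:
        best_cost, best_medoids = cost, None
        for i in range(k):
            for candidate in range(len(points)):
                if candidate in medoids:
                    continue
                # Replace the i-th medoid with the candidate point.
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                trial_cost = total_cost(points, trial, dist)
                if trial_cost < best_cost:
                    best_cost, best_medoids = trial_cost, trial
        if best_medoids is None:
            break  # no swap improves the cost
        cost, medoids = best_cost, best_medoids

    # Assign each point to the position (0..k-1) of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: dist(p, points[medoids[j]]))
        for p in points
    ]
    return medoids, labels, cost


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical toy data: two well-separated groups of three points.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
medoids, labels, cost = k_medoids_swap(points, k=2, dist=euclidean)
print(medoids, labels, cost)
```

Because each accepted swap strictly lowers the total cost, the loop terminates, but in general it stops at a local optimum that depends on the initial medoids. That dependence is the weakness of traditional k-medoids that the abstract says the embedded algorithm is meant to reduce.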