
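Study on Distributed Clustering Analysis Processing Strategy Based on P2P in EFCS-Grid
Cite this article: SHEN De-rong, JIANG An-qi, WANG Guang-qi, YANG Bing-heng, YU Ge. Study on Distributed Clustering Analysis Processing Strategy Based on P2P in EFCS-Grid [J]. Mini-micro Systems (小型微型计算机系统), 2007, 28(8): 1419-1422.
Authors: SHEN De-rong  JIANG An-qi  WANG Guang-qi  YANG Bing-heng  YU Ge
Affiliation: School of Information Science and Engineering, Northeastern University, Shenyang, Liaoning 110004
Funding: National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program)
Abstract: Data visualization based on data mining is an effective way to present large volumes of data to users. EFCS-Grid clusters data with a k-means algorithm over specified attributes and then presents the clustering results to users. Through experiments, this paper measures and analyzes the time cost of server-side clustering analysis under multiple users, as well as the total data-processing time of the EFCS-Grid system under different loads. The results show that clustering analysis accounts for a significant share of the system's data-processing time and that system performance degrades sharply as the data volume and the number of concurrent users grow. The paper therefore combines the P2P architecture with a distributed clustering analysis strategy that divides data processing into a data integration layer and a data analysis layer. The data integration layer consolidates the data and ensures that the integrated data satisfy the user's schema requirements; primary and secondary clustering analysis are then performed on data sharing the same schema. By exploiting the distributed computing capability of P2P, the strategy relieves the centralized processing bottleneck and improves the efficiency of data processing within the grid.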

Keywords: distributed clustering analysis; visualization; database grid
Article ID: 1000-1220(2007)08-1419-04
Revised: 2006-05-24

Study on Distributed Clustering Analysis Processing Strategy Based on P2P in EFCS-Grid
SHEN De-rong,JIANG An-qi,WANG Guang-qi,YANG Bing-heng,YU Ge.Study on Distributed Clustering Analysis Processing Strategy Based on P2P in EFCS-Grid[J].Mini-micro Systems,2007,28(8):1419-1422.
Authors:SHEN De-rong  JIANG An-qi  WANG Guang-qi  YANG Bing-heng  YU Ge
Abstract:Data visualization based on data mining is an effective way to present large amounts of data to users. In EFCS-Grid, data are clustered with the k-means algorithm on given attributes, and the results are then presented to users. This paper tests and analyzes the time cost of clustering analysis performed on a centralized server under multiple users, as well as the total time cost of data processing in EFCS-Grid under different loads. It concludes that clustering analysis accounts for a major share of the data-processing time in EFCS-Grid and that the performance of the grid system drops sharply as the amount of data processed increases. A distributed clustering analysis strategy based on the P2P structure is therefore proposed, comprising a data integration layer and a data clustering layer. The former integrates partial data into a complete schema on user demand; two passes of clustering analysis are then performed, with the aim of easing the bottleneck of centralized clustering analysis and improving the efficiency of data clustering by exploiting the distributed computing capability of the P2P structure.
Keywords:P2P
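
The abstract does not say how the two clustering passes are implemented. One plausible reading, sketched here purely as an illustration, is that each peer first runs k-means on its own data and reports only its local centroids and cluster sizes, after which a second, weighted k-means merges those summaries. The function names (`kmeans`, `two_pass_clustering`), the parameters `k_local` and `k_global`, the evenly spaced initialisation, and the in-process loop standing in for P2P communication are all assumptions and are not taken from the paper.

```python
from math import dist  # available in Python 3.8+


def kmeans(points, k, weights=None, iters=50):
    """Weighted k-means over a list of equal-length tuples.

    Returns (centroid, total_weight) pairs for the non-empty clusters.
    """
    if not points:
        return []
    if weights is None:
        weights = [1.0] * len(points)
    # Assumed initialisation: k distinct points spaced evenly through the
    # sorted input, so that the sketch is deterministic.
    ordered = sorted(set(points))
    k = min(k, len(ordered))
    step = len(ordered) / k
    centroids = [ordered[int(i * step)] for i in range(k)]
    dims = len(points[0])
    mass = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dims for _ in range(k)]
        mass = [0.0] * k
        # Assignment step: each point joins its nearest centroid.
        for p, w in zip(points, weights):
            j = min(range(k), key=lambda c: dist(p, centroids[c]))
            mass[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Update step: each centroid becomes the weighted mean of its points.
        updated = []
        for j in range(k):
            if mass[j] > 0:
                updated.append(tuple(s / mass[j] for s in sums[j]))
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return [(c, m) for c, m in zip(centroids, mass) if m > 0]


def two_pass_clustering(peer_data, k_local, k_global):
    """Hypothetical two-pass scheme: local clustering, then a weighted merge."""
    # Pass 1: every peer clusters only the data it holds.
    summaries = []
    for points in peer_data:
        summaries.extend(kmeans(points, k_local))
    # Pass 2: cluster the local centroids, weighted by local cluster size.
    centroids = [c for c, _ in summaries]
    sizes = [m for _, m in summaries]
    return kmeans(centroids, k_global, weights=sizes)


if __name__ == "__main__":
    peers = [
        [(1.0,), (2.0,), (10.0,), (11.0,)],
        [(1.5,), (10.5,), (12.0,)],
    ]
    print(two_pass_clustering(peers, k_local=2, k_global=2))
```

In this sketch only the small list of `(centroid, size)` summaries leaves each peer, which is the property that would relieve a central server; whether EFCS-Grid exchanges summaries in this form is not stated in the abstract.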
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.