Research and Implementation of Scalable Bi-clustering Algorithm in Coal Area (可扩展双向聚类算法在煤炭领域中的研究与实现)
Cite this article: WEI Ling, LIU Yun-peng, XING Ji-xin. Research and Implementation of Scalable Bi-clustering Algorithm in Coal Area[J]. Coal Technology (煤炭技术), 2013(5): 195-198.
Authors: WEI Ling (魏玲), LIU Yun-peng (刘运朋), XING Ji-xin (邢继昕)
Affiliation: Langfang Polytechnic Institute (廊坊职业技术学院), Langfang 065000, China
Abstract: With the development of modern information technology, every industry now produces large volumes of high-dimensional data, in which each record is described by many different attributes. The coal industry is no exception. Managers of coal enterprises typically want to extract more hidden, valuable knowledge from this massive high-dimensional data. Bi-clustering (co-clustering) algorithms are widely applied across many fields and can cluster data accurately. However, as data volume grows exponentially and dimensionality increases, traditional bi-clustering algorithms can neither complete clustering quickly nor handle high-dimensional data effectively. Targeting massive high-dimensional data, this paper proposes a scalable, distributed bi-clustering algorithm for high-dimensional data. Experimental results show that the proposed algorithm produces good clustering results and achieves high speedup and good scalability.

Keywords: scalability; bi-clustering; cloud platform; high-dimensional data; massive data
This article is indexed in CNKI and other databases.