
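An Algorithm for Mining Global Closed Frequent Itemsets Based on Distributed Frequent Concept Direct Product
Cite this article: CHAI Yu-Mei, ZHANG Zhuo, WANG Li-Ming. An Algorithm for Mining Global Closed Frequent Itemsets Based on Distributed Frequent Concept Direct Product[J]. Chinese Journal of Computers, 2012, 35(5): 990-1001.
Authors: CHAI Yu-Mei  ZHANG Zhuo  WANG Li-Ming
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450001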
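Abstract: Centralized data mining algorithms based on concept lattices cannot make full use of distributed computing resources to improve the efficiency of concept lattice construction, which degrades the performance of the mining algorithms. This paper further analyzes the inherent parallelism of the apposition assembly of Iceberg concept lattices. Taking the frequent concept direct product and its lower cover as the minimum granularity, it decomposes the apposition assembly process of Iceberg concept lattices and computes it in a distributed manner. On the basis of a theoretical proof of correctness, it proposes a novel algorithm for globally mining closed frequent itemsets in a heterogeneous distributed environment. The algorithm exploits the semi-lattice and apposition assembly properties of Iceberg concept lattices and makes full use of the computing resources of the distributed environment. Experiments show that the algorithm performs well on both dense and sparse data sets.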

Keywords: Iceberg concept lattice  distributed data mining  apposition assembly  heterogeneous database  closed frequent itemsets

An Algorithm for Mining Global Closed Frequent Itemsets Based on Distributed Frequent Concept Direct Product
CHAI Yu-Mei, ZHANG Zhuo, WANG Li-Ming. An Algorithm for Mining Global Closed Frequent Itemsets Based on Distributed Frequent Concept Direct Product[J]. Chinese Journal of Computers, 2012, 35(5): 990-1001.
Authors: CHAI Yu-Mei  ZHANG Zhuo  WANG Li-Ming
Affiliation: School of Information Engineering, Zhengzhou University, Zhengzhou 450001
Abstract: With the widespread adoption of distributed computing environments, traditional centralized data mining algorithms based on concept lattices cannot take full advantage of distributed computing resources to improve the efficiency of concept lattice construction, which in turn limits the performance of the mining algorithms. In this paper, we first further analyze the inherent parallelism of the apposition assembly of Iceberg concept lattices. Second, we take the frequent concept direct product together with its lower cover as the minimal computing unit, so that these units can be scattered across sites, processed in a distributed manner, and finally integrated into a global Iceberg concept lattice. The distributed assembly procedure is proved correct theoretically. On this basis, a new algorithm is proposed for mining global closed frequent itemsets in heterogeneous distributed computing environments. The algorithm exploits the semi-lattice structure of the Iceberg concept lattice and its support for apposition assembly, and can therefore make full use of the computing resources of the distributed environment. Experiments show that the algorithm achieves good performance on both dense and sparse heterogeneous distributed data sets.
Keywords: Iceberg concept lattice  distributed data mining  apposition assembly  heterogeneous data scenario  closed frequent itemsets
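
The idea behind apposition assembly can be illustrated with a small brute-force sketch. It is not the algorithm from the paper, which works on frequent concept direct products and their lower covers. It only demonstrates the underlying lattice fact that, when two sites hold different attributes for the same transactions, every extent of the combined context is an intersection of one extent from each site. The sketch assumes that "heterogeneous" means this kind of vertical partitioning. The data and all function names (`extent`, `intent`, `frequent_extents`, `global_closed_frequent`, `brute_force`) are hypothetical.

```python
from itertools import combinations


def extent(context, itemset):
    """Transaction ids whose rows contain every item of `itemset`."""
    return frozenset(g for g, items in context.items() if itemset <= items)


def intent(context, objects, all_items):
    """Items shared by every transaction in `objects`."""
    common = set(all_items)
    for g in objects:
        common &= context[g]
    return frozenset(common)


def frequent_extents(context, min_support):
    """Extents of one site's Iceberg lattice (brute force).

    Every extent is an intersection of attribute columns, so we close the
    set {all objects} under intersection with each column, then filter.
    """
    extents = {frozenset(context)}
    for m in set().union(*context.values()):
        col = extent(context, frozenset([m]))
        extents |= {e & col for e in extents}
    return {e for e in extents if len(e) >= min_support}


def global_closed_frequent(site1, site2, min_support):
    """Closed frequent itemsets of the apposition of two sites.

    Assumption: both sites describe the same transaction ids with
    disjoint attribute sets. Infrequent local extents can be dropped,
    because intersecting never increases support.
    """
    items1 = set().union(*site1.values())
    items2 = set().union(*site2.values())
    ext2 = frequent_extents(site2, min_support)
    result = {}
    for a1 in frequent_extents(site1, min_support):
        for a2 in ext2:
            a = a1 & a2  # candidate global extent
            if len(a) >= min_support:
                closed = intent(site1, a, items1) | intent(site2, a, items2)
                result[closed] = len(a)
    return result


def brute_force(merged, min_support):
    """Centralized reference: close every frequent itemset directly."""
    all_items = sorted(set().union(*merged.values()))
    result = {}
    for k in range(len(all_items) + 1):
        for combo in combinations(all_items, k):
            ext = extent(merged, frozenset(combo))
            if len(ext) >= min_support:
                result[intent(merged, ext, all_items)] = len(ext)
    return result


# Hypothetical toy data: same four transactions, different attributes per site.
site1 = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b"}, 4: {"b"}}
site2 = {1: {"x"}, 2: {"x", "y"}, 3: {"x"}, 4: {"y"}}
merged = {g: site1[g] | site2[g] for g in site1}

result = global_closed_frequent(site1, site2, min_support=2)
reference = brute_force(merged, min_support=2)
print(result == reference)
for itemset, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

Each site only needs to publish its frequent extents and local intents, which is what makes the pairwise combination step a natural unit of distributed work. The sketch includes the empty itemset whenever it is closed, and it enumerates all pairs of local extents, so it is suitable only for very small examples.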
This article is indexed in CNKI, Wanfang Data, and other databases.