
A Closed Frequent Itemset Mining Algorithm Based on MapReduce

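Cite this article:	CHEN Guang-Peng, YANG Yu-Bin, GAO Yang, SHANG Lin. A Closed Frequent Itemset Mining Algorithm Based on MapReduce[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(2): 220-224.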

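Authors:	CHEN Guang-Peng, YANG Yu-Bin, GAO Yang, SHANG Lin

Affiliation:	State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093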

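Funding:	Supported by the National Natural Science Foundation of China (No. 61035003, 60875011, 60721002); the National 973 Program of China (No. 2010CB327903); the International Science and Technology Cooperation Program of the Ministry of Science and Technology (No. 2010DFA11030); and the Natural Science Foundation of Jiangsu Province (No. BK2010054)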

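Abstract:	Mining closed frequent itemsets is an effective way to discover association rules among data items. Cloud computing platforms built on the MapReduce model offer a new approach to association rule mining over massive data. This paper proposes and implements a parallel algorithm for mining closed frequent itemsets on the Hadoop cloud computing platform. The algorithm consists of four main steps: parallel counting, construction of the global frequent item list, parallel mining of local closed frequent itemsets, and parallel filtering of global closed frequent itemsets. Experiments on several datasets show that the method substantially improves mining efficiency and achieves a good speedup.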
Keywords:	cloud computing; parallel algorithm; data mining; closed frequent itemset; MapReduce
Received:	2011-02-14
Closed Frequent Itemset Mining Based on MapReduce

CHEN Guang-Peng, YANG Yu-Bin, GAO Yang, SHANG Lin. Closed Frequent Itemset Mining Based on MapReduce[J]. Pattern Recognition and Artificial Intelligence, 2012, 25(2): 220-224.

Authors:	CHEN Guang-Peng, YANG Yu-Bin, GAO Yang, SHANG Lin

Affiliation:	State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093

Abstract:	Closed frequent itemset mining is a useful way of discovering association rules from data. Cloud computing infrastructure based on MapReduce provides a promising way to mine association rules from massive data. A parallel algorithm for mining closed frequent itemsets is presented, based on the Hadoop cloud computing platform. The method consists of four steps: parallel counting, global F-List construction, parallel mining of local closed frequent itemsets, and parallel filtering of global closed frequent itemsets. Experimental results on multiple datasets validate the method and show that it is effective, with a satisfactory speedup.

Keywords:	Cloud Computing; Parallel Algorithm; Data Mining; Closed Frequent Itemset; MapReduce
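
The four steps named in the abstract can be illustrated with a small single-process sketch. The abstract does not give the paper's implementation details, so everything below is an assumption: the function names are hypothetical, the shards stand in for Hadoop input splits, the per-item prefix projection is borrowed from the general style of parallel FP-growth, and the local miner is a brute-force intersection closure rather than whatever miner the authors used.

```python
from collections import Counter, defaultdict
from itertools import chain


def count_shard(shard):
    """Step 1 (map side): count in how many transactions each item occurs."""
    return Counter(item for txn in shard for item in set(txn))


def build_flist(shards, min_sup):
    """Steps 1-2: sum shard counts, keep frequent items, rank by frequency."""
    total = sum((count_shard(s) for s in shards), Counter())
    frequent = [(item, n) for item, n in total.items() if n >= min_sup]
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return {item: rank for rank, (item, _) in enumerate(frequent)}


def project(shard, flist):
    """Step 3 (map side): emit (item, prefix ending at that item) pairs.

    Assumed projection scheme: each transaction is reduced to its frequent
    items, ordered by F-List rank, and every prefix is routed to the group
    of its last item.
    """
    for txn in shard:
        ordered = sorted((i for i in set(txn) if i in flist), key=flist.get)
        for pos, item in enumerate(ordered):
            yield item, frozenset(ordered[:pos + 1])


def mine_local_closed(prefixes, min_sup):
    """Step 3 (reduce side): closed itemsets of one projected database.

    Brute force: closed itemsets are exactly the intersections of
    transaction sets, so build them incrementally and count support.
    """
    closed = set()
    for txn in prefixes:
        closed |= {txn} | {c & txn for c in closed}
    result = {}
    for itemset in closed:
        support = sum(1 for txn in prefixes if itemset <= txn)
        if support >= min_sup:
            result[itemset] = support
    return result


def filter_global(candidates):
    """Step 4: drop candidates that have an equal-support proper superset."""
    by_support = defaultdict(list)
    for itemset, support in candidates.items():
        by_support[support].append(itemset)
    return {
        itemset: support
        for support, group in by_support.items()
        for itemset in group
        if not any(itemset < other for other in group)
    }


def closed_frequent_itemsets(shards, min_sup):
    """Run the four steps in sequence; a dict stands in for the shuffle."""
    flist = build_flist(shards, min_sup)
    projected = defaultdict(list)
    for item, prefix in chain.from_iterable(project(s, flist) for s in shards):
        projected[item].append(prefix)
    candidates = {}
    for prefixes in projected.values():
        candidates.update(mine_local_closed(prefixes, min_sup))
    return filter_global(candidates)
```

In this sketch, `closed_frequent_itemsets(shards, min_sup)` takes a list of shards, each a list of transactions, and returns a dict mapping each closed frequent itemset (a `frozenset`) to its support. The final filter is needed because an itemset that is closed inside one item's projected database may still have an equal-support superset that is only visible in another item's group.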
This article is indexed in CNKI, Wanfang Data, and other databases.