
Parallel Association Rules Mining Algorithm Based on MapReduce: A Survey
Cite this article: XIAO Wen, HU Juan, ZHOU Xiaofeng. Parallel Association Rules Mining Algorithm Based on MapReduce: A Survey[J]. Application Research of Computers, 2018, 35(1).
Authors: XIAO Wen, HU Juan, ZHOU Xiaofeng
Affiliation: Department of Electrical and Information Engineering, Wentian College, Hohai University (XIAO Wen, HU Juan); College of Computer and Information, Hohai University (ZHOU Xiaofeng)
Abstract: Association rule mining is one of the most common and important data mining tasks; classic association rule mining algorithms include Apriori, FP-Growth, and Eclat. With the explosive growth of data, traditional algorithms can no longer meet the needs of big data mining, and distributed, parallel association rule mining algorithms are needed to address this problem. MapReduce is a popular distributed parallel computing model that has been widely adopted for its ease of use, good scalability, automatic load balancing, and automatic fault tolerance. This paper classifies and reviews existing parallel association rule mining algorithms based on the MapReduce computing model, summarizes their respective advantages, disadvantages, and scope of application, and discusses directions for future research.

Keywords: data mining; association rule mining; frequent itemset; parallel; MapReduce; Hadoop
Received: 2016/12/13
Revised: 2017/11/14
