Improved Apriori Algorithm Based on Bigtable and MapReduce
Cite this article: WEI Ling, WEI Yong-jiang, GAO Chang-yuan. Improved Apriori Algorithm Based on Bigtable and MapReduce[J]. Computer Science, 2015, 42(10): 208-210, 243.
Authors: WEI Ling, WEI Yong-jiang, GAO Chang-yuan
Affiliation: Key Laboratory of Management Information System, Harbin University of Science and Technology, Harbin 150000, China (all three authors)
Funding: This work was supported by the National Natural Science Foundation of China.
Abstract: To improve the efficiency of the Apriori algorithm in mining frequent itemsets, Bigtable and the MapReduce model are combined to optimize Apriori, yielding BM-Apriori, a new algorithm for mining frequent itemsets in big-data environments. Compared with improved Apriori algorithms based on the MapReduce model alone, BM-Apriori uses the timestamp attribute of Bigtable instead of generating a large number of key/value pairs, so the database needs to be scanned only once and pattern-matching time is saved. In addition, BM-Apriori adds a transaction-ID column to the itemset list, so that transaction IDs are obtained automatically for computing support counts. BM-Apriori was run on the Hadoop platform, and the experimental results show that incorporating Bigtable gives the algorithm higher efficiency and scalability.

Keywords: Apriori algorithm; Bigtable; MapReduce; big data
Received: 2014/10/15
Revised: 2014/12/25
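
The abstract's idea of storing transaction IDs alongside each itemset, so that support can be computed after a single scan of the database, can be illustrated with a minimal single-machine sketch. This is not the authors' BM-Apriori implementation: it uses no Bigtable, MapReduce, or timestamps, and an in-memory dictionary stands in for the itemset list. The function names `build_tid_lists` and `frequent_itemsets` and the sample data are assumptions made for illustration only.

```python
from itertools import combinations


def build_tid_lists(transactions):
    """Scan the transactions once; map each single item to the set of
    transaction IDs (list positions) that contain it."""
    tid_lists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tid_lists.setdefault(frozenset([item]), set()).add(tid)
    return tid_lists


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset whose support
    count is at least min_support (hypothetical helper, not from the paper)."""
    # Level 1: frequent single items with their transaction-ID sets.
    current = {
        itemset: tids
        for itemset, tids in build_tid_lists(transactions).items()
        if len(tids) >= min_support
    }
    result = {}
    while current:
        result.update({itemset: len(tids) for itemset, tids in current.items()})
        next_level = {}
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            candidate = a | b
            # Join only pairs that differ in exactly one item.
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Apriori pruning: every subset one item smaller must be frequent.
            if any(frozenset(s) not in current
                   for s in combinations(candidate, len(a))):
                continue
            # Support comes from intersecting ID sets, not from rescanning.
            tids = tids_a & tids_b
            if len(tids) >= min_support:
                next_level[candidate] = tids
        current = next_level
    return result


if __name__ == "__main__":
    sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, count in frequent_itemsets(sample, min_support=3).items():
        print(sorted(itemset), count)
```

After the first pass, the original transactions are never read again; each candidate's support is the size of the intersection of two stored ID sets, which is the property the abstract attributes to the added transaction-ID column.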
This article is indexed in Wanfang Data and other databases.