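A Fast Association Rule Mining Algorithm for Large Dense Databases Based on Vertical Data Layout
Citation: CUI Jian, LI Qiang, YANG Long-po. A Fast Association Rule Mining Algorithm for Large Dense Databases Based on Vertical Data Layout [J]. Computer Science (计算机科学), 2011, 38(4): 216-220.
Authors: CUI Jian (崔建), LI Qiang (李强), YANG Long-po (杨龙坡)
Affiliation: Department of Early Warning Surveillance Intelligence, Air Force Radar Institute, Wuhan 430019
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 60736009).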
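Abstract: To further address the high CPU time cost and frequent I/O operations incurred when mining association rules from large transaction databases, an improved association rule mining algorithm based on a vertical data layout, called VARMLDb, is presented. The algorithm first divides the database into several partitions that each fit in available memory. It then combines a directed acyclic graph (DAG) with the vertical data format diffset (difference set) to store and compute frequent itemsets. This greatly reduces the memory needed to store intermediate results and addresses the low efficiency of traditional vertical mining algorithms on dense databases, so the algorithm is well suited to association rule mining in large dense databases. The algorithm draws on the strengths of the CARMA algorithm and completes the mining process with only two scans of the database. Experimental results show that the algorithm is correct and that VARMLDb achieves high execution efficiency on large dense databases.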

Keywords: CARMA algorithm, DAG, diffset (difference set), vertical data layout, dense database

Fast Algorithm for Mining Association Rules Based on Vertically Distributed Data in Large Dense Databases
CUI Jian, LI Qiang, YANG Long-po. Fast Algorithm for Mining Association Rules Based on Vertically Distributed Data in Large Dense Databases [J]. Computer Science, 2011, 38(4): 216-220.
Authors:CUI Jian  LI Qiang  YANG Long-po
Affiliation:(Department of Early Warning Surveillance Intelligence, Air Force Radar Institute, Wuhan 430019, China)
Abstract: To further reduce the CPU and I/O overhead that traditional algorithms incur when mining association rules from large transaction databases, an improved association rule mining algorithm based on a vertical data layout, named VARMLDb (Vertical Association Rule Mining for Large Databases), is proposed. The algorithm first divides the database into several partitions, each of which fits in the available memory. It then combines directed acyclic graphs with diffsets (differences of tidlists), a vertical data layout structure, to store and compute frequent itemsets. This greatly reduces the memory needed to hold intermediate results and addresses the low efficiency of traditional vertical mining algorithms on dense databases, making the algorithm effective for large dense databases. By drawing on the advantages of the CARMA (continuous association rule mining) algorithm, the algorithm needs to scan the database only twice. Experimental results show that the algorithm is correct and that VARMLDb achieves high execution efficiency on large dense transaction databases.
Keywords: Continuous association rule mining (CARMA) algorithm, Directed acyclic graphs (DAG), Diffset, Vertically distributed data, Dense databases
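
The abstract names diffsets as the vertical structure used to store and compute frequent itemsets. As a rough illustration of why diffsets help on dense data, the sketch below mines frequent itemsets with the standard diffset recurrence: for siblings PX and PY under a common prefix P, d(PXY) = d(PY) \ d(PX) and support(PXY) = support(PX) - |d(PXY)|. This is not the VARMLDb algorithm itself: the partitioning step, the DAG structure, and the CARMA-style two-scan strategy are omitted, and all function names (`vertical_layout`, `mine_diffsets`, `_extend`) are hypothetical.

```python
def vertical_layout(transactions):
    """Map each item to the set of transaction ids containing it (its tidset)."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def _extend(siblings, min_support, frequent):
    """Depth-first extension of itemsets that share a common prefix.

    Each sibling is (itemset, diffset, support), where the diffset is taken
    relative to the shared prefix.
    """
    for i, (itemset_a, diff_a, sup_a) in enumerate(siblings):
        frequent[itemset_a] = sup_a
        children = []
        for itemset_b, diff_b, _sup_b in siblings[i + 1:]:
            # d(PXY) = d(PY) \ d(PX): tids that contain PX but not Y.
            diff_ab = diff_b - diff_a
            sup_ab = sup_a - len(diff_ab)
            if sup_ab >= min_support:
                children.append((itemset_a + itemset_b[-1:], diff_ab, sup_ab))
        if children:
            _extend(children, min_support, frequent)


def mine_diffsets(transactions, min_support):
    """Return {itemset tuple: support} for all itemsets meeting min_support."""
    tidsets = vertical_layout(transactions)
    all_tids = set(range(len(transactions)))
    level = []
    for item in sorted(tidsets):
        support = len(tidsets[item])
        if support >= min_support:
            # Diffset of a single item relative to the empty prefix:
            # the transactions that do NOT contain the item.
            level.append(((item,), all_tids - tidsets[item], support))
    frequent = {}
    _extend(level, min_support, frequent)
    return frequent


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b", "c"}]
    for itemset, support in sorted(mine_diffsets(db, min_support=3).items()):
        print(itemset, support)
```

In a dense database most items appear in most transactions, so each diffset (the transactions that lack an item) tends to be much smaller than the corresponding tidset, which is the memory saving the abstract refers to.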
This article is indexed in Wanfang Data and other databases.