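Mining probabilistically frequent closed sequential patterns in uncertain databases
Cite this article: Li Libo. Mining probabilistically frequent closed sequential patterns in uncertain databases [J]. Application Research of Computers (计算机应用研究), 2016, 33(4).
Author: Li Libo (李立波)
Affiliation: College of Information Science and Engineering, Hunan University
Funding: National Key Technology R&D Program of China (2012BAH09B02); Changsha Key Science and Technology Program (K1204006-11-1)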
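Abstract: Mining frequent sequential patterns from uncertain data can produce an exponential number of probabilistically frequent patterns, some of which are useless and make the result set redundant. To address this, the paper defines the probabilistically frequent closed sequential pattern (pfcsp) and proposes U-FCSM, an algorithm for mining probabilistically frequent closed sequences from uncertain data. The algorithm computes the probabilistic frequentness of a sequence under a tuple-level uncertain data model, then applies the closed-sequence idea of the BIDE algorithm to decide whether a probabilistically frequent sequence is a probabilistically frequent closed sequential pattern. Several pruning and bounding techniques are applied to reduce the search space and avoid redundant computation. Extensive experiments demonstrate the effectiveness and efficiency of U-FCSM.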

Keywords: uncertain data; probabilistically frequent closed sequence; probabilistically frequent; data mining
Received: 2014/12/18
Revised: 2016/2/22

Mining probabilistically frequent closed sequential patterns in uncertain databases
Affiliation: College of Information Science and Engineering, Hunan University
Abstract: Mining frequent sequential patterns in uncertain data can yield an exponential number of probabilistically frequent sequential patterns, some of which are useless and make the result set redundant. To address this shortcoming, this paper puts forward a definition of the probabilistically frequent closed sequential pattern (pfcsp) and proposes U-FCSM, an algorithm for mining probabilistically frequent closed sequences from uncertain data. Based on a tuple-level uncertain data model, the algorithm calculates the probabilistic frequentness of sequences and then, following the closed-sequence idea of the BIDE algorithm, judges whether a probabilistically frequent sequence is a probabilistically frequent closed sequence. Several pruning and bounding techniques are applied to reduce the search space and avoid redundant computation. Finally, extensive experiments show the effectiveness and efficiency of the proposed method.
Keywords: uncertain data; probabilistically frequent closed sequence; probabilistically frequent; data mining
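
The abstract does not spell out how probabilistic frequentness is computed, so the following is only a minimal sketch of one common approach, not the U-FCSM procedure itself. It assumes a tuple-level model in which each sequence exists independently with a given probability, so the support of a pattern follows a Poisson-binomial distribution that a small dynamic program can evaluate. The names `is_subsequence`, `frequentness_probability`, `is_probabilistically_frequent`, `min_sup` and `tau` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed data model: (sequence, existence probability) pairs,
# with tuples existing independently of one another.
UncertainDB = List[Tuple[Sequence[str], float]]


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the items of `pattern` appear in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequentness_probability(pattern: Sequence[str], db: UncertainDB, min_sup: int) -> float:
    """P(support(pattern) >= min_sup) under the assumed tuple-level model."""
    if min_sup <= 0:
        return 1.0
    # dp[k] = P(exactly k supporting tuples exist) for k < min_sup;
    # dp[min_sup] absorbs the event "at least min_sup supporting tuples exist".
    dp = [1.0] + [0.0] * min_sup
    for sequence, prob in db:
        if not is_subsequence(pattern, sequence):
            continue  # a tuple that cannot support the pattern leaves dp unchanged
        # Go from high k to low k so dp[k - 1] still holds the previous value.
        for k in range(min_sup, -1, -1):
            gained = dp[k - 1] * prob if k > 0 else 0.0
            if k == min_sup:
                dp[k] = dp[k] + gained
            else:
                dp[k] = dp[k] * (1.0 - prob) + gained
    return dp[min_sup]


def is_probabilistically_frequent(pattern: Sequence[str], db: UncertainDB,
                                  min_sup: int, tau: float) -> bool:
    """Pattern is probabilistically frequent if P(support >= min_sup) >= tau."""
    return frequentness_probability(pattern, db, min_sup) >= tau


if __name__ == "__main__":
    toy_db: UncertainDB = [("abc", 0.5), ("ac", 0.5), ("b", 0.9)]
    print(frequentness_probability("ac", toy_db, 1))
    print(is_probabilistically_frequent("ac", toy_db, 2, 0.5))
```

A closedness check in the spirit of BIDE would then compare each probabilistically frequent pattern against its super-sequences. That step is left out here because the abstract does not give the exact definition of pfcsp.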