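ESPM: A Frequent Subtree Mining Algorithm
Cite this article: ZHU Yong Tai, WANG Chen, HONG Ming Sheng, WANG Wei, SHI Bai Le. ESPM: A Frequent Subtree Mining Algorithm [J]. Journal of Computer Research and Development, 2004, 41(10): 1720-1727.
Authors: ZHU Yong Tai  WANG Chen  HONG Ming Sheng  WANG Wei  SHI Bai Le
Affiliation: Department of Computer and Information Technology, Fudan University, Shanghai 200433
Funding: National Natural Science Foundation of China, Key Program (69933010, 60303008); National High Technology Research and Development Program of China ("863" Program) (2002AA4Z3430, 2002AA231041)
Abstract: With the growth of the Internet, frequent pattern mining has expanded from frequent itemsets to structured data: trees and graphs. Mining on these structures is applied in more complex domains such as bioinformatics, web logs, and XML documents. This paper proposes a novel algorithm, ESPM, for mining frequent subtrees in ordered labeled trees. Unlike previous work, it defers tree isomorphism checking to a late stage of the algorithm, which reduces the time cost of the whole mining process. Experiments on both synthetic and real datasets demonstrate the advantage of ESPM over other algorithms. Some possible improvements are also proposed.

Keywords: data mining  frequent pattern  frequent subtree  ESPM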

ESPM-An Algorithm to Mine Frequent Subtrees
ZHU Yong Tai, WANG Chen, HONG Ming Sheng, WANG Wei, and SHI Bai Le. ESPM-An Algorithm to Mine Frequent Subtrees [J]. Journal of Computer Research and Development, 2004, 41(10): 1720-1727.
Authors:ZHU Yong Tai  WANG Chen  HONG Ming Sheng  WANG Wei  and SHI Bai Le
Abstract: With the development of the Internet, frequent pattern mining has been generalized to more complex patterns such as trees and graphs. Such applications arise in complex domains such as bioinformatics and web mining. In this paper, a novel algorithm named ESPM (expanded subtree pattern miner) is presented to discover frequent subtrees in ordered labeled trees. Unlike previous work, the isomorphism check is deferred to a late stage of the algorithm, which reduces the cost of the whole process. The performance of the algorithm is evaluated with experiments on synthetic and real datasets. The experimental results show that the algorithm outperforms previous algorithms. Finally, potential improvements to ESPM are discussed.
Keywords:data mining  frequent pattern  frequent subtree  ESPM  
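
To make the problem setting concrete, the following is a minimal sketch of frequent subtree counting in ordered labeled trees. It is not ESPM: the abstract does not describe the algorithm's steps, so this sketch only handles the simpler case of bottom-up subtrees (a node together with all of its descendants), where two ordered subtrees match exactly when their string encodings are equal. The tree representation, the function names `encode` and `frequent_bottom_up_subtrees`, and the assumption that labels contain no parentheses or commas are all illustrative choices, not taken from the paper.

```python
from collections import Counter

# Assumed representation: a tree is (label, [children]).
# Child order is significant, since the trees are ordered.

def encode(tree, out):
    """Return a string encoding of the subtree rooted at `tree` and
    append the encoding of every bottom-up subtree to `out`.
    Assumes labels contain no '(', ')' or ',' characters."""
    label, children = tree
    code = label + "(" + ",".join(encode(c, out) for c in children) + ")"
    out.append(code)
    return code

def frequent_bottom_up_subtrees(database, min_support):
    """Count, for each bottom-up subtree, the number of database trees
    that contain it, and keep those reaching `min_support`."""
    support = Counter()
    for tree in database:
        codes = []
        encode(tree, codes)
        support.update(set(codes))  # each pattern counts once per tree
    return {code: n for code, n in support.items() if n >= min_support}

# Hypothetical three-tree database for illustration.
database = [
    ("a", [("b", []), ("c", [])]),
    ("d", [("a", [("b", []), ("c", [])])]),
    ("a", [("c", []), ("b", [])]),  # same labels, different child order
]
result = frequent_bottom_up_subtrees(database, min_support=2)
```

In this example the third tree does not support the pattern `a(b(),c())`, because its children appear in the opposite order; that distinction is what "ordered" means here.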
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.