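Research on a Maximum Pattern Mining Method for Semi-Structured Data Based on Rightmost Expansion Enumeration
Cite this article: Wu Gongqing, Chen Enhong, Wang Shu, Wang Xufa. Research on a Maximum Pattern Mining Method for Semi-Structured Data Based on Rightmost Expansion Enumeration [J]. 小型微型计算机系统 (Mini-Micro Systems), 2004, 25(9): 1696-1699
Authors: Wu Gongqing, Chen Enhong, Wang Shu, Wang Xufa
Affiliation: Department of Computer Science, University of Science and Technology of China, Hefei, Anhui 230027
Funding: Supported by the National Natural Science Foundation of China (60005004) and the Natural Science Foundation of Anhui Province (01042302).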
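Abstract: This paper uses labeled ordered trees as the data model for semi-structured data and studies the problem of mining tree-structured maximum frequent patterns from such data. Existing mining algorithms usually mine all frequent patterns, many of which are subpatterns of other patterns. To address this problem, a maximum pattern mining algorithm is designed and implemented. The algorithm uses rightmost expansion enumeration to enumerate all candidate patterns without duplication, uses a frequent pattern expansion forest to prune and expand efficiently and to mine frequent leaf patterns, and obtains the tree-structured maximum frequent patterns by computing the containment relationships among the frequent leaf patterns. Experimental results show that the algorithm performs well.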

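Keywords: semi-structured data; labeled ordered tree; rightmost expansion enumeration; tree-structured maximum frequent pattern; pattern mining
Article ID: 1000-1220(2004)09-1696-04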

Maximum Frequent Pattern Mining from Semi-Structured Data Based on Rightmost Expansion Technique
WU Gong-qing, CHEN En-hong, WANG Shu, WANG Xu-fa. Maximum Frequent Pattern Mining from Semi-Structured Data Based on Rightmost Expansion Technique [J]. Mini-micro Systems, 2004, 25(9): 1696-1699
Authors: WU Gong-qing, CHEN En-hong, WANG Shu, WANG Xu-fa
Abstract: Tree-structured pattern mining is an important issue in semi-structured data mining. This paper uses labeled ordered trees as the data model for semi-structured data and studies the problem of mining maximum tree-structured frequent patterns from such data. Existing algorithms mine all frequent patterns, many of which are subpatterns of other patterns. To address this, the paper proposes a maximum frequent pattern mining algorithm that finds the maximum frequent patterns in the data source. To improve mining efficiency, the algorithm adopts several techniques. First, it uses the rightmost expansion technique to enumerate all candidates without duplication. Second, it develops an efficient pruning technique and a tree-structured frequent leaf pattern mining technique, both based on the frequent pattern expansion forest obtained during mining. Third, it uses the containment relationships between tree-structured frequent leaf patterns to obtain the maximum frequent patterns. Experiments show that the algorithm mines maximum patterns with satisfactory efficiency.
Keywords: semi-structured data; labeled ordered tree; rightmost expansion; maximum tree-structured frequent pattern; pattern mining
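
The abstract says that rightmost expansion enumerates all candidate patterns without duplication. The following is a minimal sketch of that general idea only, not the paper's implementation: it assumes a tree is encoded as a preorder tuple of (depth, label) pairs, and the names `rightmost_expansions` and `enumerate_trees` are hypothetical. Frequency counting, the frequent pattern expansion forest, and the containment check between leaf patterns are not shown.

```python
def rightmost_expansions(tree, labels):
    """Yield each tree formed by attaching one new node on the rightmost path.

    Assumption: `tree` is a preorder tuple of (depth, label) pairs with the
    root at depth 0. The new node becomes the last child of some node on the
    rightmost path, so its depth ranges from 1 to (depth of last node) + 1.
    """
    last_depth = tree[-1][0]
    for depth in range(1, last_depth + 2):
        for label in labels:
            yield tree + ((depth, label),)


def enumerate_trees(labels, max_size):
    """Enumerate all labeled ordered trees with at most `max_size` nodes."""
    level = [((0, label),) for label in labels]  # all single-node trees
    result = list(level)
    for _ in range(max_size - 1):
        # Grow every tree of the current size by exactly one node.
        level = [t for tree in level for t in rightmost_expansions(tree, labels)]
        result.extend(level)
    return result


if __name__ == "__main__":
    for tree in enumerate_trees(["a"], 3):
        print(tree)
```

Because each encoded tree has exactly one parent (the tree obtained by dropping its last preorder node), every tree is generated once. A miner would typically stop expanding a candidate as soon as it turns out to be infrequent, which this sketch omits.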
This article is indexed in CNKI, Wanfang Data, and other databases.