
Mining frequent pattern tree in Web data

Cite this article:	WANG Zi-qiang, FENG Bo-qin. Mining frequent pattern tree in Web data[J]. Control Theory & Applications, 2005, 22(3): 429-433

Authors:	WANG Zi-qiang, FENG Bo-qin

Affiliation:	Department of Computer Science, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China

Funding:	Supported by the National High Technology Research and Development Program of China (863 Program) (2003AA1Z2610).

Article ID:	1000-8152(2005)03-0429-05
Received:	2003-09-26
Revised:	2004-06-07

Abstract:	To efficiently mine all frequent pattern trees from semi-structured Web data, the data are modeled as a labeled ordered tree, and an algorithm for mining all frequent pattern trees in an ordered data tree is proposed. The algorithm uses the rightmost-path expansion technique: it starts from pattern trees that have a single node and generates new pattern trees by adding nodes only to the rightmost path. In addition, the algorithm maintains only the list of occurrences of the rightmost leaves, which allows support to be computed incrementally. Theoretical analysis and experimental results show that the algorithm scales linearly in the total size of the maximal frequent tree patterns and works efficiently in practice.

Keywords:	data mining; Web data; frequent pattern tree; ordered tree
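
The rightmost-path expansion outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `DataTree`, `mine`, and `min_support` are hypothetical, and the abstract does not define support, so the sketch assumes that the support of a pattern is the number of distinct data-tree nodes at which the pattern's root can occur. A pattern is written as a tuple of `(depth, label)` pairs in preorder, and only the occurrences of its rightmost leaf are stored.

```python
from collections import defaultdict


class DataTree:
    """Labeled ordered tree built from nested (label, [children]) tuples."""

    def __init__(self, nested):
        self.label, self.parent, self.children = [], [], []
        self._add(nested, -1)

    def _add(self, node, parent):
        label, kids = node
        idx = len(self.label)          # preorder index of this node
        self.label.append(label)
        self.parent.append(parent)
        self.children.append([])
        if parent >= 0:
            self.children[parent].append(idx)
        for kid in kids:
            self._add(kid, idx)

    def ancestor(self, x, k):
        """Return the node k levels above x."""
        for _ in range(k):
            x = self.parent[x]
        return x


def mine(tree, min_support):
    """Return {pattern: support} for all patterns with support >= min_support.

    Assumption: support = number of distinct root occurrences of the pattern.
    """
    frequent = {}

    def expand(pattern, occs):
        leaf_depth = pattern[-1][0]
        # The root occurrence is recovered from each rightmost-leaf occurrence.
        support = len({tree.ancestor(x, leaf_depth) for x in occs})
        if support < min_support:
            return                     # prune: no extension can be frequent
        frequent[pattern] = support

        # (new depth, label) -> rightmost-leaf occurrences of the extension
        candidates = defaultdict(set)
        for x in occs:
            # Attach the new node as a child of the rightmost leaf.
            for y in tree.children[x]:
                candidates[(leaf_depth + 1, tree.label[y])].add(y)
            # Attach it higher on the rightmost path: the new node must map
            # to a younger sibling of the matched rightmost-path node.
            anc = x
            for depth in range(leaf_depth, 0, -1):
                parent = tree.parent[anc]
                siblings = tree.children[parent]
                for y in siblings[siblings.index(anc) + 1:]:
                    candidates[(depth, tree.label[y])].add(y)
                anc = parent

        for node, new_occs in candidates.items():
            expand(pattern + (node,), sorted(new_occs))

    # Start from single-node patterns, one per label.
    start = defaultdict(list)
    for x, lab in enumerate(tree.label):
        start[lab].append(x)
    for lab, occs in start.items():
        expand(((0, lab),), occs)
    return frequent


# Toy document tree: R(A(B, C), A(B, C), A(B))
doc = ("R", [("A", [("B", []), ("C", [])]),
             ("A", [("B", []), ("C", [])]),
             ("A", [("B", [])])])
patterns = mine(DataTree(doc), min_support=2)
```

With `min_support=2`, the pattern `A(B, C)`, written `((0, "A"), (1, "B"), (1, "C"))`, is reported with support 2, while the single-node pattern `R` is pruned because it occurs only once. Because each new node is attached only on the rightmost path, every ordered pattern is generated exactly once, and the occurrence list of an extension is derived from its parent pattern's list without rescanning the whole tree.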