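Research on an HTML Table Data Mining Algorithm Based on Trees and Indexes
Cite this article: CHENG Xiao-wei (程晓伟), TIAN Dong-feng (田东风). Research on an HTML Table Data Mining Algorithm Based on Trees and Indexes[J]. Digital Community & Smart Home, 2009(10).
Authors: CHENG Xiao-wei (程晓伟), TIAN Dong-feng (田东风)
Affiliation: China University of Geosciences
Abstract: This paper proposes an algorithm for HTML parsing and table data extraction based on tree and index structures, and discusses the complexity of each sub-algorithm. It describes in detail the storage model for HTML tags and the mining model for table data, and explains the algorithms and data structures involved, including binary trees, stacks, containers, and recursion.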

Keywords: HTML parser; data mining; HTML tag storage; table data extraction

Research of Algorithm of Table Data Digging from HTML Based on Tree and Index
CHENG Xiao-wei, TIAN Dong-feng. Research of Algorithm of Table Data Digging from HTML Based on Tree and Index[J]. Digital Community & Smart Home, 2009(10).
Authors: CHENG Xiao-wei, TIAN Dong-feng
Affiliation: China University of Geosciences, Beijing 100083, China
Abstract: This paper proposes a tree- and index-based algorithm for parsing HTML and extracting table data. The complexity of each sub-algorithm is discussed. The storage model for HTML tags and the model for mining table data are described in detail. The algorithms and data structures used, such as the binary tree, stack, vector (container), and recursion, are explained.
Keywords: HTML parser; data mining; HTML tag storage; table data extraction
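
The abstract names the ingredients (a tag tree, a stack, an index, recursion) but this record does not include the algorithm itself. The following is only a minimal sketch of how those pieces commonly fit together, not the authors' method. It assumes a general n-ary tree rather than the binary tree the paper mentions, uses Python's standard `html.parser`, and assumes reasonably well-formed markup (implied end tags such as an unclosed `<td>` are not handled). The names `Node`, `TreeBuilder`, `text_of`, `rows_of`, and `extract_tables` are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """One HTML element; children holds child Nodes and text strings in order."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tag tree with an explicit stack and indexes nodes by tag name."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]  # currently open elements, innermost last
        self.index = {}           # tag name -> list of nodes (assumed "index")

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.index.setdefault(tag, []).append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element; ignore stray closers.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(data.strip())


def text_of(node):
    """Recursively join all text found under a node, in document order."""
    parts = []
    for child in node.children:
        parts.append(child if isinstance(child, str) else text_of(child))
    return " ".join(p for p in parts if p)


def rows_of(table):
    """Collect the rows of one table, skipping rows of nested tables."""
    rows = []

    def walk(node):
        for child in node.children:
            if isinstance(child, str) or child.tag == "table":
                continue
            if child.tag == "tr":
                rows.append([text_of(c) for c in child.children
                             if not isinstance(c, str) and c.tag in ("td", "th")])
            else:
                walk(child)  # descend through thead / tbody / tfoot wrappers

    walk(table)
    return rows


def extract_tables(html):
    """Parse html and return every table as a list of rows of cell text."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return [rows_of(t) for t in builder.index.get("table", [])]
```

In this sketch the index lets `extract_tables` jump straight to every `<table>` node without re-walking the whole tree, and recursion is used only inside each table to gather rows and cell text. A cell that contains a nested table yields that table's text as part of the cell, which may or may not match the behavior described in the paper.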
This article is indexed in CNKI and other databases.