Extracting Information by Mining Structures of Web Pages


Cite this article (Chinese):	李媛, 耿桦, 张甍, 潘金贵. 基于网页结构挖掘的信息提取[J]. 计算机科学, 2006, 33(3): 191-193

Authors (Chinese):	李媛, 耿桦, 张甍, 潘金贵

Affiliation (all four authors):	State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093

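Abstract (translated from the Chinese):	This paper proposes two fine-grained information extraction methods based on mining the structure of Web pages, compares their advantages and disadvantages, and reports performance tests and an analysis of the results for the corresponding implementations.
Keywords (translated from the Chinese):	information extraction; Web page structure mining; repeated patterns; temporal features; RSS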
Extracting Information by Mining Structures of Web Pages

LI Yuan,GENG Hua,ZHANG Meng,PAN Jin-Gui. Extracting Information by Mining Structures of Web Pages[J]. Computer Science, 2006, 33(3): 191-193

Authors:	LI Yuan, GENG Hua, ZHANG Meng, PAN Jin-Gui

Affiliation:	State Key Laboratory for Novel Software Technology of Nanjing University, Multimedia Technology Institute of Nanjing University, Nanjing 210093

Abstract:	To simplify the task of obtaining information from the vast number of information sources available on the WWW, we have developed two methods for extracting fine-grained information. This paper first describes the principles of the two methods, both of which work by mining the structures of Web pages, and then compares their advantages and disadvantages. Finally, we test the performance of the two methods and analyze the experimental results.

Keywords:	information extraction; Web page structure mining; repeated patterns; temporal features; RSS
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.