
A Method for Extracting Ontologies from Web Pages of Unknown Structure

Cite this article:	QIANG Yu, HU Yun-fa. A Method for Extracting Ontologies from Web Pages of Unknown Structure[J]. Computer Science, 2009, 36(2): 186-189.

Authors:	QIANG Yu, HU Yun-fa

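Affiliations:	1. Department of Computer and Information Technology, Fudan University, Shanghai 200433; Computer Room, Bengbu Tank College, Bengbu 233013 2. Computer Room, Bengbu Tank College, Bengbu 233013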

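Abstract:	Most data on the Web is structured, but its structure is not known in advance, so the data of interest cannot be queried effectively. This paper proposes a text-independent method for extracting ontologies. The process comprises table understanding, data integration, and ontology generation. Table understanding locates the tables of interest, recognizes attributes and values, pairs them, and forms records. Data integration matches the source records against the target schema. Ontology generation extracts the data of the source records into the target schema. The results show that the method can effectively extract data of unknown structure by means of a known target schema.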
Keywords:	Heterogeneous data integration; Semantic correspondence; Table understanding; Ontology extraction
Received:	3/4/2008
Method for Ontology Extraction Based on Unknown Structure Web

QIANG Yu, HU Yun-fa. Method for Ontology Extraction Based on Unknown Structure Web[J]. Computer Science, 2009, 36(2): 186-189.

Authors:	QIANG Yu, HU Yun-fa

Affiliation:	Department of Computer and Information Technology, Fudan University, Shanghai 200433, China; Research Room of Computer Science and Technology, Tank College, Bengbu 233013, China

Abstract:	To the user, the structure of the data in HTML tables on the Web is usually unknown, so the data of interest cannot be queried directly. We present a solution to this problem. The solution entails table understanding, data integration, and wrapper creation. Table understanding finds the tables of interest, recognizes the attributes and values in each table, pairs attributes with values, and forms records. Data integration matches source records with a target schema. Ontology-specified wrappers extract the dat...

Keywords:	Heterogeneous data integration; Semantic correspondence; Table understanding; Ontology extraction
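
The three steps named in the abstract (table understanding, data integration, extraction into a target schema) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the first-row-as-header heuristic, the alias table `TARGET_SYNONYMS`, the helper names `understand_table` and `integrate`, and the sample car table are all assumptions made up for illustration.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every table row as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # row being built, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any unclosed cell


def understand_table(html):
    """Table understanding (assumed heuristic): the first row holds the
    attributes; every later row is paired with them to form a record."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def integrate(records, synonyms):
    """Data integration and extraction: map each source attribute onto a
    target-schema attribute through a hand-written alias table."""
    lookup = {alias.lower(): target
              for target, aliases in synonyms.items()
              for alias in aliases}
    result = []
    for record in records:
        mapped = {target: None for target in synonyms}
        for attr, value in record.items():
            target = lookup.get(attr.lower())
            if target is not None:  # unmatched source attributes are dropped
                mapped[target] = value
        result.append(mapped)
    return result


# Hypothetical target schema with attribute aliases (illustration only).
TARGET_SYNONYMS = {
    "model": ["model", "car model"],
    "price": ["price", "cost"],
}

# Hypothetical source table whose structure is not known to the target schema.
SAMPLE_HTML = """
<table>
  <tr><th>Car Model</th><th>Cost</th><th>Colour</th></tr>
  <tr><td>Civic</td><td>18000</td><td>red</td></tr>
  <tr><td>Golf</td><td>21000</td><td>blue</td></tr>
</table>
"""

records = understand_table(SAMPLE_HTML)
extracted = integrate(records, TARGET_SYNONYMS)
print(extracted)
```

The alias lookup stands in for the semantic-correspondence step listed in the keywords; a real system would presumably need a richer matcher than exact, case-insensitive name comparison.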
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.