
Research on a Deep Web Page Information Extraction Method Based on the Ant Algorithm

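Cite this article:	CHEN Qiao, SHI Quan. Research on a Deep Web Page Information Extraction Method Based on the Ant Algorithm[J]. Coal Technology, 2013, 32(2): 176-178.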

Authors:	CHEN Qiao, SHI Quan

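Affiliations:	1. Modern Educational Technology Center, Nantong University, Nantong, Jiangsu 226019 2. Modern Educational Technology Center, Nantong University, Nantong, Jiangsu 226019; School of Computer Science and Technology, Nantong University, Nantong, Jiangsu 226019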

Funding:	Nantong University 2011 Natural Science Project; National Natural Science Foundation of China project

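Abstract:	To address the complexity and variability of coal monitoring data and the characteristics of the descriptive information in Deep Web query result pages, this paper proposes a web page information extraction method guided by the ant algorithm and an ontology. A data extraction system based on a simple ontology is built first. By mapping and locating the data in result pages that carry ontology semantic information, and by using the ant algorithm to analyze and compare how pheromone concentration is distributed over the DOM tree, the method generates path extraction rules for data blocks and feature codes for data segmentation. Extraction performance was tested on data obtained from the coal industry, and the experiments show that the extraction algorithm achieves relatively high accuracy.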
Keywords:	information extraction; ontology semantics; ant algorithm
Study on Deep Web Information Extraction Technology Based on Ant Algorithm

CHEN Qiao, SHI Quan. Study on Deep Web Information Extraction Technology Based on Ant Algorithm[J]. Coal Technology, 2013, 32(2): 176-178.

Authors:	CHEN Qiao, SHI Quan

Affiliation:	1. Modern Educational Technology Center, Nantong University, Nantong 226019, China; 2. School of Computer Science and Technology, Nantong University, Nantong 226019, China (SHI Quan: 1, 2)

Abstract:	To handle the complexity of coal monitoring data, a novel approach to web page information extraction guided by the ant colony algorithm is proposed. The method first builds a simple ontology-based data extraction system. By locating the data on the result pages that map to ontology semantics, and combining this with the ant algorithm, it creates extraction rules. Extraction performance was tested on data obtained from the coal industry, and the experimental results indicate that the method achieves relatively high extraction accuracy.

Keywords:	information extraction; ontology semantics; ant algorithm
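
The abstract describes the pheromone analysis over the DOM tree only in outline, so the following is a minimal illustrative sketch rather than the authors' algorithm. It assumes that each "ant" walks deterministically from the root to one text node containing an ontology term, deposits pheromone on every tag path along the way, and that pheromone evaporates between rounds. The names `PathCollector`, `pheromone_by_path`, `data_block_path`, the `share` threshold, and the sample page and terms are all hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class PathCollector(HTMLParser):
    """Record the tag path (e.g. 'html/body/ul/li') of every text node."""

    VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag

    def __init__(self):
        super().__init__()
        self.stack = []
        self.text_nodes = []  # list of (tag path, text)

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to the matching open tag, tolerating unclosed tags.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.text_nodes.append(("/".join(self.stack), text))


def pheromone_by_path(html, ontology_terms, deposit=1.0, evaporation=0.1, rounds=3):
    """Assumed model: one ant per matching text node per round."""
    collector = PathCollector()
    collector.feed(html)
    collector.close()
    pheromone = defaultdict(float)
    for _ in range(rounds):
        for path in pheromone:  # evaporation step
            pheromone[path] *= 1.0 - evaporation
        for path, text in collector.text_nodes:
            if any(term in text for term in ontology_terms):
                parts = path.split("/")
                # Deposit on every node between the root and the matching node.
                for depth in range(1, len(parts) + 1):
                    pheromone["/".join(parts[:depth])] += deposit
    return dict(pheromone)


def data_block_path(pheromone, share=0.8):
    """Deepest path holding at least `share` of the peak pheromone."""
    if not pheromone:
        return None
    peak = max(pheromone.values())
    strong = [p for p, v in pheromone.items() if v >= share * peak]
    return max(strong, key=lambda p: p.count("/"))


# Hypothetical result page and ontology terms, for illustration only.
SAMPLE_HTML = """
<html><body>
<div><p>Site navigation</p></div>
<ul>
<li><b>Mine: A</b><span>Gas concentration: 0.4</span></li>
<li><b>Mine: B</b><span>Gas concentration: 0.6</span></li>
</ul>
</body></html>
"""
SAMPLE_TERMS = ["Mine", "Gas concentration"]

print(data_block_path(pheromone_by_path(SAMPLE_HTML, SAMPLE_TERMS)))
```

On this sample, the navigation text receives no pheromone, and the deepest strongly marked path is `html/body/ul/li`, the element that wraps one record. A stochastic ant walk or index-aware paths would be closer to a full ant colony method, but the abstract does not say which variant the paper uses.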
This article is indexed in CNKI, Wanfang Data, and other databases.
