Enterprise Homepage Information Extraction Based on Regular Expression


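Cite this article:	JIN Xiao-Chuan, LIU Wan-Jun, ZHAO Lei. Enterprise Homepage Information Extraction Based on Regular Expression[J]. Computer Systems & Applications, 2010, 19(8): 70-73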

Authors:	JIN Xiao-Chuan, LIU Wan-Jun, ZHAO Lei

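Affiliations:	1. School of Software, Liaoning Technical University, Huludao, Liaoning 125105; 2. Department of Basic Computer and Mathematics Teaching, Shenyang Normal University, Shenyang, Liaoning 110034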

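Abstract:	This paper analyzes the structural features of the sentences that express basic enterprise information on enterprise homepages, proposes a regular-expression-based method and supporting techniques for extracting information from enterprise homepages, and designs and develops a corresponding prototype system that extracts a number of enterprise information items. Experimental results show that the system extracts enterprise-related information from enterprise homepages effectively, with high recall and precision.
Keywords:	enterprise homepage; regular expression; information extraction
Received:	2009-11-13
Revised:	2009-12-20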
Enterprise Homepage Information Extraction Based on Regular Expression

JIN Xiao-Chuan, LIU Wan-Jun and ZHAO Lei. Enterprise Homepage Information Extraction Based on Regular Expression[J]. Computer Systems & Applications, 2010, 19(8): 70-73

Authors:	JIN Xiao-Chuan, LIU Wan-Jun, ZHAO Lei

Affiliation:	1. Software College, Liaoning Technical University, Huludao 125105, China; 2. Computer and Math College, Shenyang Normal University, Shenyang 110034, China

Abstract:	This paper analyzes the structural characteristics of the sentences that describe basic enterprise information on enterprise homepages. It proposes a method and techniques for extracting information from enterprise homepages based on regular expressions, and develops a prototype system to extract a number of enterprise information items. The experimental results show that the system can effectively extract enterprise-related information from enterprise homepages and achieves high recall and precision.

Keywords:	enterprise homepage; regular expression; information extraction
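
The approach summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual expressions, information items, or evaluation data, so everything below is an assumption made for illustration: the item names, the label words in `PATTERNS`, and the helper functions `extract_items` and `precision_recall` are hypothetical. The sketch strips HTML tags, applies one labeled pattern per information item, and scores the result against a hand-labeled reference using the usual definitions of precision (correct items / extracted items) and recall (correct items / reference items).

```python
import re

# Hypothetical patterns, one per information item. Each expects a label word,
# an ASCII or full-width colon, and then the value. These are illustrative
# only; they are not the expressions used in the paper.
PATTERNS = {
    "phone": re.compile(r"(?:Tel|Phone|电话)\s*[:：]\s*(\+?\d[\d-]{6,}\d)"),
    "fax": re.compile(r"(?:Fax|传真)\s*[:：]\s*(\+?\d[\d-]{6,}\d)"),
    "email": re.compile(
        r"(?:E-?mail|邮箱)\s*[:：]\s*([\w.+-]+@[\w-]+(?:\.[\w-]+)+)",
        re.IGNORECASE,
    ),
    "postcode": re.compile(r"(?:Postcode|邮编)\s*[:：]\s*(\d{6})"),
}

# Crude tag stripper; a real system would likely use an HTML parser.
TAG = re.compile(r"<[^>]+>")


def extract_items(html):
    """Return a dict of item name -> first matched value in the page text."""
    text = TAG.sub(" ", html)
    items = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            items[name] = match.group(1).strip()
    return items


def precision_recall(extracted, reference):
    """Score extracted items against a hand-labeled reference dict."""
    correct = sum(1 for key, value in extracted.items()
                  if reference.get(key) == value)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up page fragment and reference labels, for demonstration only.
    page = ("<p>Tel: 0429-5310000</p><p>Fax：0429-5310001</p>"
            "<p>E-mail: info@example.com</p><p>邮编：125105</p>")
    reference = {
        "phone": "0429-5310000",
        "fax": "0429-5310001",
        "email": "info@example.com",
        "postcode": "125105",
        "address": "Huludao, Liaoning",  # no pattern covers this item
    }
    found = extract_items(page)
    print(found)
    print(precision_recall(found, reference))
```

Anchoring each pattern on a label word keeps precision high at the cost of recall: a page that lists a phone number without a recognizable label is missed, which is the kind of trade-off the two measures in the abstract are meant to expose.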
This article is indexed in databases including VIP and Wanfang Data.
