
Identification of Chinese Place Names Based on Statistics (基于统计的中文地名识别)

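Cite this article:	HUANG De-gen, YUE Guang-ling, YANG Yuan-sheng. Identification of Chinese Place Names Based on Statistics [J]. Journal of Chinese Information Processing, 2003, 17(2): 37-42.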

Authors:	HUANG De-gen, YUE Guang-ling, YANG Yuan-sheng

Affiliation:	Department of Computer Science and Engineering, Dalian University of Technology

Funding:	Supported by the National Natural Science Foundation of China (60143002)

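Abstract:	This paper studies the identification of Chinese place names that contain feature words. The system uses statistical information obtained from a large-scale place name dictionary and a corpus of real text, together with rules summarized from the characteristics of place names. It identifies Chinese place names by computing two scores for each candidate: word-formation reliability (lexical reliability) and continuation reliability (contextual reliability). The model effectively adjusts the output of automatic word segmentation. In the closed test the system's recall and precision are 90.24% and 93.14%; in the open test they reach 86.86% and 91.48%.
Keywords:	computer application; Chinese information processing; Chinese place name identification; lexical reliability; contextual reliability; automatic word segmentation
Article ID:	1003-0077(2003)02-0036-06
Revised manuscript received:	22 July 2002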
Identification of Chinese Place Names Based on Statistics

HUANG De-gen, YUE Guang-ling, YANG Yuan-sheng. Identification of Chinese Place Names Based on Statistics [J]. Journal of Chinese Information Processing, 2003, 17(2): 37-42.

Authors:	HUANG De-gen, YUE Guang-ling, YANG Yuan-sheng

Affiliation:	Department of Computer Science and Engineering, Dalian University of Technology

Abstract:	Unknown word recognition is one of the challenging tasks in natural language processing research. This paper proposes a place name identification model for dictionary-based Chinese word segmentation, in which statistical information drawn from a training corpus is used to calculate lexical reliability and contextual reliability. Rules describing Chinese place names are also used in the model. We applied this approach to a Chinese morphological analysis system and achieved 90.24% recall and 93.14% precision in the closed test, while recall and precision reach 86.86% and 91.48% in the open test.

Keywords:	computer application; Chinese information processing; Chinese place name identification; lexical reliability; contextual reliability; automatic word segmentation
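
The abstract reports results as recall and precision. As a reading aid, the standard definitions of these two metrics can be sketched as follows. This is a minimal illustration only: the function name `precision_recall`, the span representation `(sentence_id, start, end)`, and the toy data are assumptions made for this example and are not taken from the paper, whose evaluation procedure is not described in this record.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of identified place names.

    precision = correct / number of names the system output
    recall    = correct / number of names in the reference annotation
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # names found by the system that are also in the reference
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: each place name is a (sentence_id, start, end) span.
gold = {(0, 0, 3), (0, 7, 10), (1, 2, 5), (2, 4, 6)}
predicted = {(0, 0, 3), (1, 2, 5), (2, 0, 2)}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2%} recall={r:.2%}")
```

Here two of the three predicted spans match the reference, and two of the four reference spans are found. The closed-test and open-test figures in the abstract use the same two metrics, with test data drawn from inside and outside the training material respectively.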
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.