Bigram Chinese Word Segmentation Model Based on Bayesian Network

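Cite this article:	LIU Dan, FANG Wei-guo, ZHOU Hong. Bigram Chinese Word Segmentation Model Based on Bayesian Network[J]. Computer Engineering (计算机工程), 2010, 36(1): 12-14.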

Authors:	LIU Dan (刘丹), FANG Wei-guo (方卫国), ZHOU Hong (周泓)

Affiliation:	School of Economics and Management, Beihang University, Beijing 100083

Funding:	National Natural Science Foundation of China (Grant No. 70521001)

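Abstract:	This paper proposes a Chinese word segmentation model based on a Bayesian network. Using a better-performing smoothing algorithm, the model simultaneously resolves overlapping and combinational segmentation ambiguities and recognizes transliterated names and person names. The model is solved with a character-aligned Viterbi algorithm, which effectively improves segmentation efficiency while maintaining precision and recall. Experimental results show that, in the closed test, the model achieves a precision of 99.68% and a recall of 99.7%, at a segmentation speed of about 74,800 characters per second.
Keywords:	Chinese word segmentation; Bayesian network; Viterbi algorithm; N-gram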
Bigram Chinese Word Segmentation Model Based on Bayesian Network

LIU Dan, FANG Wei-guo, ZHOU Hong. Bigram Chinese Word Segmentation Model Based on Bayesian Network[J]. Computer Engineering, 2010, 36(1): 12-14.

Authors:	LIU Dan, FANG Wei-guo, ZHOU Hong

Affiliation:	School of Economics and Management, Beihang University, Beijing 100083

Abstract:	This paper proposes a Chinese word segmentation model based on a Bayesian network. The model adopts a better-performing smoothing algorithm and simultaneously resolves overlapping and combinational segmentation ambiguities and recognizes transliterated foreign names and Chinese person names. A character-aligned Viterbi algorithm is used to solve the model, which improves segmentation efficiency while maintaining precision and recall. Experimental results show that precision is 99.68% and recall is 99.7% in the closed test, with a speed of about 74,800 characters per second.

Keywords:	Chinese word segmentation; Bayesian network; Viterbi algorithm; N-gram
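
The abstract only names the ingredients (a bigram model, smoothing, and Viterbi decoding), so the following is a minimal sketch of how those pieces typically fit together, not the authors' implementation. The toy counts, the `segment` and `log_prob` names, and the add-k smoothing are all assumptions made for illustration; the abstract does not say which smoothing algorithm is used, and the Bayesian-network structure and name recognition are not modeled here. The decoder indexes states by character position and last word, which is a generic formulation and may differ from the paper's character-aligned variant.

```python
import math

BOS = "<s>"  # sentence-start marker (hypothetical convention)

# Hypothetical toy counts; a real model would estimate them from a segmented corpus.
UNIGRAM = {BOS: 4, "研究": 3, "研究生": 1, "生命": 3, "命": 1, "的": 4, "起源": 3}
BIGRAM = {
    (BOS, "研究"): 3, (BOS, "研究生"): 1,
    ("研究", "生命"): 3, ("生命", "的"): 3, ("的", "起源"): 3,
}
K = 0.1            # add-k smoothing constant (a stand-in for the paper's smoothing)
V = len(UNIGRAM)   # vocabulary size used in the smoothing denominator


def log_prob(prev, word):
    """Add-k smoothed log P(word | prev)."""
    num = BIGRAM.get((prev, word), 0) + K
    den = UNIGRAM.get(prev, 0) + K * V
    return math.log(num / den)


def segment(text, max_len=4):
    """Return the most probable word sequence for text under the bigram model."""
    n = len(text)
    # best[i][w] = (log score, previous word) of the best path that ends at
    # character position i with w as its last word.
    best = [dict() for _ in range(n + 1)]
    best[0][BOS] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word == BOS or word not in UNIGRAM:
                continue
            for prev, (score, _) in best[start].items():
                cand = score + log_prob(prev, word)
                if word not in best[end] or cand > best[end][word][0]:
                    best[end][word] = (cand, prev)
    if not best[n]:
        raise ValueError("text cannot be segmented with the known vocabulary")
    # Backtrack from the best final word to the start marker.
    word = max(best[n], key=lambda w: best[n][w][0])
    pos, words = n, []
    while word != BOS:
        words.append(word)
        prev = best[pos][word][1]
        pos -= len(word)
        word = prev
    return words[::-1]


print(segment("研究生命的起源"))
```

The sample sentence contains a classic overlapping ambiguity: 研究/生命 competes with 研究生/命. With these toy counts the bigram scores favor the first reading, which is the kind of ambiguity resolution the abstract refers to.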
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.