Research on the Model of Integrating Chinese Word Segmentation with Part-of-speech Tagging (中文分词及词性标注一体化模型研究)

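Cite this article:	TONG Xiao-Jun, SONG Guo-Long, LIU Qiang, ZHANG Li, JIANG Wei. Research on the Model of Integrating Chinese Word Segmentation with Part-of-speech Tagging[J]. Computer Science (计算机科学), 2007, 34(9): 174-175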

Authors:	TONG Xiao-Jun, SONG Guo-Long, LIU Qiang, ZHANG Li, JIANG Wei

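Affiliations:	School of Computer Science and Technology, Harbin Institute of Technology (Weihai), Weihai 264209; School of Information Science and Engineering, Northeastern University, Shenyang 110004; Computing Center, Eastern Liaoning University, Dandong 118000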

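Abstract:	This paper applies the N-shortest paths method to construct a model that integrates automatic Chinese word segmentation with automatic part-of-speech tagging. In the segmentation stage, the N best results are retained as a candidate set; the final result is selected from these N most promising candidates after unknown-word recognition and part-of-speech tagging. Based on this model, a Chinese lexical analyzer that performs segmentation and part-of-speech tagging in one integrated process was implemented. Preliminary open tests show that the analyzer reaches a segmentation accuracy of 98.1% and a part-of-speech tagging accuracy of 95.07%.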
Keywords:	Chinese word segmentation; part-of-speech tagging; N-shortest paths method
Research on the Model of Integrating Chinese Word Segmentation with Part-of-speech Tagging

TONG Xiao-Jun,SONG Guo-Long,LIU Qiang,ZHANG Li,JIANG Wei. Research on the Model of Integrating Chinese Word Segmentation with Part-of-speech Tagging[J]. Computer Science, 2007, 34(9): 174-175

Authors:	TONG Xiao-Jun, SONG Guo-Long, LIU Qiang, ZHANG Li, JIANG Wei

Abstract:	This paper presents a model that integrates Chinese word segmentation with part-of-speech (POS) tagging. In the segmentation stage, the top N segmentation results are retained as candidates. After unknown-word recognition and POS tagging, the final result is selected from these top N candidates. We also develop a Chinese lexical analyzer based on this model. Preliminary experiments show that the analyzer achieves an accuracy of 98.1% for segmentation and 95.07% for POS tagging.

Keywords:	Chinese word segmentation; part-of-speech tagging; N-shortest paths method
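
The pipeline described in the abstract (keep the N best segmentations, then pick the final one after POS tagging) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the toy lexicon, the frequencies, the tag-bigram costs, and the names `n_best_segmentations`, `tagging_cost`, and `analyze` are hypothetical. Unknown-word recognition is omitted, each word carries a single fixed tag, and the N shortest paths are found with a simple per-position N-best dynamic program over the word graph.

```python
import heapq
import math

# Hypothetical toy lexicon: word -> (frequency, single POS tag).
LEXICON = {
    "研究": (20, "v"),
    "研究生": (60, "n"),
    "生命": (20, "n"),
    "命": (40, "n"),
    "生": (5, "v"),
    "的": (200, "u"),
    "起源": (10, "n"),
}
# Hypothetical tag-bigram costs (lower is better); "^" marks sentence start.
TAG_COST = {
    ("^", "v"): 1.0, ("^", "n"): 1.0, ("v", "n"): 0.5,
    ("n", "n"): 3.0, ("n", "u"): 0.5, ("u", "n"): 0.5,
}
MAX_WORD_LEN = 3         # longest word in the toy lexicon
UNKNOWN_COST = 10.0      # assumed cost of an out-of-lexicon single character
DEFAULT_TAG_COST = 4.0   # assumed cost of an unlisted tag bigram
TOTAL_FREQ = sum(freq for freq, _ in LEXICON.values())


def n_best_segmentations(text, n):
    """Return up to n (cost, words) pairs, cheapest first.

    Edge cost is the negative log unigram probability of the word, so the
    n shortest paths through the word graph are the n most probable
    segmentations under this toy model.
    """
    # best[i] holds up to n cheapest (cost, words) ways to segment text[:i].
    best = [[] for _ in range(len(text) + 1)]
    best[0] = [(0.0, [])]
    for end in range(1, len(text) + 1):
        candidates = []
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in LEXICON:
                cost = -math.log(LEXICON[word][0] / TOTAL_FREQ)
            elif end - start == 1:
                cost = UNKNOWN_COST  # fall back to a single character
            else:
                continue
            for prev_cost, prev_words in best[start]:
                candidates.append((prev_cost + cost, prev_words + [word]))
        best[end] = heapq.nsmallest(n, candidates, key=lambda c: c[0])
    return best[len(text)]


def tagging_cost(words):
    """Sum tag-bigram costs along the word sequence (one fixed tag per word)."""
    cost, prev = 0.0, "^"
    for word in words:
        tag = LEXICON.get(word, (0, "x"))[1]  # "x" = unknown tag
        cost += TAG_COST.get((prev, tag), DEFAULT_TAG_COST)
        prev = tag
    return cost


def analyze(text, n=2):
    """Pick the candidate with the lowest combined segmentation + tagging cost."""
    candidates = n_best_segmentations(text, n)
    return min(candidates, key=lambda c: c[0] + tagging_cost(c[1]))[1]


print(analyze("研究生命的起源", n=2))
```

With these toy numbers, segmentation cost alone ranks 研究生/命/的/起源 first, while the combined score over the two retained candidates selects 研究/生命/的/起源. This is the kind of correction that deferring the final choice until after tagging is meant to allow; with `n=1` the sketch degenerates to a plain shortest-path segmenter.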
This article is indexed in databases including CNKI, VIP, and Wanfang Data.