
Identification of Abbreviated Words in a Tibetan Automatic Word Segmentation System

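Cite this article:	CAI Zhi-jie. Identification of Abbreviated Words in a Tibetan Automatic Word Segmentation System[J]. Journal of Chinese Information Processing, 2009, 23(1): 35.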

Author:	CAI Zhi-jie (才智杰)

Affiliation:	Tibetan Intellectual Information Processing Centre, Qinghai Normal University, Xining, Qinghai 810008

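Abstract:	In Tibetan information processing, any task involving syntax or semantics takes the word as its basic unit: syntactic parsing, sentence understanding, automatic summarization, automatic classification, machine translation, and similar tasks all operate at the word level after segmentation. Tibetan word segmentation is therefore the foundation of Tibetan information processing. By studying abbreviated words (紧缩词) in Tibetan automatic word segmentation, this paper proposes, for the first time, a scheme for identifying them, namely the restoration method, and gives a restoration algorithm. The basic idea is to use the attachment rules of Tibetan abbreviated words to restore the original Tibetan text, so that segmentation can then be carried out. The restoration algorithm has been applied in a State Language Commission project undertaken by the author. In tests on a Tibetan corpus of 850,000 bytes, the identification accuracy for abbreviated words reached 99.83%.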
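Keywords:	computer application; Chinese information processing; abbreviated word; Tibetan word segmentation; restoration method; case-auxiliary word (格助词)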
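
The abstract gives only the idea of the restoration method, not its rules, so the following Python sketch is an illustration under stated assumptions rather than the paper's algorithm. It assumes that the particles འི, འོ, འང, འམ, ར and ས can fuse onto a preceding syllable, that a dropped suffix འ may need to be put back, and that a lexicon lookup decides whether a split is accepted. The names FUSED_MARKERS, LEXICON, restore_syllable and restore_text, and the three-entry lexicon, are hypothetical.

```python
# Illustrative sketch only: the marker list, the toy lexicon, and the
# lookup-based acceptance test are assumptions, not the paper's rules.
TSHEG = "\u0f0b"  # ་  Tibetan syllable delimiter
ACHUNG = "\u0f60"  # འ  suffix that may be dropped when a particle fuses

# Particles assumed to fuse onto a preceding syllable (longer ones first).
FUSED_MARKERS = [
    "\u0f60\u0f72",  # འི
    "\u0f60\u0f7c",  # འོ
    "\u0f60\u0f44",  # འང
    "\u0f60\u0f58",  # འམ
    "\u0f62",        # ར
    "\u0f66",        # ས
]

# Toy lexicon of base syllables (hypothetical data).
LEXICON = {
    "\u0f44",              # ང
    "\u0f41\u0f7c",        # ཁོ
    "\u0f58\u0f41\u0f60",  # མཁའ
}


def restore_syllable(syllable, lexicon=LEXICON):
    """Return [base, particle] if the syllable looks fused, else [syllable]."""
    if syllable in lexicon:
        # A syllable that is itself a lexical entry is left untouched.
        return [syllable]
    for marker in FUSED_MARKERS:
        if syllable.endswith(marker) and len(syllable) > len(marker):
            stem = syllable[: -len(marker)]
            # Try the bare stem, then the stem with the suffix འ restored.
            for base in (stem, stem + ACHUNG):
                if base in lexicon:
                    return [base, marker]
    return [syllable]


def restore_text(text, lexicon=LEXICON):
    """Split text on the tsheg and restore each syllable in turn."""
    units = []
    for syllable in text.split(TSHEG):
        if syllable:
            units.extend(restore_syllable(syllable, lexicon))
    return units
```

Checking the whole syllable against the lexicon first is a deliberate simplification: it keeps a genuine suffix ས or ར from being split off, at the cost of missing fused forms that happen to coincide with a lexical entry. A real system would need a fuller rule set and dictionary to resolve those cases.
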
Identification of Abbreviated Word in Tibetan Word Segmentation

CAI Zhi-jie. Identification of Abbreviated Word in Tibetan Word Segmentation[J]. Journal of Chinese Information Processing, 2009, 23(1): 35.

Authors:	CAI Zhi-jie

Affiliation:	Tibetan Intellectual Information Processing Centre of Qinghai Normal University, Xining, Qinghai 810008, China

Abstract:	In Tibetan information processing, the word is treated as the fundamental unit for parsing, sentence comprehension, automatic summarization, automatic classification, machine translation, and so on. Tibetan word segmentation is therefore essential for Tibetan information processing. Through an analysis of abbreviated words in Tibetan, this article proposes a new method, restoration, for identifying abbreviated words in Tibetan word segmentation. The basic idea of the restoration method is to re-est...

Keywords:	computer application; Chinese information processing; abbreviated word; Tibetan word segmentation; restoration method; case-auxiliary word
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.