Research and Implementation of Kazakh Base Noun Phrase Identification

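Cite as:	SUN Ruina, Gulila·Altenbek. Research and Implementation of Kazakh Base Noun Phrase Identification[J]. Journal of Chinese Information Processing, 2010, 24(6): 114-120.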

Authors:	SUN Ruina, Gulila·Altenbek

Affiliation:	College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China

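Funding:	National Natural Science Foundation of China; research project of the Ministry of Education and the State Language Commission on the standardization and informatization of ethnic minority languages and scripts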

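Abstract:	This work targets the identification of Kazakh base noun phrases and implements an automatic identification system for them. A corpus annotated with base noun phrases is built by combining rule-based automatic identification with manual annotation. On this basis, a method combining statistics and rules is applied: mutual information is used to predict base noun phrase boundaries, the predicted boundaries are then adjusted according to the constitution rules of Kazakh base noun phrases, and annotation markers are added to produce the final identification result. Experiments show that the closed-test identification precision of the two methods is 80.2% and 82.5%, respectively.
Keywords:	corpus; base noun phrase; Kazakh; mutual information; rules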
Research and Implementation of Kazakh Base Noun Phrase Identification

SUN Ruina, Gulila·Altenbek. Research and Implementation of Kazakh Base Noun Phrase Identification[J]. Journal of Chinese Information Processing, 2010, 24(6): 114-120.

Authors:	SUN Ruina, Gulila·Altenbek

Affiliation:	College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, China

Abstract:	An automatic identification system for Kazakh base noun phrases is presented. A corpus of Kazakh base noun phrases is first constructed using rule-based identification together with manual annotation. A combined approach using statistical information and linguistic rules is then presented: base noun phrase boundaries are predicted by mutual information and corrected by base noun phrase constitution rules. Experiments show that combining the rules improves precision from 80.2% to 82.5%.

Keywords:	corpus; base noun phrase; Kazakh; mutual information; rules
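
The two-stage idea in the abstract (mutual information to predict boundaries, then rules to adjust them and add markers) can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' implementation: the abstract does not say which units the mutual information is computed over, what threshold is used, or what the Kazakh constitution rules are. Here pointwise mutual information is computed over adjacent part-of-speech tags, the tag set `NP_TAGS`, the rule `is_base_np`, the threshold `0.0`, and the bracket notation are all hypothetical, and the toy tokens are English placeholders rather than Kazakh data.

```python
import math
from collections import Counter

# Hypothetical tags allowed inside a base noun phrase (not from the paper).
NP_TAGS = {"ADJ", "NUM", "N"}


def train_pmi(tag_sequences):
    """Return a function giving pointwise mutual information of adjacent tags."""
    unigrams, bigrams = Counter(), Counter()
    for tags in tag_sequences:
        unigrams.update(tags)
        bigrams.update(zip(tags, tags[1:]))
    n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())

    def pmi(x, y):
        # Unseen tag pairs get -inf, so they always produce a boundary.
        if bigrams[(x, y)] == 0:
            return float("-inf")
        p_xy = bigrams[(x, y)] / n_bi
        return math.log2(p_xy / ((unigrams[x] / n_uni) * (unigrams[y] / n_uni)))

    return pmi


def predict_chunks(tagged, pmi, threshold=0.0):
    """Stage 1: cut the sentence wherever adjacent-tag PMI falls below threshold."""
    chunks = []
    for i, item in enumerate(tagged):
        if i > 0 and pmi(tagged[i - 1][1], item[1]) >= threshold:
            chunks[-1].append(item)
        else:
            chunks.append([item])
    return chunks


def is_base_np(chunk):
    """Stage 2 (toy rule): only NP_TAGS inside the chunk, and it ends in a noun."""
    tags = [tag for _, tag in chunk]
    return tags[-1] == "N" and all(tag in NP_TAGS for tag in tags)


def annotate(tagged, pmi, threshold=0.0):
    """Bracket the chunks that pass the rule check; leave the rest unmarked."""
    parts = []
    for chunk in predict_chunks(tagged, pmi, threshold):
        words = " ".join(word for word, _ in chunk)
        parts.append(f"[{words}]" if is_base_np(chunk) else words)
    return " ".join(parts)


# Toy training data: tag sequences only.
pmi = train_pmi([["ADJ", "N"], ["ADJ", "N"], ["ADV", "V"]])
sentence = [("big", "ADJ"), ("house", "N"), ("quickly", "ADV"), ("fell", "V")]
print(annotate(sentence, pmi))  # [big house] quickly fell
```

In this sketch the rule stage only accepts or rejects a predicted chunk; a fuller rule set could also shift boundaries, which is closer to the "adjustment" the abstract describes.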
This article is indexed in Wanfang Data and other databases.