
Research on Key Technology for Statistics of Modern Uyghur Language

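Cite this article:	Azragul, Nurahmat, Yusup Abaydula. Research on Key Technology for Statistics of Modern Uyghur Language[J]. Journal of Chinese Information Processing, 2014, 28(5): 192-197.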

Authors:	Azragul, Nurahmat, Yusup Abaydula

Affiliation:	School of Computer Science and Technology, Xinjiang Normal University, Urumqi, Xinjiang 830054

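Funding:	Natural Science Foundation of Xinjiang Uyghur Autonomous Region (2014211A045); Ministry of Education Humanities and Social Sciences General Project (14YJC740001); Xinjiang Uyghur Autonomous Region University Research Program, Young Teachers' Research Start-up Fund (20140706213103147); National Natural Science Foundation of China (61132009); National Natural Science Foundation of China project (61262066); State Language Commission "Twelfth Five-Year Plan" research project (YB125-45).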

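Abstract:	This paper studies the key technologies and methods for building a modern Uyghur corpus, in particular the construction of the corpus itself, and investigates corpus pre-processing, corpus statistics, stemming, and data analysis techniques for modern Uyghur. It develops a candidate list of common modern Uyghur words, giving the words a basic examination in terms of usage frequency and distribution, and takes the "number of word types, frequency count, relative frequency, number of texts, and word length" of Uyghur words as the basis for the candidate list.
Keywords:	modern Uyghur; corpus; candidate list of common words; quantitative analysis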
Research on Key Technology for Statistics of Modern Uyghur Language

Azragul, Nurahmat, Yusup Abaydula. Research on Key Technology for Statistics of Modern Uyghur Language[J]. Journal of Chinese Information Processing, 2014, 28(5): 192-197.

Authors:	Azragul, Nurahmat, Yusup Abaydula

Affiliation:	School of Computer Science and Technology, Xinjiang Normal University, Urumqi, Xinjiang 830054, China

Abstract:	This paper studies key technologies for constructing a modern Uyghur language corpus, in particular corpus collection, together with corpus pre-processing, corpus statistics, stemming, and data analysis for modern Uyghur. To develop a candidate list of common modern Uyghur words, the paper examines words from two aspects, frequency of use and distribution, specifically the number of word types, frequency count, relative frequency, document coverage, and word length.

Keywords:	modern Uyghur language; corpus; common words lexicon; quantitative analysis
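
The five measures named in the abstract (number of word types, frequency count, relative frequency, document coverage, word length) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `candidate_word_stats`, the output field names, and the toy Latin-script tokens are all assumptions made for illustration, and the input is assumed to be already tokenized and stemmed.

```python
from collections import Counter


def candidate_word_stats(documents):
    """Compute per-word statistics over a list of token lists.

    `documents` is assumed to hold one list of (already stemmed) tokens
    per text; this helper and its field names are hypothetical.
    """
    freq = Counter()       # total occurrences of each word
    doc_count = Counter()  # number of texts each word appears in
    for tokens in documents:
        freq.update(tokens)
        doc_count.update(set(tokens))  # count each word once per text

    total = sum(freq.values())
    rows = [
        {
            "word": word,
            "count": count,              # frequency count
            "rate": count / total,       # relative frequency
            "documents": doc_count[word],  # document coverage
            "length": len(word),         # word length in characters
        }
        for word, count in freq.items()
    ]
    # Rank by frequency, then by coverage, then alphabetically for stability.
    rows.sort(key=lambda r: (-r["count"], -r["documents"], r["word"]))
    return {"word_types": len(freq), "total_tokens": total, "rows": rows}


# Toy input: three short "texts" of placeholder tokens (not real corpus data).
docs = [
    ["kitab", "oqu", "kitab"],
    ["mektep", "kitab"],
    ["oqu", "mektep", "bala"],
]
stats = candidate_word_stats(docs)
```

Ranking by frequency first and document coverage second is only one plausible ordering; the abstract states that both frequency and distribution are considered but does not say how they are combined.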
This article is indexed in CNKI and other databases.
