
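Construction and Analysis of Uighur Emotional Corpus
Cite this article: 伊尔夏提·吐尔贡, 吾守尔·斯拉木, 热西旦木·吐尔洪太, 于清. Construction and Analysis of Uighur Emotional Corpus[J]. Computer and Modernization, 2017, 0(4): 67. DOI: 10.3969/j.issn.1006-2475.2017.04.014
Authors: 伊尔夏提·吐尔贡, 吾守尔·斯拉木, 热西旦木·吐尔洪太, 于清
Funding: National Key Basic Research Development Program of China (2014CB340506)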
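Abstract: Existing Uighur sentiment corpora suffer from non-standard annotation schemes, small scale, and the lack of a suitable annotation platform. To address these problems, we analyze the strengths of well-known English and Chinese sentiment corpora and, taking the characteristics of Uighur text into account, establish an annotation specification for Uighur sentiment corpora. We use Python to build a sentiment annotation platform that integrates data collection and annotation, and finally construct a Uighur sentiment corpus that can be applied to public opinion analysis and public opinion monitoring. Experimental results show that the annotation specification is extensible and practical, that the annotation platform effectively reduces annotators' workload and improves the quality of the sentiment corpus, and that the corpus can be used for public opinion analysis tasks.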

Keywords: computer application; natural language processing; sentiment analysis; Uighur; sentiment corpus
Received: 2017-05-08

Construction and Analysis of Uighur Emotional Corpus
TUERGONG, Wushouer SILAMU, Rexidan TUSERHONGTAI, YU Qing. Construction and Analysis of Uighur Emotional Corpus[J]. Computer and Modernization, 2017, 0(4): 67. DOI: 10.3969/j.issn.1006-2475.2017.04.014
Authors:TUERGONG  Wushouer SILAMU  Rexidan TUSERHONGTAI  YU Qing
Abstract: Uighur sentiment corpora currently lack a standardized annotation criterion, are small in scale, and have no suitable tagging system. To address these problems, we built a tagging criterion for Uighur sentiment corpora by analyzing the strengths of well-known English and Chinese sentiment corpora and taking the characteristics of Uighur text into account. We also developed, in Python, a tagging system that collects data from the Internet, and used it to build a Uighur sentiment corpus that can be applied to public opinion analysis. Experimental results show that the tagging criterion is extensible and practical, that the tagging system effectively reduces annotators' workload and improves the quality of the sentiment corpus, and that the corpus can be used for public opinion analysis tasks.
Keywords:computer application  natural language processing  sentiment analysis  Uighur  sentiment corpus  
