
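Automatic annotation methods for Chinese micro-blog corpus with sentiment class (中文微博语料情感类别自动标注方法)
Cite as: YANG Aiming, ZHOU Yongmei, ZHOU Jianfeng. 中文微博语料情感类别自动标注方法[J]. 计算机应用, 2014, 34(8): 2188-2191.
Authors: 阳爱民 (YANG Aiming), 周咏梅 (ZHOU Yongmei), 周剑峰 (ZHOU Jianfeng)
Affiliations: 1. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006; 2. Library, Guangdong University of Foreign Studies, Guangzhou 510006
Funding: National Social Science Fund of China; Program for New Century Excellent Talents in University, Ministry of Education
Abstract: To address the difficulty of manually annotating large-scale micro-blog corpora, methods for automatically annotating the sentiment class of Chinese micro-blog text are proposed: three automatic annotation methods (keyword-based, probability-summation-based, and probability-product-based) and one integrated annotation method. During automatic annotation, the corpus is first annotated by each of the three methods separately, yielding three sets of results; then, under the integration strategy, the final annotation is decided by voting over the three results. Experiments run on a purpose-built automatic annotation system verify the feasibility and effectiveness of the proposed methods. The results show that each single annotation method achieves an accuracy above 70%, and the voting method achieves an accuracy above 90%.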

Received: 2014-04-29
Revised: 2014-05-09

Automatic annotation methods for Chinese micro-blog corpus with sentiment class
YANG Aiming, ZHOU Yongmei, ZHOU Jianfeng. Automatic annotation methods for Chinese micro-blog corpus with sentiment class[J]. Journal of Computer Applications, 2014, 34(8): 2188-2191.
Authors: YANG Aiming, ZHOU Yongmei, ZHOU Jianfeng
Affiliation: 1. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou Guangdong 510006, China;
2. Library, Guangdong University of Foreign Studies, Guangzhou Guangdong 510006, China
Abstract: To address the difficulty of manually annotating large-scale micro-blog corpora, three automatic annotation methods and an integrated, voting-based annotation method for Chinese micro-blog corpora are proposed. The three automatic methods are keyword-based annotation, probability-summation-based annotation, and probability-product-based annotation. During automatic annotation, the corpus is first annotated by each of the three methods separately, yielding three sets of results; the final annotation is then determined by voting over those results under the integration strategy. Experiments run on a purpose-built automatic annotation system verify the feasibility and effectiveness of the proposed methods: each single annotation method achieves an accuracy above 70%, and the voting method achieves an accuracy above 90%.
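The abstract names the three annotators and the voting step but gives no formulas, so the following is only a minimal sketch of how such a pipeline might fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the two labels `pos` and `neg`, the toy keyword lexicon `KEYWORDS`, the per-word class probabilities `WORD_PROBS`, the `UNSEEN` fallback probability, the tie-breaking order, and all function names. Input is assumed to be an already word-segmented micro-blog post.

```python
from collections import Counter
from math import prod

# All resources below are hypothetical toy data, not taken from the paper.
LABELS = ("pos", "neg")
KEYWORDS = {"pos": {"开心", "喜欢"}, "neg": {"难过", "讨厌"}}
WORD_PROBS = {  # assumed per-word class probabilities
    "开心": {"pos": 0.9, "neg": 0.1},
    "喜欢": {"pos": 0.8, "neg": 0.2},
    "讨厌": {"pos": 0.1, "neg": 0.9},
}
UNSEEN = 0.5  # assumed neutral probability for words missing from WORD_PROBS


def by_keywords(tokens):
    """Pick the class whose keyword list matches the most tokens."""
    hits = {label: sum(t in KEYWORDS[label] for t in tokens) for label in LABELS}
    return max(LABELS, key=hits.get)  # ties fall to the first label in LABELS


def _probs(tokens, label):
    """Look up each token's probability for one class."""
    return [WORD_PROBS.get(t, {}).get(label, UNSEEN) for t in tokens]


def by_prob_sum(tokens):
    """Pick the class with the largest sum of per-word probabilities."""
    return max(LABELS, key=lambda label: sum(_probs(tokens, label)))


def by_prob_product(tokens):
    """Pick the class with the largest product of per-word probabilities."""
    return max(LABELS, key=lambda label: prod(_probs(tokens, label)))


def vote(labels):
    """Return the strict-majority label, or None when no label has one."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


def annotate(tokens):
    """Run the three annotators separately, then vote on their outputs."""
    return vote([by_keywords(tokens), by_prob_sum(tokens), by_prob_product(tokens)])


print(annotate(["喜欢", "开心", "讨厌"]))  # pos
```

With two classes and three voters a strict majority always exists. The `None` branch in `vote` only matters if more classes are added, in which case a post with three different labels could be set aside for manual review.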