
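Research and Implementation of Chinese Microblog Sentiment Analysis
Cite this article: LI Yong-Gan, ZHOU Xue-Guang, SUN Yan, ZHANG Huan-Guo. Research and Implementation of Chinese Microblog Sentiment Analysis[J]. Journal of Software, 2017, 28(12): 3183-3205.
Authors: LI Yong-Gan  ZHOU Xue-Guang  SUN Yan  ZHANG Huan-Guo
Affiliations: School of Computer Science, Wuhan University, Wuhan 430079; Department of Information Security, Naval University of Engineering, Wuhan 430033; Unit 92941 of the Chinese People's Liberation Army, Huludao, Liaoning 125000; School of Computer Science, Wuhan University, Wuhan 430079
Funding: National Key Basic Research Program of China (973 Program) (2014CB340600); National Natural Science Foundation of China (61332019, 61672531); National Social Science Foundation of China (14GJ003-152)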
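Abstract: Chinese microblogs are characterized by big data, exponential propagation, and cross-media content, so monitoring and processing them manually is unrealistic, and automatic, computer-based sentiment analysis of Chinese microblogs is urgently needed. This research can be divided into three tasks: opinion sentence identification, sentiment polarity classification, and sentiment element extraction. To accomplish these tasks, we developed an evaluation system. By building a multi-level lexicon, formulating word-formation rules, and computing string frequency statistics, we give a new-word recognition method based on rules and statistics; building on the dependency patterns between sentiment words and evaluation objects, we give an opinion sentence identification algorithm based on word features. Using the LDA-Collocation model, which represents text as an ordered stream of words, we derive the inference algorithm by Gibbs sampling and classify the sentiment polarity of Chinese microblogs automatically. To address the low recall of sentiment element extraction in Chinese microblogs, we apply dependency analysis, divide the dependency patterns into two categories (subject-type and object-type), establish dependency patterns between evaluation objects and sentiment words at six priority levels, and extract sentiment elements automatically with an evaluation-object merging algorithm. The experiments comprise two parts. In the first, we took part in the NLPCC2012 public evaluation, where our method ranked second in precision on the opinion sentence identification task and second in both precision and F-measure on the sentiment element extraction task, which verifies the practicality of our algorithms. In the second, building on an analysis of the public evaluation results, we compare how efficiently the various types of algorithms entered in the public evaluation handle Chinese microblog sentiment analysis, and give our conclusions.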

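Keywords: Chinese microblog  sentiment analysis  dependency parsing  sentiment polarity classification  sentiment element extraction  unsupervised topic sentiment model
Received: 2016/5/19
Revised: 2016/7/4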
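
The string frequency statistics mentioned in the abstract as one ingredient of new-word recognition can be illustrated with a minimal sketch. It covers only the statistical part: it counts character n-grams over a set of posts and keeps those that recur and are not already in a known lexicon. The function name `candidate_new_words`, its parameters, and the thresholds are assumptions made for illustration; the multi-level lexicon and word-formation rules of the paper are not described in the abstract and are not reproduced here.

```python
from collections import Counter


def candidate_new_words(posts, lexicon, min_len=2, max_len=4, min_freq=2):
    """Return {string: count} for character n-grams that occur at least
    min_freq times in the posts and are not already in the lexicon.

    Hypothetical helper: a frequency filter only, not the paper's method.
    """
    counts = Counter()
    for post in posts:
        # Count every substring of length min_len..max_len in this post.
        for n in range(min_len, max_len + 1):
            for i in range(len(post) - n + 1):
                counts[post[i:i + n]] += 1
    return {s: c for s, c in counts.items()
            if c >= min_freq and s not in lexicon}


if __name__ == "__main__":
    posts = ["给力啊", "真给力", "给力"]
    print(candidate_new_words(posts, lexicon={"真"}))
```

A real system would presumably pass the surviving candidates through rule-based filters before accepting them as words; that step is left out because the abstract gives no detail on the rules.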

Research and Implementation of Chinese Microblog Sentiment Classification
LI Yong-Gan, ZHOU Xue-Guang, SUN Yan, ZHANG Huan-Guo. Research and Implementation of Chinese Microblog Sentiment Classification[J]. Journal of Software, 2017, 28(12): 3183-3205.
Authors: LI Yong-Gan  ZHOU Xue-Guang  SUN Yan  ZHANG Huan-Guo
Affiliation: School of Computer Science, Wuhan University, Wuhan 430079, China; Department of Information Security, Naval University of Engineering, Wuhan 430033, China; Unit 92941, PLA, Huludao 125000, China; School of Computer Science, Wuhan University, Wuhan 430079, China
Abstract: Motivated by the requirements of Internet content security, this paper studies sentiment analysis of Weibo posts, which comprises three tasks: emotion sentence identification, emotion tendency classification, and emotion expression extraction. Automatic labeling is the bottleneck of emotion tendency classification; to remove it and make computer processing of massive Weibo data practical, an unsupervised topic sentiment model, UTSM, is proposed based on the LDA-Collocation model. A Gibbs sampling procedure is derived for inference, so that emotion tendency can be classified automatically. To address the low recall of emotion expression extraction in Weibo, dependency parsing is used: dependency patterns are divided into subject and object categories, six kinds of dependency patterns between evaluation objects and emotion words are summarized, and a merging algorithm for evaluation objects is proposed so that they can be extracted accurately. The approach was evaluated in a public bakeoff, where it ranked among the best methods in the emotion expression extraction sub-task, indicating that it has innovative and practical value.
Keywords: Chinese microblog  sentiment analysis  dependency parsing  emotion tendency classification  emotion expression extraction  unsupervised topic sentiment model
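
The abstract describes extracting evaluation objects and emotion words through prioritized dependency patterns split into subject and object categories. The sketch below shows one plausible reading of that idea, assuming dependency edges are already available as `(head, relation, dependent)` triples. The two patterns, the relation labels `SBV` and `VOB`, and the function name `extract_pairs` are illustrative assumptions; the paper's six actual patterns and its merging algorithm are not given in the abstract.

```python
# Hypothetical patterns, highest priority first: (category, relation).
# In both, the emotion word is the head and the evaluation object
# is the dependent of the edge.
PATTERNS = [
    ("subject", "SBV"),  # object is the subject of the emotion word
    ("object", "VOB"),   # object is the grammatical object of the emotion word
]


def extract_pairs(edges, emotion_words):
    """Return (evaluation_object, emotion_word, category) triples.

    edges: list of (head, relation, dependent) triples for one sentence.
    Patterns are tried in priority order; the first pattern that matches
    at least one edge supplies the result (an assumed policy).
    """
    for category, relation in PATTERNS:
        matches = [(dep, head, category)
                   for head, rel, dep in edges
                   if rel == relation and head in emotion_words]
        if matches:
            return matches
    return []


if __name__ == "__main__":
    edges = [("clear", "SBV", "screen"), ("like", "VOB", "camera")]
    print(extract_pairs(edges, {"clear", "like"}))
```

Stopping at the first matching priority level trades recall for precision; returning matches from every level would be the opposite choice, and the abstract does not say which the authors use.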