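Micro-blog Emotion Recognition Based on a Multi-Cue Hybrid Lexicon
Cite this article: PAN Ming-hui, NIU Yun. Micro-blog Emotion Recognition Based on a Multi-Cue Hybrid Lexicon[J]. Microcomputer Development, 2014(9): 28-32.
Authors: PAN Ming-hui, NIU Yun
Affiliation: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China
Funding: National Natural Science Foundation of China, Young Scientists Fund (61202132); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20103218120024); University Youth Science and Technology Innovation Fund (NS2012073)
Abstract: Micro-blogs and other social media have become an important platform for expressing emotion, and analyzing the emotional tendency of micro-blog posts has significant commercial value and social relevance. This paper proposes a lexicon-based rule method that identifies six emotions expressed in micro-blog posts: joy, sadness, anger, fear, disgust, and surprise. Because emoticons are an important cue to emotion, an emoticon lexicon is generated with the mutual information method. It is combined with a traditional emotion lexicon, and rules for negated usage are defined, to analyze micro-blog posts. The first manually annotated micro-blog dataset covering all six emotions is built. Experiments show that although the traditional emotion lexicon contains a large vocabulary, its accuracy and coverage on social media text are both low. Applying the emoticon lexicon significantly improves the precision and coverage of micro-blog emotion analysis.

Keywords: micro-blog  emotion analysis  emotion lexicon  emoticons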

Emotion Recognition of Micro-blogs Based on a Hybrid Lexicon
PAN Ming-hui, NIU Yun. Emotion Recognition of Micro-blogs Based on a Hybrid Lexicon[J]. Microcomputer Development, 2014(9): 28-32.
Authors: PAN Ming-hui  NIU Yun
Affiliation: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
Abstract: The proliferation of micro-blogs has created a popular digital platform where people express emotions and share feelings. Analyzing the emotions in micro-blogs is potentially beneficial to companies and to society. In this paper, a lexicon-based approach is proposed to identify six emotions in micro-blog text: joy, sadness, anger, fear, disgust, and surprise. A lexicon of emoticons is built using the mutual information between emoticons and emotions. This emoticon lexicon is combined with a traditional emotion lexicon, and rules are defined to handle negation in emotion expressions. The first corpus of Chinese micro-blogs manually annotated with the six emotions is built as the test set. The experimental results show that the traditional lexicon, despite its large vocabulary, achieves only moderate accuracy and coverage on micro-blog text. Combining the two lexicons greatly improves both accuracy and coverage.
Keywords:micro-blog  emotion analysis  emotion lexicon  emoticons
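
The abstract names two steps: building an emoticon lexicon from the mutual information between emoticons and emotions, and applying it together with a word lexicon under negation rules. The following is a minimal Python sketch of that idea and not the authors' implementation. Several details are assumptions the abstract does not specify: pointwise mutual information as the scoring function, assigning each emoticon to its single highest-scoring emotion, the `NEGATORS` set, the two-token negation window, the rule that a negated emotion word is ignored, and majority voting. All function names and sample data are hypothetical.

```python
import math
from collections import Counter

EMOTIONS = ("joy", "sadness", "anger", "fear", "disgust", "surprise")
# Assumed negation words; the paper's actual rule set is not given in the abstract.
NEGATORS = {"不", "没", "没有", "别"}


def build_emoticon_lexicon(labeled_posts):
    """Map each emoticon to the emotion with the highest positive PMI.

    labeled_posts: iterable of (emoticons_in_post, emotion_label) pairs.
    """
    n = 0
    emoticon_count, emotion_count, joint_count = Counter(), Counter(), Counter()
    for emoticons, emotion in labeled_posts:
        n += 1
        emotion_count[emotion] += 1
        for e in set(emoticons):  # count each emoticon once per post
            emoticon_count[e] += 1
            joint_count[(e, emotion)] += 1

    lexicon = {}
    for e, c in emoticon_count.items():
        best, best_pmi = None, 0.0
        for emotion in EMOTIONS:
            j = joint_count[(e, emotion)]
            if j == 0:
                continue
            # PMI = log2( P(e, emotion) / (P(e) * P(emotion)) )
            pmi = math.log2(j * n / (c * emotion_count[emotion]))
            if pmi > best_pmi:
                best, best_pmi = emotion, pmi
        if best is not None:
            lexicon[e] = best
    return lexicon


def classify(tokens, word_lexicon, emoticon_lexicon, window=2):
    """Vote over lexicon hits; skip emotion words preceded by a negator."""
    votes = Counter()
    for i, tok in enumerate(tokens):
        if tok in emoticon_lexicon:
            votes[emoticon_lexicon[tok]] += 1
        elif tok in word_lexicon:
            negated = any(t in NEGATORS for t in tokens[max(0, i - window):i])
            if not negated:
                votes[word_lexicon[tok]] += 1
    if not votes:
        return None  # no cue found: the post is not covered by the lexicons
    return votes.most_common(1)[0][0]


# Toy data, invented purely for illustration.
posts = [
    (["[哈哈]"], "joy"),
    (["[哈哈]"], "joy"),
    (["[泪]"], "sadness"),
    (["[泪]", "[哈哈]"], "sadness"),
    ([], "anger"),
]
emoticon_lexicon = build_emoticon_lexicon(posts)
word_lexicon = {"开心": "joy"}
label = classify(["今天", "不", "开心", "[泪]"], word_lexicon, emoticon_lexicon)
```

On this toy data, `[哈哈]` is assigned to joy and `[泪]` to sadness. In the example post, the word `开心` is negated by `不` and ignored, so the emoticon alone determines the label `sadness`. Returning `None` when no cue matches mirrors the abstract's notion of coverage: a lexicon can only label posts that contain at least one of its entries.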
This article is indexed in VIP (维普) and other databases.