
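Weakly Supervised Sentiment Analysis Based on Multi-level Linguistic Features
Cite this article: NIU Yun, ZHANG Li, WANG Shihong, WEI Ou. Weakly Supervised Sentiment Analysis Based on Multi-level Linguistic Features[J]. Journal of Chinese Information Processing (中文信息学报), 2015, 29(4): 80-88.
Authors: NIU Yun  ZHANG Li  WANG Shihong  WEI Ou
Affiliation: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China
Funding: National Natural Science Foundation of China (61202132); Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20103218120024); Fundamental Research Funds for the Central Universities (NS2012073)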
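Abstract: This paper proposes a weakly supervised sentiment analysis method based on multi-level linguistic features. A small number of sentiment words first form an initial sentiment lexicon. Guided by these seed words, the method uses linguistic features of the review text at the word, phrase, and sentence levels, combined with context, to mine words and phrases in the target text that potentially carry a sentiment orientation. The lexicon is expanded iteratively through self-training, yielding a domain-specific sentiment lexicon, which is then used to judge the sentiment orientation of the target text. Compared with the results of other methods on the same data, the method achieves the highest F-score with a very small lexicon, and the sentiment words it finds have clear meanings. The method also achieves high precision when applied to different domains, indicating good domain adaptability.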

Keywords: sentiment analysis  multi-level linguistic features  weakly supervised algorithm  sentiment lexicon

Weakly Supervised Sentiment Analysis Based on Multi-level Linguistic Features
NIU Yun, ZHANG Li, WANG Shihong, WEI Ou. Weakly Supervised Sentiment Analysis Based on Multi-level Linguistic Features[J]. Journal of Chinese Information Processing, 2015, 29(4): 80-88.
Authors: NIU Yun  ZHANG Li  WANG Shihong  WEI Ou
Affiliation: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016, China
Abstract: In this paper, a weakly supervised sentiment analysis approach is proposed. A few words are collected to construct an initial sentiment lexicon. These seed words are used to mine potential sentiment-bearing words in the target text. In this process, linguistic features at multiple levels are explored and the role of context is examined. The lexicon is expanded iteratively, and the final version is applied to classify the sentiment of a target document. Compared with the results of previous studies on the same data, this approach achieves the best F-score even though the constructed sentiment lexicon is rather small. The experimental results also show that the approach is robust when applied to texts from different domains.
Keywords:sentiment analysis  linguistic features  weakly-supervised method  sentiment lexicon  
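
The loop the abstract describes (seed lexicon, iterative expansion, lexicon-based classification) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the multi-level linguistic features, so a simple co-occurrence purity heuristic stands in for them here. The function names, the thresholds `min_count` and `min_purity`, and the toy data are all hypothetical.

```python
from collections import Counter

def score(tokens, lexicon):
    """Sum the polarity (+1 or -1) of every lexicon word in a document."""
    return sum(lexicon.get(t, 0) for t in tokens)

def expand_lexicon(docs, seeds, rounds=3, min_count=2, min_purity=0.8):
    """Self-training sketch: label documents with the current lexicon, then
    add unseen words that occur almost only in documents of one polarity.
    ASSUMPTION: this purity heuristic replaces the paper's linguistic features."""
    lexicon = dict(seeds)
    for _ in range(rounds):
        pos, neg = Counter(), Counter()
        for tokens in docs:
            s = score(tokens, lexicon)
            if s == 0:
                continue  # the current lexicon gives no label for this document
            unseen = {t for t in tokens if t not in lexicon}
            (pos if s > 0 else neg).update(unseen)
        added = {}
        for word in set(pos) | set(neg):
            p, n = pos[word], neg[word]
            if p + n < min_count:
                continue  # too rare to trust
            if p / (p + n) >= min_purity:
                added[word] = 1
            elif n / (p + n) >= min_purity:
                added[word] = -1
        if not added:
            break  # lexicon has stopped growing
        lexicon.update(added)
    return lexicon

def classify(tokens, lexicon):
    """Classify a document by the sign of its summed lexicon polarity."""
    s = score(tokens, lexicon)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

# Toy, made-up data purely for illustration.
docs = [
    ["good", "sturdy", "battery"],
    ["good", "sturdy", "screen"],
    ["bad", "flimsy", "battery"],
    ["bad", "flimsy", "screen"],
    ["sturdy", "case"],
    ["flimsy", "case"],
]
lexicon = expand_lexicon(docs, {"good": 1, "bad": -1})
```

On this toy input the seeds "good" and "bad" pull in "sturdy" and "flimsy", while words that appear equally often under both polarities ("battery", "screen", "case") stay out, so the lexicon remains small.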