
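Research on Sentiment Analysis of Light Reviews
Cite this article: ZHANG Lin, QIAN Guan-Qun, FAN Wei-Guo, HUA Kun, ZHANG Li. Research on Sentiment Analysis of Light Reviews [J]. Journal of Software, 2014, 25(12): 2790-2807.
Authors: ZHANG Lin, QIAN Guan-Qun, FAN Wei-Guo, HUA Kun, ZHANG Li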
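Affiliations: 1. School of Computer Science and Engineering, BeiHang University, Beijing 100191, China; Department of Information Technology, Zhejiang University of Finance & Economics, Hangzhou, Zhejiang 310018, China
2. Baidu Inc., Beijing 100085, China
3. Department of Information Systems, Pamplin College of Business, Virginia Technological University, USA
4. Electrical and Computer Engineering Department, Lawrence Technological University, USA
5. School of Computer Science and Engineering, BeiHang University, Beijing 100191, China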
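Abstract: This paper studies user reviews posted from smart mobile devices and refers to them as light reviews. It points out the similarities and differences between light reviews and both earlier Internet reviews and short-text research, and summarizes through experiments the distinctive characteristics of light reviews: few characters, a wide span of lengths, a very large number of short reviews, and a power-law relationship between review length and review count. A series of experiments on sentiment classification of light reviews finds that: (1) classification performance declines as review length increases; (2) neither traditional feature selection methods nor traditional feature weighting methods perform well enough on light reviews; (3) polar words make up a higher proportion of short reviews than of long reviews; (4) long and short reviews overlap considerably in vocabulary. On this basis, the paper proposes a feature selection method based on feature co-occurrence in short reviews. The method combines the advantageous information in short reviews with traditional feature selection methods, filtering out useless noise while supplementing effective features that benefit classification. Experimental results show that the method effectively improves classification of the longer reviews among light reviews.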

Keywords: sentiment analysis; user reviews; short text; opinion mining
Received: 2014-05-05
Revised: 2014-08-21
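
The abstract reports that review length and review count follow a power-law distribution. One rough way to check a claim like this on a review corpus is a straight-line fit in log-log space. The sketch below is only an illustration under stated assumptions: the function names `length_histogram` and `fit_power_law` are hypothetical, length is measured in characters by default, and the abstract does not say how the authors performed their fit.

```python
import math
from collections import Counter


def length_histogram(reviews, length_of=len):
    """Map each review length to the number of reviews with that length.

    length_of defaults to len (character count, an assumption); pass a
    tokenizer-based function to count words instead. Empty reviews are skipped.
    """
    return Counter(length_of(r) for r in reviews if r)


def fit_power_law(hist):
    """Fit count = c * length ** (-alpha) by least squares in log-log space.

    Returns (alpha, c). Needs at least two distinct positive lengths.
    """
    points = [(math.log(length), math.log(count))
              for length, count in hist.items()
              if length > 0 and count > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two (length, count) points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A power law has a negative slope in log-log space, so alpha = -slope.
    return -slope, math.exp(intercept)
```

Calling `fit_power_law(length_histogram(reviews))` yields an exponent estimate. A log-log regression of this kind is a quick diagnostic rather than a rigorous statistical test of power-law behaviour.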

Sentiment Analysis Based on Light Reviews
ZHANG Lin, QIAN Guan-Qun, FAN Wei-Guo, HUA Kun and ZHANG Li. Sentiment Analysis Based on Light Reviews [J]. Journal of Software, 2014, 25(12): 2790-2807.
Authors: ZHANG Lin, QIAN Guan-Qun, FAN Wei-Guo, HUA Kun and ZHANG Li
Affiliation: School of Computer Science and Engineering, BeiHang University, Beijing 100191, China; Department of Information Technology, Zhejiang University of Finance & Economics, Hangzhou 310018, China; Baidu Inc., Beijing 100085, China; Department of Information Systems, Pamplin College of Business, Virginia Technological University, USA; Electrical and Computer Engineering Department, Lawrence Technological University, USA; School of Computer Science and Engineering, BeiHang University, Beijing 100191, China
Abstract: This paper studies the newly emerging user reviews generated from smart mobile devices, referred to here as "light reviews". The similarities and differences between light reviews and both earlier Internet reviews and short-text research are pointed out. Experiments show that light reviews have distinctive characteristics: the texts are short, their lengths span a wide range, short reviews are by far the most numerous, and review length and review count follow a power-law distribution. A series of sentiment classification experiments on light reviews yields several findings: (1) classification accuracy decreases as review length increases; (2) traditional feature selection and feature weighting methods do not perform well enough on light reviews; (3) the proportion of polar words, the most important features in sentiment analysis, is higher in short reviews than in long reviews; (4) short and long reviews share a high proportion of feature terms. Based on these findings, the paper proposes a feature selection method based on feature co-occurrence in short reviews. By combining the information advantages of short reviews with traditional feature selection methods, the method removes noise while supplementing features that are useful for classification. Experimental results show that the method effectively improves classification of the longer reviews among light reviews.
Keywords: sentiment analysis; user review; short text; opinion mining
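
The abstract describes the proposed method only at a high level: a traditional feature selector is combined with feature co-occurrence information from short reviews. The sketch below is one possible reading of that idea and is not the paper's algorithm. It assumes binary labels, uses chi-square as the traditional selector, and adds any term that co-occurs with an already selected term in at least `min_cooc` short reviews. All function names and the parameters `top_k`, `max_len` and `min_cooc` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def chi_square_scores(docs, labels):
    """Chi-square score of each term against a binary label (1 = positive).

    docs is a list of token lists; labels is a list of 0/1 values.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos_df if y == 1 else neg_df).update(set(tokens))
    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]   # docs containing the term
        c, d = n_pos - a, n_neg - b         # docs without the term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def short_review_cooccurrence(docs, max_len):
    """Count, per unordered term pair, the short reviews containing both."""
    pairs = Counter()
    for tokens in docs:
        if len(tokens) <= max_len:
            pairs.update(combinations(sorted(set(tokens)), 2))
    return pairs


def select_features(docs, labels, top_k, max_len, min_cooc):
    """Return (base, added): chi-square top_k terms plus co-occurring terms."""
    scores = chi_square_scores(docs, labels)
    # Ties are broken alphabetically so the result is deterministic.
    base = set(sorted(scores, key=lambda t: (-scores[t], t))[:top_k])
    added = set()
    for (t1, t2), count in short_review_cooccurrence(docs, max_len).items():
        if count < min_cooc:
            continue
        if t1 in base and t2 not in base:
            added.add(t2)
        elif t2 in base and t1 not in base:
            added.add(t1)
    return base, added
```

The final feature set in this reading is `base | added`. Raising `min_cooc` limits how much noise the supplementing step lets back in, while `max_len` controls which reviews count as short.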
This article is indexed in CNKI, Wanfang Data, and other databases.