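Multi-aspect Topic Sentiment Analysis of Chinese Blogs
Cite this article: Fu Xianghua, Liu Guo, Guo Yanyan, Guo Wubiao. Multi-aspect Topic Sentiment Analysis of Chinese Blogs [J]. Journal of Chinese Information Processing, 2013, 27(1): 47-56.
Authors: Fu Xianghua, Liu Guo, Guo Yanyan, Guo Wubiao
Affiliation: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
Funding: National Natural Science Foundation of China; Natural Science Foundation of Guangdong Province; Shenzhen Science and Technology Program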
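Abstract: Blogs are an important vehicle for individuals to express opinions and sentiment on the Web; they typically cover a broad range of topics and carry rich public-opinion information. Most existing sentiment analysis research on user-generated content about social events works at document-level granularity, which does not yet meet the needs of in-depth sentiment analysis of blog text. This paper proposes a multi-aspect topic sentiment analysis method for Chinese blogs based on the LDA topic model and the HowNet lexicon. The method first trains an LDA topic model on a corpus; then, taking a sliding window as the basic processing unit, it uses the trained LDA model to identify and segment the topics of a blog text; on this basis, it computes the sentiment orientation of each segmented topic paragraph using the HowNet lexicon. The method helps to identify both the multiple sub-topics a blog text covers and the sentiment orientation on each sub-topic. Experimental results show that the method not only achieves good topic segmentation but also helps to improve the accuracy of sentiment analysis.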

Keywords: multi-aspect sentiment analysis; blog sentiment analysis; LDA model; HowNet lexicon

Multi-aspect Topic Sentiment Analysis of Chinese Blog
FU Xianghua, LIU Guo, GUO Yanyan, GUO Wubiao. Multi-aspect Topic Sentiment Analysis of Chinese Blog[J]. Journal of Chinese Information Processing, 2013, 27(1): 47-56.
Authors: FU Xianghua, LIU Guo, GUO Yanyan, GUO Wubiao
Affiliation: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
Abstract: Weblogs are an important medium for people to express personal opinions and sentiment; they generally involve a broad range of topics and carry implicit public-opinion information. Existing sentiment analysis research on such user-generated content mostly works at the document level rather than at finer granularities. This paper proposes a method based on the LDA topic model and the HowNet lexicon to determine the sentiment orientation of blogs with multi-aspect topics. The method first trains the LDA topic model on a corpus. It then identifies and segments topics with the trained model, taking a sliding window as the basic processing unit, so that the topic of each paragraph can be identified. Next, it conducts sentiment analysis on the topic paragraphs with the HowNet lexicon. The method helps to simultaneously identify the multi-aspect topics of a blog and the sentiment orientation on each of them. The experimental results show that this approach not only obtains good topic partitioning results but also helps to improve sentiment analysis accuracy.
Keywords: multi-aspect sentiment analysis; blog sentiment analysis; LDA topic model; HowNet lexicon
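
The three stages named in the abstract (a trained topic model, sliding-window topic identification and segmentation, lexicon-based sentiment scoring per topic segment) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the tiny `TOPIC_WORD` table stands in for a trained LDA model, `POLARITY` stands in for a HowNet-derived polarity lexicon, the window is taken to be a sentence plus its neighbours, the sentiment score is a plain sum of word polarities, and the input is assumed to be already word-segmented.

```python
import math

# Hypothetical stand-in for a trained LDA model: P(word | topic).
# A real system would estimate these distributions from a corpus.
TOPIC_WORD = {
    "food": {"noodles": 0.4, "taste": 0.3, "price": 0.2, "great": 0.05, "bad": 0.05},
    "service": {"waiter": 0.4, "slow": 0.3, "rude": 0.2, "great": 0.05, "bad": 0.05},
}

# Hypothetical stand-in for a HowNet-style polarity lexicon.
POLARITY = {"great": 1, "bad": -1, "slow": -1, "rude": -1}

SMOOTH = 1e-3  # probability given to words that a topic has not seen


def window_topic(words):
    """Return the topic under which the window's words are most likely."""
    def loglik(topic):
        return sum(math.log(TOPIC_WORD[topic].get(w, SMOOTH)) for w in words)
    return max(TOPIC_WORD, key=loglik)


def segment_by_topic(sentences, radius=1):
    """Label each sentence from a window around it, then merge equal-topic runs."""
    labels = []
    for i in range(len(sentences)):
        window = sentences[max(0, i - radius): i + radius + 1]
        labels.append(window_topic([w for s in window for w in s]))
    segments = []
    for sentence, label in zip(sentences, labels):
        if segments and segments[-1][0] == label:
            segments[-1][1].extend(sentence)  # same topic: grow the segment
        else:
            segments.append((label, list(sentence)))  # topic change: new segment
    return segments


def sentiment(words):
    """Sum word polarities; words missing from the lexicon count as neutral."""
    return sum(POLARITY.get(w, 0) for w in words)


def analyze(sentences, radius=1):
    """Return one (topic, sentiment score) pair per topic segment."""
    return [(t, sentiment(ws)) for t, ws in segment_by_topic(sentences, radius)]


SAMPLE = [
    ["noodles", "taste", "great"],
    ["price", "great"],
    ["waiter", "slow"],
    ["waiter", "rude", "bad"],
]
print(analyze(SAMPLE))  # [('food', 2), ('service', -3)]
```

On the sample post, the sketch yields one positive segment about food and one negative segment about service, whereas a single document-level score would blend the two. That per-topic separation is the motivation the abstract gives for working below the document level.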
This article is indexed in Wanfang Data and other databases.