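Application of piecewise convolutional neural networks to text sentiment analysis
Cite this article:DU Chang-shun, HUANG Lei. Application of piecewise convolutional neural networks to text sentiment analysis[J]. Computer Engineering & Science, 2017, 39(1): 173-179
Authors:DU Chang-shun  HUANG Lei
Affiliation:1. School of Economics and Management, Beijing Jiaotong University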
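Abstract:Text sentiment analysis is an important task in fields such as online public opinion analysis, product review analysis, and data mining. With the rapid growth of online data, analysis that relies on hand-designed features or traditional natural language processing parsing tools is both inaccurate and labor-intensive. Traditional convolutional neural network models, meanwhile, ignore sentence structure and overfit easily during training. To address these two shortcomings, a deep-learning-based convolutional neural network model is used to analyze the sentiment polarity of text: a piecewise pooling strategy takes sentence structure into account by extracting the main features of each structural part of the sentence separately, and the Dropout algorithm is introduced to avoid overfitting and improve generalization. Experimental results show that both the piecewise pooling strategy and the Dropout algorithm improve the model's performance. The proposed method reaches 91% classification accuracy on a Chinese hotel review dataset and 45.9% accuracy on the five-class task of the Stanford Sentiment Treebank (English), a significant improvement over the baseline models in both cases.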

Keywords:sentiment analysis  deep learning  convolutional neural network  piecewise pooling  Dropout algorithm
Received:2016-05-06
Revised:2017-01-25

Sentiment analysis with piecewise convolution neural network
DU Chang-shun, HUANG Lei. Sentiment analysis with piecewise convolution neural network[J]. Computer Engineering & Science, 2017, 39(1): 173-179
Authors:DU Chang-shun  HUANG Lei
Affiliation:(School of Economics and Management,Beijing Jiaotong University,Beijing 100044,China)
Abstract:Text sentiment analysis is an important task in fields such as online public opinion analysis, product evaluation, and data mining. As data volumes grow, traditional approaches such as manual feature engineering and NLP parsing tools can no longer handle the task because of their low accuracy and high cost. We therefore use a deep learning model, the convolutional neural network (CNN), to analyze the sentiment of text. The traditional CNN does not consider the structural information of sentences and is prone to overfitting. To address these two problems, we first design a piecewise convolutional neural network (PCNN) that captures structural features: the feature vector of a sentence is divided into several segments, and max pooling is applied to each segment separately. We then introduce the Dropout algorithm to prevent the model from overfitting and to improve its generalization ability. We use two datasets in our experiments: Chinese hotel reviews and the Stanford Sentiment Treebank. Experimental results on both datasets show that piecewise pooling and Dropout each improve performance. The proposed model achieves 91% accuracy on the Chinese dataset and 45.9% on the five-class task of the English dataset, both higher than all of the baseline systems.
Keywords:sentiment analysis  deep learning  piecewise convolution neural network  piecewise pooling  Dropout algorithm  
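
The two techniques named in the abstract, piecewise max pooling and Dropout, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how segment boundaries are chosen, so the sketch assumes equal-length contiguous segments, and it uses the common "inverted" Dropout variant. The function names `piecewise_max_pool` and `dropout` and all parameters are hypothetical.

```python
import numpy as np


def piecewise_max_pool(feature_map, num_segments):
    """Max-pool each contiguous segment of a convolution feature map.

    feature_map: array of shape (num_filters, length).
    Returns an array of shape (num_filters, num_segments).
    Assumption: segments are contiguous and (nearly) equal in length.
    """
    feature_map = np.asarray(feature_map, dtype=float)
    if feature_map.shape[1] < num_segments:
        raise ValueError("feature map is shorter than the number of segments")
    # array_split tolerates lengths that are not a multiple of num_segments.
    segments = np.array_split(feature_map, num_segments, axis=1)
    return np.stack([seg.max(axis=1) for seg in segments], axis=1)


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)


# Two filters applied along a sentence, giving feature maps of length 6.
feature_map = np.array([[1, 5, 2, 0, 3, 4],
                        [7, 1, 1, 2, 9, 0]])

pooled = piecewise_max_pool(feature_map, num_segments=3)
print(pooled)  # one maximum per segment: [[5. 2. 4.] [7. 2. 9.]]

rng = np.random.default_rng(0)
print(dropout(pooled, rate=0.5, rng=rng))  # roughly half the units zeroed
```

With `num_segments=1` the function reduces to ordinary max pooling over the whole sentence, which keeps only one value per filter; splitting into segments preserves a coarse record of where in the sentence each feature fired.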