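Multi-label Emotion Classification for Microblog Based on CNN Feature Space
Citation: SUN Songtao, HE Yanxiang. Multi-label Emotion Classification for Microblog Based on CNN Feature Space[J]. Journal of Sichuan University (Engineering Science Edition), 2017, 49(3): 162-169
Authors: SUN Songtao, HE Yanxiang
Affiliations: School of Computer, Wuhan University; School of Computer, Wuhan University
Funding: National Natural Science Foundation of China (Grant No. 61303115), "Research on Short-Text Topic Detection and Tracking for Microblog Platforms"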
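Abstract: For the multi-label classification problem posed by microblog emotion evaluation tasks, traditional text feature representations based on the vector space model struggle to provide effective semantic features. Word vector representations reflect the syntactic and semantic relations between words, and sentence feature representations can be built from them according to the principle of semantic compositionality. This paper proposes a multi-label emotion classification system for microblog sentences. It uses a convolutional neural network (CNN) model that has undergone supervised emotion classification training to compose word vectors into vector representations of microblog sentences, so that sentence vectors in this CNN feature space are well discriminated by emotion semantics. On the public microblog emotion evaluation dataset of the 2013 NLPCC (Natural Language Processing and Chinese Computing) conference, the system's best classification performance improves on the best evaluation result by 19.16% on the loose metric and 17.75% on the strict metric; compared with the best classification performance in the known literature, it improves by 3.66% and 2.89%, respectively.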

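Keywords: emotion classification   multi-label classification   word vector representation   convolutional neural network   semantic compositionality
Received: 2016-08-07
Revised: 2017-01-02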

Multi-label Emotion Classification for Microblog Based on CNN Feature Space
SUN Songtao and HE Yanxiang. Multi-label Emotion Classification for Microblog Based on CNN Feature Space[J]. Journal of Sichuan University (Engineering Science Edition), 2017, 49(3): 162-169
Authors:SUN Songtao and HE Yanxiang
Affiliation:1. School of Computer, Wuhan Univ., Wuhan 430072, China;2. State Key Lab. of Software Eng., Wuhan Univ., Wuhan 430072, China
Abstract: Microblog emotion evaluation is a multi-label classification problem for which traditional text representations, usually based on the vector space model, fail to provide effective semantic features. Word embeddings capture the syntactic and semantic relations between words and, following the principle of semantic compositionality, can be composed into sentence representations. This paper proposes a multi-label emotion classification system for microblog sentences. It uses a convolutional neural network (CNN) that has been trained as a supervised emotion classifier to compose word embeddings into sentence vectors, so that sentence vectors in the resulting CNN feature space are well separated by emotion semantics. On the public dataset of the microblog emotion evaluation task at the 2013 NLPCC (Natural Language Processing and Chinese Computing) conference, the best configuration of the proposed system improves on the best evaluation result by 19.16% on the loose metric and 17.75% on the strict metric. It also outperforms the best previously reported results by 3.66% and 2.89% on the two metrics, respectively.
Keywords:emotion classification   multi-label classification   word embedding   convolution neural network   semantic compositionality
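
The abstract describes composing word embeddings into a sentence vector with a CNN but does not give the network's architecture. The sketch below only illustrates the general idea, assuming a single convolution layer with ReLU followed by max-over-time pooling, a common design for sentence CNNs. The function name `conv_max_pool`, the toy 2-dimensional embeddings, and the filter values are all hypothetical; in the described system the filters would be learned through supervised emotion classification rather than written by hand.

```python
from typing import List

Vector = List[float]


def conv_max_pool(word_vectors: List[Vector],
                  filters: List[List[Vector]],
                  biases: Vector) -> Vector:
    """Compose word vectors into one fixed-length sentence vector.

    Each filter spans `width` consecutive words. It is slid over the
    sentence, passed through ReLU, and max-pooled over all positions,
    so the output has one entry per filter whatever the sentence length.
    """
    sentence_vector = []
    for filt, bias in zip(filters, biases):
        width = len(filt)
        if len(word_vectors) < width:
            raise ValueError("sentence shorter than filter width")
        best = 0.0  # ReLU output is never below zero
        for start in range(len(word_vectors) - width + 1):
            window = word_vectors[start:start + width]
            # Dot product between the filter and the window of words.
            activation = bias + sum(
                w * x
                for filt_row, word in zip(filt, window)
                for w, x in zip(filt_row, word)
            )
            best = max(best, activation)
        sentence_vector.append(best)
    return sentence_vector


# Toy input (assumed values): three words with 2-dimensional embeddings.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Two hand-written filters: one of width 2, one of width 1.
toy_filters = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, -1.0]],
]
toy_biases = [0.0, 0.5]

features = conv_max_pool(sentence, toy_filters, toy_biases)
print(features)  # one value per filter
```

Because pooling takes the maximum over all window positions, sentences of different lengths map to vectors of the same dimension (one entry per filter), which is what lets a downstream multi-label classifier operate in a single feature space.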