
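基于语义相似度的情感特征向量提取方法 (Extraction Method of Sentimental Feature Vector Based on Semantic Similarity)
Cite this article: 林江豪, 周咏梅, 阳爱民, 陈锦. 基于语义相似度的情感特征向量提取方法[J]. 计算机科学, 2017, 44(10): 296-301.
Authors: 林江豪, 周咏梅, 阳爱民, 陈锦
Affiliations (listed in author order):
林江豪: Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006
周咏梅: Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006; Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006
阳爱民: Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006; Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006
陈锦: Laboratory for Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510006; International College, Guangdong University of Foreign Studies, Guangzhou 510420
Funding: This work was supported by the National Social Science Fund of China (Grant No. 12BYY045).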
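Abstract (translated from the Chinese): To address the shortcomings of existing sentiment features in semantic expression and domain extensibility, this paper proposes a method for extracting sentimental feature vectors based on semantic similarity. A Word2vec model is trained on 250,000 Sogou news articles and 500,000 microblog posts. Eighty sentiment words with clear sentiment, rich content, and diverse parts of speech are selected as the seed word set. By computing the semantic similarity between the word vectors of candidate sentiment words and those of the seed words, sentiment words are mapped into a high-dimensional vector space, which yields a feature vector representation of sentiment words (Senti2vec). Senti2vec is applied to similarity analysis of sentiment synonyms and antonyms, polarity classification of sentiment words, and text sentiment analysis; the experimental results show that Senti2vec provides both a semantic and a sentiment representation of sentiment words. Because the semantic similarity is computed from large-scale corpora, the extracted sentiment features extend more readily to other domains.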

Keywords (translated from the Chinese): sentimental feature vector; semantic similarity; sentiment word; Word2vec
Received: 2016/10/3
Revised: 2016/12/22

Extraction Method of Sentimental Feature Vector Based on Semantic Similarity
LIN Jiang-hao,ZHOU Yong-mei,YANG Ai-min and CHENG Jin.Extraction Method of Sentimental Feature Vector Based on Semantic Similarity[J].Computer Science,2017,44(10):296-301.
Authors:LIN Jiang-hao  ZHOU Yong-mei  YANG Ai-min and CHENG Jin
Affiliation:Laboratory for Language Engineering and Computing,Guangdong University of Foreign Studies,Guangzhou 510006,China,Laboratory for Language Engineering and Computing,Guangdong University of Foreign Studies,Guangzhou 510006,China;Cisco School of Informatics,Guangdong University of Foreign Studies,Guangzhou 510006,China,Laboratory for Language Engineering and Computing,Guangdong University of Foreign Studies,Guangzhou 510006,China;Cisco School of Informatics,Guangdong University of Foreign Studies,Guangzhou 510006,China and Laboratory for Language Engineering and Computing,Guangdong University of Foreign Studies,Guangzhou 510006,China;International College,Guangdong University of Foreign Studies,Guangzhou 510420,China
Abstract:To address the shortcomings of existing sentiment features in semantic representation and domain extensibility, this paper proposes a method for extracting sentimental feature vectors based on semantic similarity. First, a Word2vec model is trained on 250,000 Sogou news texts and 500,000 microblog texts. Eighty sentiment words with clear sentiment, rich content, and diverse parts of speech are chosen as the seed word set. Then, the semantic similarity between each candidate sentiment word and the seed words is calculated from their word vectors, so that sentiment words are mapped into a high-dimensional vector space and a feature vector representation (Senti2vec) is obtained. Senti2vec is applied to similarity analysis of sentiment synonyms and antonyms, polarity classification of sentiment words, and sentiment analysis of text. The experimental results show that Senti2vec can represent both the meaning and the sentiment of sentiment words. Because Senti2vec relies on semantic similarity computed from large-scale corpora, the method adapts more readily to different domains.
Keywords:Sentimental feature vector  Semantic similarity  Sentiment word  Word2vec
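
The mapping step described in the abstract can be illustrated with a minimal sketch. It assumes that a word's Senti2vec representation is simply the list of its similarities to each seed word, and that the similarity measure is cosine similarity between word vectors; the abstract states neither detail explicitly. The function names, the two-dimensional toy embeddings, and the two-word seed list are hypothetical stand-ins for the trained Word2vec model and the 80 seed words.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def senti_vector(word, embeddings, seed_words):
    """Map a word to one similarity score per seed word.

    The output has len(seed_words) dimensions, so 80 seeds would give
    an 80-dimensional feature vector under this sketch's assumption.
    """
    word_vec = embeddings[word]
    return [cosine(word_vec, embeddings[seed]) for seed in seed_words]


# Hypothetical toy data: real embeddings would come from a Word2vec model
# trained on the news and microblog corpora.
TOY_EMBEDDINGS = {
    "happy": [1.0, 0.0],
    "sad": [-1.0, 0.0],
    "joyful": [0.8, 0.6],
}
TOY_SEEDS = ["happy", "sad"]

feature = senti_vector("joyful", TOY_EMBEDDINGS, TOY_SEEDS)
```

Here `feature` holds one entry per seed, positive for the seed the candidate word points toward and negative for the opposite one. A downstream polarity classifier could then be trained on such vectors, though the abstract does not say which classifier the authors used.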