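Research of Chinese Comments Sentiment Classification Based on Word2vec and SVMperf
Cite this article: ZHANG Dong-wen, YANG Peng-fei, XU Yun-feng. Research of Chinese Comments Sentiment Classification Based on Word2vec and SVMperf[J]. Computer Science, 2016, 43(Z6): 418-421, 447.
Authors: ZHANG Dong-wen, YANG Peng-fei, XU Yun-feng
Affiliation (all authors): School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018
Abstract: A supervised machine learning method is used to classify the sentiment of Chinese product review texts; the method combines two tools, word2vec and SVMperf. First, word2vec trains a word vector for every word in the corpus, and the cosine distances between these vectors are computed to cluster words with similar concepts; through this similar-feature clustering, highly similar domain words are added to the sentiment lexicon. Next, word2vec is used to train high-dimensional word vector representations. Principal component analysis (PCA) then reduces the dimensionality of these vectors to form feature vectors. Finally, two methods are used to extract effective sentiment features, which SVMperf uses for training and prediction, completing the sentiment classification of the text. The experimental results show that the similar-concept clustering method achieves good results on both the lexicon expansion task and the sentiment classification task.

Keywords: sentiment classification  word2vec  SVMperf  semantic features  PCA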
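
The lexicon expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vectors, and the 0.8 threshold are all assumptions made for illustration, and the score used here is cosine similarity (higher means closer), which is what the abstract's "cosine distance" is taken to mean.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def expand_lexicon(seed_words, vectors, threshold=0.8):
    """Add every word whose vector is close enough to some seed word.

    `threshold` is a hypothetical cut-off; the abstract does not state one.
    """
    expanded = set(seed_words)
    for candidate, cand_vec in vectors.items():
        if candidate in expanded:
            continue
        for seed in seed_words:
            if seed in vectors and cosine_similarity(vectors[seed], cand_vec) >= threshold:
                expanded.add(candidate)
                break
    return expanded


# Toy 3-dimensional vectors invented for this example; real word2vec
# vectors would be trained on the review corpus.
toy_vectors = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "battery": [0.0, 0.1, 0.9],
}
expanded = expand_lexicon({"good"}, toy_vectors, threshold=0.8)
print(sorted(expanded))
```

With these toy vectors, "great" lies close to the seed word "good" and is added to the lexicon, while "battery" is not.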

Research of Chinese Comments Sentiment Classification Based on Word2vec and SVMperf
ZHANG Dong-wen, YANG Peng-fei and XU Yun-feng. Research of Chinese Comments Sentiment Classification Based on Word2vec and SVMperf[J]. Computer Science, 2016, 43(Z6): 418-421, 447.
Authors: ZHANG Dong-wen, YANG Peng-fei and XU Yun-feng
Affiliation (all authors): School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018, China
Abstract: In this paper, we use a supervised machine learning method to classify the sentiment of Chinese product reviews. The method combines word2vec and SVMperf. Word2vec first trains a word vector for each word in the corpus. By computing the cosine distance between vectors, words with similar concepts are clustered, and through this similar-feature clustering, highly similar domain vocabulary is added to the sentiment lexicon. Word2vec is then used to train a high-dimensional representation of the word vectors. Principal component analysis (PCA) reduces the dimension of these vectors to form feature vectors. Two different methods are used to extract effective sentiment features, which SVMperf then uses for training and prediction, completing the sentiment classification of the text. The experimental results show that the approach performs well both on the lexicon expansion task, using similar-concept clustering, and on the sentiment classification task.
Keywords:Sentiment classification  Word2vec  SVMperf  Semantic features  PCA
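
As a rough illustration of how review-level features might be handed to SVMperf, the sketch below averages the word vectors of a tokenized review and formats the result as one line of the SVM-light input format that SVMperf reads (`<label> <index>:<value> ...`, with 1-based feature indices). Averaging is an assumption made here; the abstract does not say how word vectors are combined into a feature vector. The function names and toy vectors are likewise hypothetical, and the vectors are assumed to have already been reduced in dimension.

```python
def average_vectors(tokens, vectors):
    """Average the vectors of the known tokens; return None if none is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def to_svmlight_line(label, features):
    """Format one example as '<label> <index>:<value> ...' (1-based indices).

    Zero-valued features are omitted, as the sparse format allows.
    """
    pairs = " ".join(
        f"{index}:{value:.6f}"
        for index, value in enumerate(features, start=1)
        if value != 0.0
    )
    return f"{label:+d} {pairs}".rstrip()


# Toy 2-dimensional vectors invented for this example.
toy_vectors = {"good": [0.5, 0.0], "phone": [0.1, 0.4]}
review_tokens = ["good", "phone", "unknown"]  # a positive review, label +1
line = to_svmlight_line(1, average_vectors(review_tokens, toy_vectors))
print(line)
```

Each labeled review would contribute one such line to the training file, with +1 and -1 marking positive and negative sentiment.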