
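Microblog Stance Detection Based on Word Vector Techniques and Topic-Word Features
Cite this article: ZHENG Hai-Yang, GAO Jun-Bo, QIU Jie, JIAO Feng. Microblog stance detection based on word vector techniques and topic-word features[J]. Computer Systems & Applications, 2018, 27(9): 118-123.
Authors: ZHENG Hai-Yang, GAO Jun-Bo, QIU Jie, JIAO Feng
Affiliation (all four authors): College of Information Engineering, Shanghai Maritime University, Shanghai 201306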
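Abstract: Microblog topics have grown popular with the development of the mobile Internet, and a single trending topic may attract tens of thousands of comments. Stance detection for a microblog topic determines whether a commenter's attitude toward that topic is supportive, opposed, or neutral. This paper obtains the semantic information of each sentence from Word2Vec word vectors trained for every word in the corpus, uses TextRank to build a topic-word set that serves as the stance feature of a topic, and draws on a sentiment lexicon to capture the sentiment information of each sentence. The word vectors that remain after feature selection are then used to train a support vector machine, which makes the predictions and completes the stance detection model. Experiments show that stance features combining topic words and sentiment words achieve good stance detection results.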

Keywords: stance detection; topic-word features; word vectors; stance features
Received: 2018-01-06
Revised: 2018-01-23

Stance Detection in Chinese Microblog Topic Based on Word Embedding Technology and Thematic Words Feature
ZHENG Hai-Yang, GAO Jun-Bo, QIU Jie and JIAO Feng. Stance Detection in Chinese Microblog Topic Based on Word Embedding Technology and Thematic Words Feature[J]. Computer Systems & Applications, 2018, 27(9): 118-123.
Authors: ZHENG Hai-Yang, GAO Jun-Bo, QIU Jie and JIAO Feng
Affiliation (all four authors): College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
Abstract: With the development of the mobile Internet, microblog topics have become popular, and a single hot topic may attract tens of thousands of comments. Stance detection for a microblog topic aims to automatically determine whether the author of a text is in favor of the given target, against it, or neither. First, Word2Vec is used to train a vector for each word in the corpus so that semantic information can be extracted from each sentence. Then, the TextRank keyword extraction method is used to construct a set of thematic words that serves as the stance feature, while a sentiment lexicon is used to extract the sentiment information of each sentence. Finally, the word vectors retained after feature selection are used to train a Support Vector Machine (SVM), which makes the predictions and completes the stance detection model. The experimental results show that a stance feature combining thematic words and sentiment words achieves good stance detection performance.
Keywords: stance detection; thematic words feature; word embedding; stance feature
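
The abstract names TextRank and a sentiment lexicon as the sources of the stance feature but gives no implementation details. The following is a minimal sketch of that idea under stated assumptions, not the authors' code: the function names `textrank_keywords` and `stance_features`, the window size, the damping factor, the toy tokenized corpus, and the tiny positive and negative word sets are all hypothetical. The Word2Vec sentence vectors and the SVM described in the abstract are left out; in a full pipeline the vector built here would presumably be combined with word-vector features before classification.

```python
from collections import defaultdict


def textrank_keywords(sentences, window=2, damping=0.85, iterations=50, top_k=2):
    """Rank words with TextRank on an undirected co-occurrence graph."""
    neighbors = defaultdict(set)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Link each word to the words that follow it within the window.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    neighbors[word].add(other)
                    neighbors[other].add(word)

    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        # Synchronous PageRank-style update: every score uses the previous round.
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Sort by score, breaking ties alphabetically so the result is deterministic.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def stance_features(tokens, topic_words, positive, negative):
    """Build [topic-word flags..., positive count, negative count]."""
    token_set = set(tokens)
    topic_flags = [1.0 if word in token_set else 0.0 for word in topic_words]
    positive_count = sum(1 for token in tokens if token in positive)
    negative_count = sum(1 for token in tokens if token in negative)
    return topic_flags + [float(positive_count), float(negative_count)]


# Hypothetical, already-tokenized comments on a single topic.
corpus = [
    ["ban", "fireworks", "protect", "air"],
    ["fireworks", "tradition", "keep", "fireworks"],
    ["support", "ban", "fireworks", "pollution"],
]
# Hypothetical stand-in for a sentiment lexicon.
positive_words = {"support", "protect"}
negative_words = {"pollution"}

topic_words = textrank_keywords(corpus)
features = stance_features(
    ["support", "ban", "fireworks"], topic_words, positive_words, negative_words
)
print(topic_words, features)
```

Real microblog text would first need word segmentation and stop-word removal, and the feature vector would then go to a classifier such as an SVM, as the abstract describes.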
