Chinese Text Sentiment Feature Analysis Based on Rough Set and Multi Channel Word Vector
Cite this article: CHEN Bo, XIE Jun, MIAO Duoqian, WANG Yuzhu, XU Xinying. Chinese Text Sentiment Feature Analysis Based on Rough Set and Multi Channel Word Vector[J]. Journal of Chinese Information Processing, 2020, 34(8): 94-104.
Authors: CHEN Bo  XIE Jun  MIAO Duoqian  WANG Yuzhu  XU Xinying
Affiliations: 1. School of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China;
2. School of Electrical and Power Engineering, Taiyuan University of Technology, Taiyuan, Shanxi 030024, China;
3. School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
Funding: Shanxi Province Applied Basic Research Program (201801D221190, 201801D121144)
Abstract: Rough set theory is a mathematical tool for handling imprecise, incomplete, and uncertain information, and rough-set attribute reduction can reduce the sentiment word features of a text while preserving its sentiment classification ability. To address the excessively high dimensionality of the sentiment word feature space and the lack of semantic information in sentiment word feature representations, this paper proposes RS-WvGv, a method for representing sentiment word features in Chinese text. A rough-set decision table is used to model the sentiment word features of the whole corpus, and the Johnson attribute reduction algorithm is applied to simplify the decision table, retaining the minimal set of sentiment word feature attributes. All sentiment feature words in this set are then represented by word embeddings, and a logistic regression classifier is used to verify the effectiveness of RS-WvGv. The paper also defines the coverage of a sentiment word feature attribute set, which expresses how well the attribute set covers the corpus. Finally, statistical tests applied to the comparative experiments further confirm the effectiveness of the method.

Keywords: attribute reduction  sentiment feature extraction  word vector  sentiment classification
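
The reduction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary decision table (one row per document, one column per sentiment word, 1 if the word occurs), uses the textbook greedy form of Johnson's algorithm over pairwise discernibility sets, and defines coverage as the fraction of documents containing at least one retained word. The abstract does not give the paper's exact coverage formula, so that definition, the toy vocabulary, and all function names here are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy decision table: columns are sentiment words,
# rows are documents, LABELS is the decision attribute.
VOCAB = ["good", "bad", "great", "poor"]
ROWS = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
LABELS = ["pos", "pos", "neg", "neg", "pos", "neg"]


def discernibility_sets(rows, labels):
    """For every pair of documents with different labels, collect the
    column indices on which the two documents differ."""
    sets = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] == labels[j]:
            continue
        diff = frozenset(
            k for k, (a, b) in enumerate(zip(rows[i], rows[j])) if a != b
        )
        if diff:  # an empty set marks an inconsistent pair; skip it
            sets.append(diff)
    return sets


def johnson_reduct(rows, labels):
    """Greedy Johnson heuristic: repeatedly keep the attribute that occurs
    in the most remaining discernibility sets, then drop those sets."""
    remaining = discernibility_sets(rows, labels)
    reduct = []
    while remaining:
        counts = Counter(attr for s in remaining for attr in s)
        # Most frequent attribute; ties go to the lowest column index.
        best = min(counts, key=lambda attr: (-counts[attr], attr))
        reduct.append(best)
        remaining = [s for s in remaining if best not in s]
    return sorted(reduct)


def coverage(rows, attrs):
    """Assumed coverage measure: share of documents that contain at least
    one of the retained sentiment words."""
    covered = sum(1 for row in rows if any(row[a] for a in attrs))
    return covered / len(rows)


reduct = johnson_reduct(ROWS, LABELS)
kept_words = [VOCAB[a] for a in reduct]
corpus_coverage = coverage(ROWS, reduct)
```

The greedy choice makes the result a heuristic reduct rather than a guaranteed smallest one. In the pipeline the abstract describes, the retained words would next be mapped to word embeddings and passed to a logistic regression classifier; that stage is omitted here because the abstract does not specify the embedding channels.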