
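Semi-supervised Microblog Text Sentiment Classification Based on a Global Feature Graph
Cite this article: Fang Cheng, Li Bei, Han Ping. Semi-supervised Microblog Text Sentiment Classification Based on a Global Feature Graph[J]. Journal of Signal Processing (信号处理), 2021, 37(6): 1066-1074. DOI: 10.16798/j.issn.1003-0530.2021.06.018
Authors: Fang Cheng (方澄), Li Bei (李贝), Han Ping (韩萍)
Affiliation: College of Electronic Information and Automation, Civil Aviation University of China
Funding: Fundamental Research Funds for the Central Universities (3122018C005)
Abstract: The popularity and spread of online social networking have given short texts such as microblog posts forms of literary expression and emotional venting that differ from those of traditional articles, making machine-learning sentiment analysis of short texts increasingly difficult. To address the new linguistic features of microblog short texts, a large amount of microblog data without sentiment labels is crawled and collected to build a microblog short-text corpus; a global relationship graph between words and short texts is constructed from the global corpus, using BERT (Bidirectional E...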

Keywords: microblog text; sentiment analysis; graph convolution; semi-supervised
Received: 2021-03-22

Semi-supervised Microblog Text Sentiment Classification Based on Global Feature Graph
Affiliation:College of Electronic Information and Automation, Civil Aviation University of China
Abstract: As online social networking has become widespread, short texts such as microblog posts have developed forms of literary and emotional expression that differ from those of traditional articles, which makes machine-learning-based sentiment analysis of short texts increasingly difficult. To address these new features of microblog language, we crawl and collect a large amount of microblog data without sentiment labels and build a microblog short-text corpus, from which we construct a global relationship graph between words and short texts. BERT (Bidirectional Encoder Representations from Transformers) document embeddings are used as the feature values of the graph nodes, and graph convolution is used for feature propagation and feature extraction between nodes. We manually annotate a sample of the unlabeled data drawn from the corpus, and propose a semi-supervised machine learning method combined with the global relationship graph to improve the performance of the sentiment classifier. Experiments show that increasing the proportion of unlabeled data lets the method better capture global features and improves the accuracy of sentiment classification. Comparative experiments are carried out on the self-built manually labeled dataset, the COAE2014 dataset, and the NLP&CC2014 dataset. The results show that the method performs well in terms of accuracy and recall.
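
The graph-convolution step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The binary word-document edges, the random vectors standing in for BERT embeddings, the symmetric normalization D^{-1/2}(A + I)D^{-1/2} from the standard GCN formulation, and the helper names `build_text_graph`, `normalized_adjacency` and `gcn_layer` are all assumptions made for illustration, since the abstract does not specify edge weights or layer details.

```python
import numpy as np


def build_text_graph(docs):
    """Build a symmetric adjacency matrix over document nodes and word nodes.

    Assumption: an edge of weight 1 links a document to every word it contains.
    The abstract does not state how edges are actually weighted.
    """
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    word_index = {word: n_docs + i for i, word in enumerate(vocab)}
    n_nodes = n_docs + len(vocab)
    adj = np.zeros((n_nodes, n_nodes))
    for doc_id, words in enumerate(docs):
        for word in words:
            adj[doc_id, word_index[word]] = 1.0
            adj[word_index[word], doc_id] = 1.0
    return adj, vocab


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}: add self-loops, then normalize symmetrically."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ features @ weights, 0.0)


# Three toy "microblog posts", already tokenized (hypothetical data).
docs = [["good", "flight"], ["bad", "flight"], ["good", "service"]]
adj, vocab = build_text_graph(docs)

rng = np.random.default_rng(0)
# Stand-in for BERT embeddings: one 8-dimensional random vector per node.
features = rng.normal(size=(adj.shape[0], 8))
# Layer weights mapping 8 input features to 2 output channels.
weights = rng.normal(size=(8, 2))

hidden = gcn_layer(normalized_adjacency(adj), features, weights)
print(hidden.shape)  # (7, 2): 3 document nodes + 4 word nodes
```

In a semi-supervised setting of this kind, the classification loss would typically be computed only on the labeled document nodes, while unlabeled documents and word nodes still take part in feature propagation through the shared graph.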