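Short Text Emotion Classification Method Combining LDA and Self-Attention
Cite this article: CHEN Huan, HUANG Bo, ZHU Yimin, YU Lei, YU Yuxin. Short Text Emotion Classification Method Combining LDA and Self-Attention[J]. Computer Engineering and Applications, 2020, 56(18): 165-170.
Authors: CHEN Huan  HUANG Bo  ZHU Yimin  YU Lei  YU Yuxin
Affiliations: 1. School of Electrical and Electronic Engineering, Shanghai University of Engineering and Technology, Shanghai 201620, China; 2. Jiangxi Collaborative Innovation Center for Economic Crime Detection and Prevention and Control, Nanchang 330103, China; 3. School of Economics and Finance, Shanghai International Studies University, Shanghai 201620, China
Funding: National Natural Science Foundation of China; Open Fund of the Jiangxi Collaborative Innovation Center for Economic Crime Detection and Prevention and Control
Abstract: In short-text sentiment classification, the brevity of each text makes the data sparse, which lowers classification accuracy. To address this problem, a short-text sentiment classification method based on Latent Dirichlet Allocation (LDA) and Self-Attention is proposed. LDA is used to obtain the topic-word distribution of each review as an extension of that review's information. The extension and the original review text are fed together into a word2vec model for word-vector training, so that review texts on the same topic cluster together in the high-dimensional vector space. Self-Attention then assigns dynamic weights and performs the classification. Experiments on the Tan Songbo hotel review dataset show that the method improves classification performance compared with current mainstream short-text sentiment classification algorithms.

Keywords: topic word  short text  Self-Attention  Latent Dirichlet Allocation (LDA)  word2vec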
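
The topic-word extension step named in the abstract can be illustrated with a minimal sketch. It assumes a fitted LDA model has already produced a per-review topic distribution and a ranked word list per topic; the toy values below are hand-written stand-ins, not results from the paper. The function name `expand_with_topic_words`, the parameters `doc_topic`, `topic_words`, and `top_k`, and the choice to append only the top words of the single dominant topic are all assumptions made for illustration, since the abstract does not specify how the extension is built.

```python
from typing import Dict, List


def expand_with_topic_words(
    tokens: List[str],
    doc_topic: List[float],
    topic_words: Dict[int, List[str]],
    top_k: int = 3,
) -> List[str]:
    """Append the top_k words of the review's dominant topic to its tokens.

    Hypothetical helper: the selection rule (dominant topic, top_k words)
    is an assumption, not the procedure confirmed by the paper.
    """
    # Index of the most probable topic for this review.
    dominant = max(range(len(doc_topic)), key=lambda t: doc_topic[t])
    # Topic words are assumed to be sorted by descending probability.
    extension = topic_words[dominant][:top_k]
    return tokens + extension


# Toy, hand-written values standing in for the output of a fitted LDA model.
review = ["room", "dirty", "noisy"]
doc_topic = [0.15, 0.85]
topic_words = {
    0: ["breakfast", "delicious", "buffet", "coffee"],
    1: ["hygiene", "bed", "bathroom", "soundproofing"],
}

expanded = expand_with_topic_words(review, doc_topic, topic_words, top_k=3)
print(expanded)
```

In the pipeline the abstract describes, the expanded token lists (rather than the raw reviews) would then serve as the training corpus for word2vec.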

Short Text Emotion Classification Method Combining LDA and Self-Attention
CHEN Huan,HUANG Bo,ZHU Yimin,YU Lei,YU Yuxin.Short Text Emotion Classification Method Combining LDA and Self-Attention[J].Computer Engineering and Applications,2020,56(18):165-170.
Authors:CHEN Huan  HUANG Bo  ZHU Yimin  YU Lei  YU Yuxin
Affiliation: 1. School of Electrical and Electronic Engineering, Shanghai University of Engineering and Technology, Shanghai 201620, China; 2. Jiangxi Collaborative Innovation Center for Economic Crime Detection and Prevention and Control, Nanchang 330103, China; 3. School of Economics and Finance, Shanghai International Studies University, Shanghai 201620, China
Abstract: In short-text sentiment (emotion) classification, the short length of each text makes the data sparse, which reduces classification accuracy. To address this problem, this paper proposes a short-text sentiment classification method based on Latent Dirichlet Allocation (LDA) and Self-Attention. LDA is used to obtain the topic-word distribution of each review as an extension of that review. The extension and the original review text are fed together into a word2vec model to train word vectors, so that reviews on the same topic cluster together in the high-dimensional vector space. Self-Attention is then used to assign dynamic weights and to perform the classification. Experiments on the Tan Songbo hotel review dataset show that the proposed method improves classification performance over current mainstream short-text sentiment classification algorithms.
Keywords:topic word  short text  Self-Attention  Latent Dirichlet Allocation(LDA)  word2vec  
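
The "dynamic weight allocation" attributed to Self-Attention can be sketched as scaled dot-product attention over a review's word vectors. The abstract does not give the exact formulation, so this sketch makes simplifying assumptions: queries, keys, and values are all the raw word vectors (no learned projection matrices), the attended vectors are mean-pooled into one sentence vector, and the final classifier is omitted. The names `softmax`, `self_attention`, and `mean_pool` and the toy 2-dimensional vectors are illustrative only.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with Q = K = V = vectors.

    Assumption: identity projections stand in for the learned
    weight matrices a trained model would use.
    """
    d = len(vectors[0])
    attended = []
    for query in vectors:
        # One weight per word, summing to 1 for this query position.
        weights = softmax([dot(query, key) / math.sqrt(d) for key in vectors])
        # Weighted sum of all word vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)]
        )
    return attended


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average the per-word vectors into a single sentence vector."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


# Toy word vectors standing in for word2vec embeddings of a three-word review.
word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(word_vectors)
sentence_vector = mean_pool(attended)
print(sentence_vector)
```

A classifier (for example a softmax layer) would take `sentence_vector` as input. Because the weights are recomputed from each review's own word vectors, they vary from review to review, which is the sense in which the allocation is dynamic.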
This article is indexed in Wanfang Data and other databases.