
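Sentence-level sentiment classification of Khmer based on deep semi-supervised learning
Cite this article: Li Chao, Yan Xin. Sentence-level sentiment classification of Khmer based on deep semi-supervised learning[J]. Application Research of Computers, 2021, 38(11): 3283-3288.
Authors: Li Chao, Yan Xin
Affiliations: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650500; Yunnan Nantian Electronics Information Co., Ltd., Kunming 650040; School of Southeast Asian Languages and Cultures, Yunnan Minzu University, Kunming 650500; Institute of Linguistics, Shanghai Normal University, Shanghai 200234
Funding: National Natural Science Foundation of China (61462055, 61562049)
Abstract: Khmer has little annotated data and few corpora, and progress on Khmer sentence-level sentiment analysis has been slow. To address this, the paper proposes a sentence-level sentiment polarity classification method for Khmer based on a deep semi-supervised CNN (convolutional neural network). The method uses a separate-convolution CNN model that fuses lexicon embeddings, drawing on the small amount of existing Khmer sentiment lexicon resources to improve sentence-level sentiment classification. First, word embeddings and lexicon embeddings are built for each Khmer sentence, and the two embeddings are convolved separately with different convolution kernels, which brings the existing sentiment lexicon information into the CNN model. Max-over-time pooling then yields the maximum output feature of each part, and the two sets of features are concatenated as the input to the fully connected layer. Next, the proposed deep neural network is trained with a semi-supervised learning method, temporal ensembling, on both labeled and unlabeled corpora, which reduces the need for labeled data and further improves classification accuracy. The results show that, with the same amount of manually labeled data, training with temporal ensembling raises accuracy on the Khmer sentence-level sentiment classification task by 3.89% over the supervised method.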

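Keywords: Khmer sentence-level sentiment classification  sentiment lexicon embedding  convolutional neural network  semi-supervised learning  temporal ensembling
Received: 2021/4/9
Revised: 2021/10/13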
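The separate-convolution step described in the abstract can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `conv_max_over_time` and `separate_conv_features`, the ReLU activation, the toy embeddings, and the kernel values are all hypothetical. It shows each embedding being convolved with its own kernels, pooled with max-over-time pooling, and concatenated for the fully connected layer.

```python
def conv_max_over_time(embeddings, kernel, bias=0.0):
    """Slide one kernel over a sentence and keep its largest activation.

    embeddings: n token vectors, each of dimension d
    kernel:     h weight rows (a window of h tokens), each of dimension d
    ReLU is an assumed activation; the abstract does not name one.
    """
    h, n = len(kernel), len(embeddings)
    if n < h:
        raise ValueError("sentence is shorter than the convolution window")
    activations = []
    for start in range(n - h + 1):
        total = bias
        for offset in range(h):
            token = embeddings[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], token))
        activations.append(max(total, 0.0))  # ReLU
    return max(activations)  # max-over-time pooling


def separate_conv_features(word_emb, lex_emb, word_kernels, lex_kernels):
    """Convolve each embedding with its own kernels, then concatenate."""
    word_feats = [conv_max_over_time(word_emb, k) for k in word_kernels]
    lex_feats = [conv_max_over_time(lex_emb, k) for k in lex_kernels]
    return word_feats + lex_feats  # input to the fully connected layer


# Toy sentence of three tokens: 2-d word vectors, 1-d lexicon polarity scores.
word_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lex_emb = [[0.0], [1.0], [-1.0]]
word_kernels = [[[1.0, 1.0], [1.0, 1.0]]]  # one kernel, window of 2 tokens
lex_kernels = [[[1.0]]]                    # one kernel, window of 1 token
features = separate_conv_features(word_emb, lex_emb, word_kernels, lex_kernels)
print(features)  # [3.0, 1.0]
```

Keeping the two kernel sets separate lets the lexicon channel use a different window size and dimensionality from the word channel, which is one plausible reading of "different convolution kernels" in the abstract.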

Sentiment classification of Khmer sentences based on deep semi-supervised
Li Chao and Yan Xin. Sentiment classification of Khmer sentences based on deep semi-supervised[J]. Application Research of Computers, 2021, 38(11): 3283-3288.
Authors: Li Chao and Yan Xin
Affiliation: Kunming University of Science and Technology,
Abstract: To address the limited annotated data, scarce corpora, and slow progress in Khmer sentence-level sentiment analysis, this paper proposes a Khmer sentence-level sentiment classification method based on a deep semi-supervised CNN model. The method applies separate convolutions to word embeddings and lexicon embeddings, using the small amount of existing Khmer sentiment lexicon resources to improve sentence-level sentiment classification performance. First, it constructs word and lexicon embeddings for each Khmer sentence and convolves the two embeddings separately with different convolution kernels, integrating the existing sentiment lexicon information into the CNN model. Max-over-time pooling then yields the maximum output feature of each part, and the two sets of features are concatenated as the input to the fully connected layer. The deep neural network is then trained with the semi-supervised temporal ensembling method on labeled and unlabeled corpora, which reduces the need for annotated data and further improves the accuracy of the model's sentiment classification. The results show that, with the same amount of manually labeled data, training with temporal ensembling improves accuracy by 3.89% over the supervised method on the Khmer sentence-level sentiment classification task.
Keywords: sentence-level sentiment classification in Khmer  lexicon embedding  CNN  semi-supervision  temporal ensembling model
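The temporal ensembling training signal mentioned in the abstract can be sketched as follows. This is a hedged illustration of the general technique, not the paper's training code: the function names, the momentum `alpha = 0.6`, the ramp-up length, and the Gaussian ramp-up shape are assumptions made here for illustration, since the abstract gives no hyperparameters. Each sample's predictions are accumulated across epochs into an ensemble target, and the loss adds a weighted consistency term over all samples to the cross-entropy on the labeled ones.

```python
import math


def update_ensemble(ensemble, predictions, alpha, epoch):
    """Accumulate this epoch's predictions and return bias-corrected targets.

    ensemble, predictions: one probability vector per sample
    epoch: 1-based epoch counter (used for the startup bias correction)
    """
    new_ensemble = [
        [alpha * e + (1 - alpha) * p for e, p in zip(ens_i, pred_i)]
        for ens_i, pred_i in zip(ensemble, predictions)
    ]
    correction = 1 - alpha ** epoch
    targets = [[v / correction for v in ens_i] for ens_i in new_ensemble]
    return new_ensemble, targets


def ramp_up(epoch, ramp_epochs, w_max):
    """Assumed Gaussian ramp-up of the unsupervised weight."""
    t = min(epoch / ramp_epochs, 1.0)
    return w_max * math.exp(-5 * (1 - t) ** 2)


def temporal_ensembling_loss(probs, targets, labels, weight):
    """Cross-entropy on labeled samples plus weighted consistency term.

    labels: class index for a labeled sample, None for an unlabeled one
    """
    labeled = [(p, y) for p, y in zip(probs, labels) if y is not None]
    ce = -sum(math.log(p[y]) for p, y in labeled) / len(labeled) if labeled else 0.0
    n_values = len(probs) * len(probs[0])
    mse = sum(
        (a - b) ** 2 for p, t in zip(probs, targets) for a, b in zip(p, t)
    ) / n_values
    return ce + weight * mse


# One labeled and one unlabeled sample, two classes, first epoch.
probs = [[0.8, 0.2], [0.3, 0.7]]
ensemble = [[0.0, 0.0], [0.0, 0.0]]
ensemble, targets = update_ensemble(ensemble, probs, alpha=0.6, epoch=1)
loss = temporal_ensembling_loss(probs, targets, [0, None], ramp_up(1, 80, 1.0))
```

In a full training loop the targets from the previous epoch would be compared against the current epoch's predictions; the sketch updates and compares within one step only to keep the arithmetic easy to follow.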