
A Multi-domain Text Classification Method Based on Recurrent Convolution Multi-task Learning
Cite this article: Jinbao XIE, Jiahui LI, Shouqiang KANG, Qingyan WANG, Yujing WANG. A Multi-domain Text Classification Method Based on Recurrent Convolution Multi-task Learning[J]. Journal of Electronics & Information Technology, 2021, 43(8): 2395-2403.
Authors: Jinbao XIE  Jiahui LI  Shouqiang KANG  Qingyan WANG  Yujing WANG
Affiliation: 1. School of Robotics, Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China; 2. School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin 150000, China
Funding: Collaborative Intelligent Robot Industry-Education Integration Innovation Application Platform Based on the Industrial Internet (2020CJPT004); The Natural Science Foundation of Heilongjiang Province (LH2019E058); The Open Fund of Hubei Provincial Key Laboratory of Intelligent Robot (HBIR202004); The Fundamental Research Funds for Universities of Heilongjiang Province (LGYC2018JC027)
Abstract: In text classification tasks, texts from different domains often share similar expressions and are therefore correlated, a property that can be exploited to alleviate the shortage of labeled training data. Joint learning with a multi-task learning approach makes use of texts from different domains and improves both the training accuracy and the training speed of the model. This paper proposes a Recurrent Convolution Multi-Task Learning (MTL-RC) model for multi-domain text classification, which models the texts of multiple tasks jointly and draws on the respective strengths of multi-task learning, Recurrent Neural Networks (RNN), and Convolutional Neural Networks (CNN) to capture the correlation among multi-domain texts, model long-term dependencies in text, and extract local text features. Extensive experiments on a multi-domain text classification dataset show that the proposed recurrent convolution multi-task learning model (MTL-LC) achieves an average classification accuracy of 90.1% across domains, 6.5% higher than its single-task counterpart (STL-LC), and 5.4%, 4%, and 2.8% higher than the popular multi-task learning models Fully-Shared Multi-Task Learning (FS-MTL), Adversarial Multi-Task Learning (ASP-MTL), and the Indirect Communication Multi-Task Learning framework (IC-MTL), respectively.

Keywords: Multi-domain text classification  Multi-task learning  Recurrent neural network  Convolutional neural network
Received: 2020-10-09
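The abstract above describes the architecture only at a high level. As a rough, hedged illustration (not the authors' released code), the PyTorch sketch below shows one plausible way to combine a shared bidirectional LSTM (long-term dependencies) with a shared 1-D convolution (local features) and per-domain classification heads; the class name RecurrentConvMTL, all layer sizes, and the framework choice are assumptions.

```python
# Minimal PyTorch sketch (assumed, not the authors' code): a shared BiLSTM
# models long-term dependencies, a shared 1-D convolution extracts local
# features, and each domain (task) has its own classification head.
import torch.nn as nn
import torch.nn.functional as F

class RecurrentConvMTL(nn.Module):
    def __init__(self, vocab_size, num_tasks, num_classes=2,
                 emb_dim=300, hidden=128, num_filters=100, kernel_size=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Shared BiLSTM over the word embeddings (long-term dependencies).
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        # Shared convolution over the BiLSTM outputs (local n-gram features).
        self.conv = nn.Conv1d(2 * hidden, num_filters, kernel_size,
                              padding=kernel_size // 2)
        # One task-specific linear classifier per domain.
        self.heads = nn.ModuleList(
            [nn.Linear(num_filters, num_classes) for _ in range(num_tasks)])

    def forward(self, token_ids, task_id):
        x = self.embedding(token_ids)               # (B, T, emb_dim)
        x, _ = self.lstm(x)                         # (B, T, 2*hidden)
        x = F.relu(self.conv(x.transpose(1, 2)))    # (B, num_filters, T)
        x = x.max(dim=2).values                     # max-over-time pooling
        return self.heads[task_id](x)               # logits for this domain
```

Sharing the whole encoder while keeping only the output layer task-specific is the simplest sharing scheme; the paper's actual parameter-sharing strategy may differ.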

A Multi-domain Text Classification Method Based on Recurrent Convolution Multi-task Learning
Jinbao XIE, Jiahui LI, Shouqiang KANG, Qingyan WANG, Yujing WANG. A Multi-domain Text Classification Method Based on Recurrent Convolution Multi-task Learning[J]. Journal of Electronics & Information Technology, 2021, 43(8): 2395-2403.
Authors:Jinbao XIE  Jiahui LI  Shouqiang KANG  Qingyan WANG  Yujing WANG
Affiliation: 1. School of Robotics, Guangdong Polytechnic of Science and Technology, Zhuhai 519090, China; 2. School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin 150000, China
Abstract: In the text classification task, many texts in different domains are expressed similarly and are therefore correlated, which can be exploited to address the problem of insufficient labeled training data. With a multi-task learning method, texts from different domains can be learned jointly, improving both the training accuracy and the training speed of the model. A Recurrent Convolution Multi-Task Learning (MTL-RC) model for text multi-classification is proposed, which jointly models the texts of multiple tasks and takes advantage of multi-task learning, Recurrent Neural Network (RNN), and Convolutional Neural Network (CNN) models to obtain the correlation between multi-domain texts, capture the long-term dependence of text, and extract local features of text. Rich experiments are carried out on multi-domain text classification datasets. The Recurrent Convolution Multi-Task Learning model (MTL-LC) proposed in this paper achieves an average accuracy of 90.1% for text classification in different domains, which is 6.5% higher than the single-task learning model STL-LC. Compared with the mainstream multi-task learning models Fully-Shared Multi-Task Learning (FS-MTL), Adversarial Multi-Task Learning (ASP-MTL), and Indirect Communication Multi-Task Learning (IC-MTL), the accuracy is 5.4%, 4%, and 2.8% higher, respectively.
Keywords: Multi-domain text classification  Multi-task learning  Recurrent neural network  Convolutional neural network
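To make the joint multi-task training described in the abstract concrete, below is a hedged sketch of a training loop that alternates mini-batches from the different domain datasets so that every update refines the shared encoder plus one task-specific head; the data loaders, optimizer, and sampling scheme are illustrative assumptions rather than details taken from the paper.

```python
# Illustrative joint training loop: each step draws a batch from one domain
# and updates the shared encoder plus that domain's head. Assumes the
# RecurrentConvMTL sketch above and one PyTorch DataLoader per domain.
import random
import torch

def train_epoch(model, loaders, optimizer, device="cpu"):
    model.train()
    criterion = torch.nn.CrossEntropyLoss()
    iters = [iter(dl) for dl in loaders]
    # Visit every batch of every domain once, in random order,
    # so that no single domain dominates the shared parameters.
    schedule = [t for t, dl in enumerate(loaders) for _ in range(len(dl))]
    random.shuffle(schedule)
    for task_id in schedule:
        try:
            tokens, labels = next(iters[task_id])
        except StopIteration:
            continue  # this domain is exhausted for the epoch
        tokens, labels = tokens.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(tokens, task_id)   # shared encoder + this task's head
        loss = criterion(logits, labels)
        loss.backward()                   # gradients reach the shared layers too
        optimizer.step()
```

A typical call would be train_epoch(model, loaders, torch.optim.Adam(model.parameters())); each domain would then be evaluated on its own test split and the per-domain accuracies averaged, which is how a summary figure like the 90.1% above is usually obtained.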