ResLCNN Model for Short Text Classification

Citation: WANG Jun-Li, YANG Ya-Xing, WANG Xiao-Min. ResLCNN model for short text classification[J]. Journal of Software, 2017, 28(S2): 61-69.
Authors: WANG Jun-Li, YANG Ya-Xing, WANG Xiao-Min
Affiliation: Department of Computer Science and Technology, College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
Funding: National High-Tech Research and Development Program of China (863 Program) (2015IM030300); Shanghai Science and Technology Innovation Plan (15DZ1101202); Shanghai Municipal Science and Technology Commission Project (14JC1405800); Fundamental Research Funds for the Central Universities, Tongji University

Abstract: Short text classification is a key task in Internet text data processing. Long short-term memory (LSTM) networks and convolutional neural networks (CNN) are two deep learning models widely applied to short text classification. Deep learning research in computer vision and speech recognition has shown that deeper neural network models represent data features more effectively. Inspired by this, a deep learning model named ResLCNN (residual-LSTM-CNN), built from three LSTM layers and one CNN layer, is proposed for deep text classification. The model combines the strengths of LSTM, which captures long-distance dependency features in sequence data, and CNN, which extracts local sentence features through convolution. Drawing on residual network theory, it also adds an identity mapping between the first LSTM layer and the CNN layer to form a residual layer, alleviating the vanishing gradient problem in deep models. To evaluate the classification ability of the ResLCNN model for deep short text classification, it is compared with LSTM, CNN, and their combination models on several datasets. The results show that, compared with the single-layer LSTM-CNN combination model, the deep ResLCNN model improves accuracy by 1.0%, 0.5%, and 0.47% on the MR, SST-2, and SST-5 datasets, respectively, achieving better classification performance.

Keywords: deep learning model; short text classification; long short-term memory networks; convolutional neural networks; residual network

Received: 2017-06-30

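The architecture described in the abstract can be sketched roughly as follows. This is a minimal illustrative reconstruction, not the paper's implementation: the hidden sizes, kernel width, pooling scheme, and the exact placement of the residual addition (here, the first LSTM layer's output added to the third LSTM layer's output before the convolution) are assumptions made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResLCNN(nn.Module):
    """Sketch of a residual-LSTM-CNN: three LSTM layers, one CNN layer,
    and an identity mapping from the first LSTM layer toward the CNN layer."""

    def __init__(self, vocab_size=10000, embed_dim=128, hidden_dim=128,
                 num_filters=100, kernel_size=3, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Hidden size kept equal across layers so the identity mapping
        # needs no projection (an assumption of this sketch).
        self.lstm1 = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.lstm2 = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.lstm3 = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.conv = nn.Conv1d(hidden_dim, num_filters, kernel_size, padding=1)
        self.fc = nn.Linear(num_filters, num_classes)

    def forward(self, x):                          # x: (batch, seq_len) token ids
        e = self.embed(x)                          # (batch, seq, embed_dim)
        h1, _ = self.lstm1(e)
        h2, _ = self.lstm2(h1)
        h3, _ = self.lstm3(h2)
        # Residual layer: identity mapping carrying the first LSTM layer's
        # output across the deeper LSTM layers to the CNN layer.
        h = h3 + h1
        c = F.relu(self.conv(h.transpose(1, 2)))   # (batch, filters, seq)
        p = F.max_pool1d(c, c.size(2)).squeeze(2)  # max-over-time pooling
        return self.fc(p)                          # (batch, num_classes)

model = ResLCNN()
logits = model(torch.randint(0, 10000, (4, 20)))   # batch of 4 length-20 texts
print(logits.shape)  # torch.Size([4, 2])
```

The skip connection gives gradients a direct path from the loss back to the first LSTM layer, which is the mechanism the abstract credits for easing vanishing gradients in the deeper stack.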