Mean Prototypical Networks for Text Classification (用于文本分类的均值原型网络)
Cite this article: XIAN Yantuan, XIANG Yan, YU Zhengtao, WEN Yonghua, WANG Hongbin, ZHANG Yafei. Mean Prototypical Networks for Text Classification[J]. Journal of Chinese Information Processing, 2020, 34(6): 73-80+88
Authors: XIAN Yantuan  XIANG Yan  YU Zhengtao  WEN Yonghua  WANG Hongbin  ZHANG Yafei
Affiliation: 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China;
2. Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology, Kunming, Yunnan 650500, China
Funding: National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830100); National Natural Science Foundation of China (61732005, 61672271, 61562052, 61762056)
Abstract: Text classification is one of the fundamental tasks of natural language processing. Building on prototypical networks, this paper proposes a mean prototype network, which integrates the prototype vectors from earlier training steps through a moving average over time, and combines it with a recurrent neural network (RNN) to form a new text classification model. The model uses a single-layer RNN to learn a vector representation of each text and the mean prototype network to learn a vector representation of each text category. The distance between a text vector and the prototype vectors is used both to train the model and to predict the text category. Compared with existing neural text classification methods, the model makes effective use of the feature similarity between samples during training and prediction, and it is shallow and has few parameters. The proposed method achieves state-of-the-art classification accuracy on five public benchmark datasets for text classification.
Keywords: text classification    mean prototype network    self-ensemble learning
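
The abstract describes two mechanisms without giving formulas: class prototypes that are averaged over training steps, and classification by the distance between a text vector and each prototype. The sketch below illustrates one plausible reading of those two ideas in plain Python. The function names, the exponential form of the moving average, the decay value of 0.9, and the use of squared Euclidean distance with a softmax are all assumptions made for illustration; the paper itself may define these differently, and the RNN text encoder is omitted (the inputs are taken to be already-encoded text vectors).

```python
import math
from collections import defaultdict


def batch_prototypes(embeddings, labels):
    """Average the text vectors of each class in one batch (assumed prototype definition)."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def update_mean_prototypes(mean_protos, new_protos, decay=0.9):
    """Fold the current batch prototypes into the running averages.

    Assumption: an exponential moving average with a hypothetical decay of 0.9.
    A class seen for the first time is initialised with its batch prototype.
    """
    updated = dict(mean_protos)
    for label, proto in new_protos.items():
        if label not in updated:
            updated[label] = list(proto)
        else:
            updated[label] = [
                decay * m + (1.0 - decay) * p
                for m, p in zip(updated[label], proto)
            ]
    return updated


def squared_distance(a, b):
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def class_probabilities(text_vec, mean_protos):
    """Softmax over negative squared distances to each mean prototype."""
    scores = {
        label: -squared_distance(text_vec, proto)
        for label, proto in mean_protos.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict(text_vec, mean_protos):
    """Return the class whose mean prototype is closest to the text vector."""
    probs = class_probabilities(text_vec, mean_protos)
    return max(probs, key=probs.get)
```

In a training loop under these assumptions, `batch_prototypes` would be called on each batch of encoded texts, `update_mean_prototypes` would merge the result into the running prototypes, and the negative log of the probability assigned to the true class by `class_probabilities` could serve as the training loss.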