
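A text classification method fusing text graph convolution and ensemble learning
Cite this article: Zhou Xuanlang, Qiu Weigen, Zhang Lichen. A text classification method fusing text graph convolution and ensemble learning[J]. Application Research of Computers (计算机应用研究), 2022, 39(9).
Authors: Zhou Xuanlang (周玄郎), Qiu Weigen (邱卫根), Zhang Lichen (张立臣)
Affiliation (all three authors): School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong 510006
Funding: National Natural Science Foundation of China (61873068)
Abstract: To improve text classification accuracy and to address the insufficient use of node features in text graph convolutional networks, this paper proposes a new text classification model that combines the strengths of text graph convolution and Stacking ensemble learning. The model first uses a text graph convolutional network to learn global representations of documents and words, together with the syntactic structure information of documents. An ensemble learner then performs a second round of learning on the features extracted by the text graph convolution, which compensates for the underused node features and improves both single-label classification accuracy and the generalization ability of the overall model. To reduce the time cost of ensemble learning, the k-fold cross-validation mechanism is removed from the ensemble stage; the fusion algorithm links text graph convolution with Stacking ensemble learning. On datasets including R8, R52, MR, Ohsumed, and 20NG, classification performance improves over traditional classification models by more than 1.5%, 2.5%, 11%, 12%, and 7%, respectively, and the method performs well in comparisons with classification algorithms in the same field.

Keywords: text representation; text classification; text graph convolution; ensemble learning; fusion model
Received: 2022-03-01
Revised: 2022-08-18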

Text classification combining text graph convolution and ensemble learning
Zhou Xuanlang, Qiu Weigen, Zhang Lichen. Text classification combining text graph convolution and ensemble learning[J]. Application Research of Computers, 2022, 39(9).
Authors: Zhou Xuanlang, Qiu Weigen, Zhang Lichen
Affiliation: Guangdong University of Technology
Abstract: To improve the accuracy of text classification and to address the insufficient use of node features by text graph convolutional networks, this paper proposes a new text classification model that combines the advantages of text graph convolution and Stacking ensemble learning. The model first learns global representations of documents and words, as well as the syntactic structure information of documents, through a text graph convolutional network. It then applies ensemble learning as a second learning stage on the features extracted by the text graph convolution, which compensates for the underused node features and improves both the accuracy of single-label text classification and the generalization ability of the whole model. To reduce the time cost of ensemble learning, the fusion algorithm removes the k-fold cross-validation mechanism from the ensemble stage, and it links text graph convolution with the Stacking ensemble learning method. On datasets including R8, R52, MR, Ohsumed, and 20NG, classification performance improves by more than 1.5%, 2.5%, 11%, 12%, and 7%, respectively, compared with traditional classification models. The method performs well in comparisons with classification algorithms in the same field.
Keywords:text representation  text classification  Text GCN  ensemble learning  fusion model
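
The abstract describes a second learning stage in which a Stacking ensemble is trained on features extracted by the text graph convolution, with k-fold cross-validation removed to save time. The abstract does not say what replaces the k-fold step, so the following is only a minimal sketch under one assumption: a single hold-out split is used to train the meta-learner. The 2-D toy vectors stand in for Text GCN document embeddings, and the base learners (nearest centroid, 1-nearest-neighbour), the meta-learner, and all function names are hypothetical choices for illustration, not the authors' configuration.

```python
import random
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def centroid_fit(X, y):
    """Nearest-centroid model: map each label to the mean of its vectors."""
    groups = defaultdict(list)
    for x, label in zip(X, y):
        groups[label].append(x)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()}


def centroid_scores(model, x, labels):
    """Base learner 1: negative squared distance to each class centroid."""
    return [-sq_dist(x, model[label]) for label in labels]


def knn_scores(train, x, labels):
    """Base learner 2: one-hot vote for the label of the nearest neighbour."""
    _, nearest = min((sq_dist(x, xi), yi) for xi, yi in train)
    return [1.0 if label == nearest else 0.0 for label in labels]


def fit_stack(X_base, y_base, X_meta, y_meta):
    """Stacking with one hold-out split instead of k-fold cross-validation.

    Base learners are fit once on (X_base, y_base). Their outputs on the
    held-out (X_meta, y_meta) become the training set of the meta-learner.
    """
    labels = sorted(set(y_base))
    cen = centroid_fit(X_base, y_base)
    knn = list(zip(X_base, y_base))

    def level1(x):
        # Concatenate the outputs of both base learners.
        return centroid_scores(cen, x, labels) + knn_scores(knn, x, labels)

    # Meta-learner (assumed): nearest centroid in the level-1 output space.
    meta = centroid_fit([level1(x) for x in X_meta], y_meta)

    def predict(x):
        z = level1(x)
        return min(meta, key=lambda label: sq_dist(z, meta[label]))

    return predict


def make_blobs(rng, n_per_class):
    """Toy stand-in for document embeddings: two well-separated 2-D clusters."""
    X, y = [], []
    for label, center in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            X.append([rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5)])
            y.append(label)
    return X, y


rng = random.Random(0)
X_base, y_base = make_blobs(rng, 30)   # trains the base learners
X_meta, y_meta = make_blobs(rng, 15)   # hold-out split, trains the meta-learner
X_test, y_test = make_blobs(rng, 15)   # final evaluation only

predict = fit_stack(X_base, y_base, X_meta, y_meta)
accuracy = sum(predict(x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"hold-out stacking accuracy: {accuracy:.2f}")
```

With k-fold stacking each base learner is trained k times to produce out-of-fold predictions; in this sketch each is trained once, which is where the time saving would come from, at the cost of setting aside part of the labelled data for the meta-learner.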