
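Multi-label Financial Text Classification Algorithm Based on Graph Deep Learning
Cite this article: JIN Yucheng, WANG Qingqin, GAO Jian, MIAO Zhongchen, LIN Yuefeng, XIANG Yali, XIONG Yun. Multi-label Financial Text Classification Algorithm Based on Graph Deep Learning[J]. Computer Engineering, 2022, 48(4): 16-21.
Authors: JIN Yucheng  WANG Qingqin  GAO Jian  MIAO Zhongchen  LIN Yuefeng  XIANG Yali  XIONG Yun
Affiliations: 1. School of Computer Science and Technology, Fudan University, Shanghai 200438; 2. Shanghai Key Laboratory of Data Science, Shanghai 200438; 3. Shanghai Financial Futures Information Technology Co., Ltd., Shanghai 200120
Funding: National Natural Science Foundation of China (U1636207, U1936213)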
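Abstract: Multi-label classification of financial text enables information retrieval from massive volumes of financial news according to user needs. To further improve label recognition for financial text and to model the correlations among labels in multi-label financial text classification, this paper proposes a multi-label financial text classification algorithm based on graph deep learning. Graph deep learning uses deep networks to learn local and global graph-structure features, and can therefore capture complex relationships between nodes. Modeling label correlations so that knowledge transfers between labels is key to building an algorithm with strong generalization ability. The proposed algorithm incorporates the correlation information among labels: a bidirectional gated recurrent network and a label attention mechanism produce a separate feature representation of the news text for each label, and a graph neural network then learns the complex dependencies among labels. Experimental results on real-world datasets show that explicitly modeling label correlations greatly strengthens the model's generalization ability, with especially marked gains on tail labels. Compared with the CAML, BIGRU-LWAN, and ZACNN algorithms, the proposed algorithm improves macro F1 by up to 3.1% on all labels and by up to 6.9% on tail labels.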

Keywords: multi-label text classification  deep learning  graph neural network  attention network  financial text
Received: 2021-03-22
Revised: 2021-05-18
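
The abstract describes a pipeline of label attention over recurrent token states followed by a graph neural network over labels. The following NumPy sketch illustrates one way such a forward pass could look. It is not the authors' implementation: the random matrix `H` stands in for bidirectional GRU outputs, the co-occurrence matrix `C` is made-up toy data, the single GCN-style layer with symmetric normalization is an assumed choice of graph network, and every function and parameter name is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def label_attention(H, U):
    """H: (T, d) token states, U: (L, d) label queries -> (L, d) label-specific features."""
    alpha = softmax(U @ H.T, axis=1)  # (L, T); each row is a distribution over tokens
    return alpha @ H

def normalized_adjacency(C):
    """Symmetric normalization D^-1/2 (C + I) D^-1/2 of a label co-occurrence matrix."""
    A = C + np.eye(C.shape[0])        # self-loops keep each label's own features
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, V, W):
    # One graph-convolution step: mix features of correlated labels, then ReLU.
    return np.maximum(A_hat @ V @ W, 0.0)

def predict(H, U, C, W, w_out, b_out):
    V = label_attention(H, U)                      # (L, d)
    Z = gcn_layer(normalized_adjacency(C), V, W)   # (L, d)
    logits = (Z * w_out).sum(axis=1) + b_out       # one logit per label
    return 1.0 / (1.0 + np.exp(-logits))           # independent sigmoid per label

# Toy inputs (assumptions): 12 tokens, hidden size 8, 4 labels.
rng = np.random.default_rng(0)
T, d, L = 12, 8, 4
H = rng.normal(size=(T, d))            # stand-in for BiGRU outputs
U = rng.normal(size=(L, d))            # label query vectors
C = np.array([[0, 5, 1, 0],
              [5, 0, 0, 2],
              [1, 0, 0, 0],
              [0, 2, 0, 0]], dtype=float)  # made-up label co-occurrence counts
W = rng.normal(size=(d, d)) * 0.1
w_out = rng.normal(size=(L, d))
b_out = np.zeros(L)

probs = predict(H, U, C, W, w_out, b_out)
print(probs.shape)
```

In this sketch a rarely seen label borrows signal from the labels it co-occurs with through `A_hat`, which is one plausible reading of the knowledge transfer between labels that the abstract credits for the gains on tail labels.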

Multi-label Financial Text Classification Algorithm Based on Graph Deep Learning
JIN Yucheng, WANG Qingqin, GAO Jian, MIAO Zhongchen, LIN Yuefeng, XIANG Yali, XIONG Yun. Multi-label Financial Text Classification Algorithm Based on Graph Deep Learning[J]. Computer Engineering, 2022, 48(4): 16-21.
Authors:JIN Yucheng  WANG Qingqin  GAO Jian  MIAO Zhongchen  LIN Yuefeng  XIANG Yali  XIONG Yun
Affiliation: 1. School of Computer Science and Technology, Fudan University, Shanghai 200438, China; 2. Shanghai Key Laboratory of Data Science, Shanghai 200438, China; 3. Shanghai Financial Futures Information Technology Co., Ltd., Shanghai 200120, China
Abstract: Multi-label financial text classification can retrieve relevant information from massive volumes of financial news according to user needs. To further improve the performance of multi-label financial text classification, this study proposes an algorithm based on graph deep learning that models the correlations between labels. Graph deep learning can describe the complex relationships between nodes by learning local and global graph structure features through deep neural networks. Modeling the correlations between labels enables knowledge transfer between labels, which is key to constructing an algorithm with strong generalization ability. The proposed algorithm therefore uses a graph neural network to learn the complex dependencies between labels, based on statistical information together with feature representations extracted by a bidirectional gated recurrent network and a label attention mechanism. Experimental results on real-world datasets show that modeling label correlations can significantly improve classification performance, especially on tail labels. Compared with the CAML, BIGRU-LWAN, and ZACNN algorithms, the proposed algorithm improves the macro F1 values on all labels and on tail labels by up to 3.1% and 6.9%, respectively.
Keywords:multi-label text classification  deep learning  graph neural network  attention network  financial text  
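
The reported gains are in macro F1 over all labels and over tail labels. As a minimal illustration of that metric, the sketch below computes per-label F1 and averages it over a chosen label subset. The toy predictions, the helper names, and the rule that a "tail" label is one with at most `max_support` positive examples are all assumptions made for this example; the abstract does not state how tail labels were selected.

```python
def f1_per_label(y_true, y_pred):
    """Per-label F1 for multi-label data given as lists of 0/1 rows."""
    num_labels = len(y_true[0])
    scores = []
    for j in range(num_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Convention used here: F1 is 0 when a label has no positives and no predictions.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-label F1 over the given label indices."""
    per_label = f1_per_label(y_true, y_pred)
    return sum(per_label[j] for j in labels) / len(labels)

def tail_labels(y_true, max_support):
    """Labels with at most max_support positives (hypothetical tail definition)."""
    num_labels = len(y_true[0])
    support = [sum(row[j] for row in y_true) for j in range(num_labels)]
    return [j for j in range(num_labels) if support[j] <= max_support]

# Toy data: 4 documents, 3 labels.
y_true = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 0]]
y_pred = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 0, 0]]

all_ids = list(range(3))
tail_ids = tail_labels(y_true, max_support=2)

print("macro F1, all labels:", macro_f1(y_true, y_pred, all_ids))
print("macro F1, tail labels:", macro_f1(y_true, y_pred, tail_ids))
```

Because macro F1 weights every label equally regardless of frequency, improvements on rare labels move it more than they would move a micro-averaged score, which is why it is a natural metric for a tail-label comparison.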