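An Extreme Multi-label Text Classification Strategy via Hierarchical Label Semantic Guidance
Cite this article:WANG Yuan, XU Tao, WANG Shilong, ZHOU Yubo, SHI Yancui. An Extreme Multi-label Text Classification Strategy via Hierarchical Label Semantic Guidance[J]. Journal of Chinese Information Processing, 2021, 35(10): 110-118.
Authors:WANG Yuan  XU Tao  WANG Shilong  ZHOU Yubo  SHI Yancui
Affiliation:1. College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457;
2. Population and Precision Health Care, Ltd, Tianjin 300000
Funding:National Natural Science Foundation of China (61702367, 61976156, 11803022, 61807024); Tianjin Enterprise Science and Technology Commissioner Project (20YDTPJC00560); Scientific Research Program of the Tianjin Municipal Education Commission (2017KJ033, 2017KJ034, 2017KJ035, 2018KJ105, 2018KJ106); Natural Science Foundation of Tianjin (19JCYBJC15300)
Abstract:Extreme multi-label text classification is a challenging and active research topic, characterized by large label sets, complex inter-class relationships, and imbalanced data distributions. Existing models make insufficient use of label semantic information, which limits their performance. To address this, this paper proposes a strategy for improving extreme multi-label text classification models that is guided by hierarchical label semantics: during training and prediction, the model is given weakly supervised semantic guidance derived from hierarchical labels, and this weak supervision is used to constrain the multi-label semantic boundaries that the classification task must match. Experimental results on standard datasets show that the proposed strategy effectively improves the performance of existing models, with particularly notable gains on short-text datasets, where macro-precision improves by up to 21.23%.

Keywords:extreme multi-label text classification  hierarchical labels  weakly supervised semantic guidance
Received:2021-01-21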

An Extreme Multi-label Text Classification Strategy via Hierarchical Label Semantic Guidance
WANG Yuan, XU Tao, WANG Shilong, ZHOU Yubo, SHI Yancui. An Extreme Multi-label Text Classification Strategy via Hierarchical Label Semantic Guidance[J]. Journal of Chinese Information Processing, 2021, 35(10): 110-118.
Authors:WANG Yuan  XU Tao  WANG Shilong  ZHOU Yubo  SHI Yancui
Affiliation:1. College of Artificial Intelligence, Tianjin University of Science & Technology, Tianjin 300457, China; 2. Population and Precision Health Care, Ltd, Tianjin 300000, China
Abstract:Extreme multi-label text classification (XMTC) is a challenging task because of its large label sets, complex inter-class relationships, and imbalanced data distributions. To make better use of label semantic information, this paper proposes a strategy for improving XMTC models through semantic guidance from hierarchical labels. The strategy supplies weakly supervised semantic information to the model during both training and prediction, constraining the boundaries of the multi-label semantics. Experimental results on benchmark datasets show that the strategy effectively improves the performance of existing models, especially on short-text datasets, where macro-precision (macro-P) improves by up to 21.23%.
Keywords:extreme multi-label text classification  hierarchical labels  weakly supervised semantic guidance  