
Dynamic Multi-label Text Classification Algorithm Based on Label Semantic Similarity
Original title: 基于标签语义相似的动态多标签文本分类算法
Cite this article: YAO Jiaqi, XU Zhengguo, YAN Jikun, XIONG Gang, LI Zhixiang. Dynamic Multi-label Text Classification Algorithm Based on Label Semantic Similarity[J]. Computer Engineering and Applications, 2020, 56(19): 94-98.
Authors: YAO Jiaqi (姚佳奇), XU Zhengguo (徐正国), YAN Jikun (燕继坤), XIONG Gang (熊钢), LI Zhixiang (李智翔)
Affiliation: National Key Laboratory of Science and Technology on Blind Signal Processing, Chengdu 610041, China
Abstract: To address dynamic multi-label text classification, in which the label set changes over time, a dynamic multi-label text classification algorithm based on label semantic similarity is proposed. In the training phase, a multi-label text classifier based on a convolutional neural network is first trained with the label set held fixed, and the output of the classifier's penultimate layer is then taken as the feature vector of a text. Because this feature vector is learned under label supervision, it carries label semantic information that a feature vector built only from the character strings of the text content does not. In the test phase, the test document is fed into the classifier from the training phase to obtain its feature vector, and the cosine similarity is then calculated. The similarity is multiplied by a time decay factor as a correction, so that more recent texts receive higher similarity values. Finally, the nearest neighbor algorithm is used for classification. The experimental results show that the proposed algorithm achieves comparatively good performance on dynamic multi-label text classification problems.
Keywords: dynamic multi-label; text classification; neural networks; label semantic similarity
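The test phase described in the abstract (cosine similarity, corrected by a time decay factor, followed by nearest neighbor classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the form of the decay factor, so the exponential `exp(-decay_rate * age)` is an assumption, and the names `time_decay`, `predict_labels`, `decay_rate`, and `memory` are hypothetical. The feature vectors are assumed to have already been extracted from the penultimate layer of the trained classifier; the network itself is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def time_decay(age, decay_rate=0.1):
    """Assumed decay form: exp(-decay_rate * age).

    The abstract only says a time decay factor is applied; the
    exponential shape and the default rate are illustrative choices.
    """
    return math.exp(-decay_rate * max(age, 0.0))


def predict_labels(query_vec, query_time, memory, decay_rate=0.1):
    """Return the label set of the stored text with the highest
    time-corrected similarity, together with that score.

    memory: list of (feature_vector, timestamp, label_set) tuples for
    previously labeled texts (a hypothetical storage layout).
    Note: multiplying by the decay factor only ranks older texts lower
    when the cosine similarities are non-negative.
    """
    best_score, best_labels = -math.inf, frozenset()
    for vec, timestamp, labels in memory:
        similarity = cosine_similarity(query_vec, vec)
        score = similarity * time_decay(query_time - timestamp, decay_rate)
        if score > best_score:
            best_score, best_labels = score, frozenset(labels)
    return best_labels, best_score
```

Because each prediction copies the label set of a stored neighbor, a label that first appears after training can be predicted as soon as a text carrying it is added to `memory`, without retraining the classifier. Setting `decay_rate=0.0` reduces the sketch to plain cosine nearest neighbor search, which is a convenient baseline for checking what the time correction contributes.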
This article is indexed in Wanfang Data (万方数据) and other databases.