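A new knn multi-label classification algorithm based on local positive and negative labeling correlation
Cite this article: Jiang Yun, Xiao Xiao, Hou Jinquan, Chen Li. A new knn multi-label classification algorithm based on local positive and negative labeling correlation[J]. Computer Engineering & Science, 2019, 41(10): 1854-1860.
Authors: Jiang Yun, Xiao Xiao, Hou Jinquan, Chen Li
Affiliation (all four authors): College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu 730070
Funding: National Natural Science Foundation of China (61163036); Natural Science Foundation under the Gansu Province Science and Technology Program (1606RJZA047); 2012 Gansu Province Special Fund for Basic Scientific Research in Universities; Gansu Province University Graduate Supervisor Project (1201-16); Northwest Normal University Third-Phase Knowledge and Innovation Engineering Key Research Personnel Project (nwnu-kjcxgc-03-67)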
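Abstract: In multi-label learning, each sample is represented by a single instance and is associated with multiple class labels. Most existing multi-label learning algorithms exploit label correlations globally, that is, they assume that all samples share the same positive correlations between class labels. In practice, however, different samples share different label correlations, and labels can be not only positively correlated but also mutually exclusive, that is, negatively correlated. To address this, the paper proposes PNLC, a k-nearest-neighbor multi-label classification algorithm based on local positive and negative pairwise label correlations. First, the feature vectors of the multi-label data are preprocessed to construct, for each label, the features that are most discriminative for that label. Then, in the training phase, PNLC builds local positive and negative pairwise label correlation matrices from the true labels of the k nearest neighbors of every training sample. Finally, in the test phase, the k nearest neighbors of each test example and their corresponding positive and negative pairwise label relations are obtained, and these relations are used to compute the maximum a posteriori probability and predict the labels of the test example. Experimental results show that PNLC achieves clearly better classification accuracy on the yeast and image datasets than other commonly used multi-label classification algorithms.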

Keywords: multi-label learning; positive and negative correlation; label-specific features; k-nearest neighbors
Received: 2018-06-13
Revised: 2019-10-25

A new knn multi-label classification algorithm based on local positive and negative labeling correlation
JIANG Yun, XIAO Xiao, HOU Jin Quan, CHEN Li. A new knn multi-label classification algorithm based on local positive and negative labeling correlation[J]. Computer Engineering & Science, 2019, 41(10): 1854-1860.
Authors: JIANG Yun, XIAO Xiao, HOU Jin Quan, CHEN Li
Affiliation: College of Computer Science & Engineering, Northwest Normal University, Lanzhou 730070, China
Abstract: In multi-label learning, each sample is represented by a single instance and is associated with multiple class labels. Most existing multi-label learning algorithms exploit label correlations globally, assuming that the same positive label correlations are shared by all examples. In practical applications, however, different samples share different label correlations, and labels can be not only positively correlated but also mutually exclusive (i.e., negatively correlated). To address this problem, we propose a KNN multi-label classification algorithm based on local positive and negative pairwise label correlations, named PNLC. First, we preprocess the feature vectors of the multi-label data and construct the most discriminative features for each class label. Then, in the training stage, PNLC constructs the local positive and negative pairwise label correlation matrices from the true labels of the k nearest neighbors of every training sample. Finally, in the test stage, the k nearest neighbors of each test example and their corresponding positive and negative pairwise label correlations are identified and used to compute the maximum a posteriori probability for prediction. Experimental results show that PNLC clearly outperforms other well-established multi-label classification algorithms in classification accuracy on the yeast and image datasets.
Keywords: multi-label learning; positive and negative correlation; label-specific features; KNN
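
The training step described in the abstract (building local positive and negative pairwise label correlations from the true labels of each sample's k nearest neighbors) could be sketched roughly as follows. This is only an illustrative sketch, not the authors' PNLC implementation: the abstract does not give the formulas, so the definitions used here are assumptions. The positive correlation of labels a and b is taken as the fraction of neighbors carrying both labels, and the negative correlation as the fraction carrying a but not b. The function names `knn_indices` and `local_pairwise_correlations` are hypothetical, and the label-specific feature construction and the maximum a posteriori prediction step are not shown.

```python
import numpy as np


def knn_indices(X, k):
    """Return, for each row of X, the indices of its k nearest rows
    (Euclidean distance), excluding the row itself."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]


def local_pairwise_correlations(X, Y, k):
    """Assumed definitions (not taken from the paper):
    pos[i, a, b] = fraction of sample i's k neighbors with labels a AND b
    neg[i, a, b] = fraction of sample i's k neighbors with label a but NOT b
    X: (n, d) feature matrix; Y: (n, q) binary label matrix."""
    nbrs = knn_indices(X, k)               # shape (n, k)
    Yn = Y[nbrs].astype(float)             # shape (n, k, q): neighbor labels
    pos = np.einsum("nka,nkb->nab", Yn, Yn) / k
    neg = np.einsum("nka,nkb->nab", Yn, 1.0 - Yn) / k
    return pos, neg


# Toy data: two well-separated clusters of three samples, three labels.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
Y = np.array([[1, 1, 0], [1, 1, 0], [1, 0, 0],
              [0, 0, 1], [1, 0, 1], [0, 0, 1]])

pos, neg = local_pairwise_correlations(X, Y, k=2)
print(pos[0])  # local positive pairwise correlations around sample 0
print(neg[0])  # local negative pairwise correlations around sample 0
```

Because the matrices are computed per sample, two samples in different regions of the feature space can end up with different correlation structure, which is the "local" aspect the abstract contrasts with globally shared correlations.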
This article is indexed in Wanfang Data and other databases.
