Co-training algorithm combining improved density peak clustering and shared subspace

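Cite this article:	LYU Jia, XIAN Yan. Co-training algorithm combining improved density peak clustering and shared subspace[J]. Journal of Computer Applications, 2021, 41(3): 686-693.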

Authors:	LYU Jia, XIAN Yan

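Affiliations:	1. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China; 2. Chongqing Center of Engineering Technology Research on Digital Agriculture Service, Chongqing Normal University, Chongqing 401331, China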

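Funding:	Chongqing Graduate Research and Innovation Project; Major Program of the National Natural Science Foundation of China; Chongqing University Innovation Research Group Project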

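Abstract:	Co-training algorithms suffer from two problems: the unlabeled samples added during iteration carry too little useful information, and inconsistent labels assigned by multiple classifiers cause classification errors to accumulate. To address these problems, a co-training algorithm combining improved density peak clustering and shared subspace is proposed. The algorithm first obtains two base classifiers from complementary attribute sets. It then performs improved density peak clustering based on the siphon balance rule and, starting from the cluster centers, progressively selects unlabeled samples with high mutual neighbor degree for the two base classifiers to label. Finally, the shared subspace obtained by a multi-view non-negative matrix factorization algorithm is used to determine the final class of samples whose labels are inconsistent. By using improved density peak clustering and mutual neighbor degree, the algorithm selects unlabeled samples that better represent the spatial structure, and by using the shared subspace it revises inconsistently labeled samples, which addresses the low classification accuracy caused by sample misclassification. Multiple groups of comparison experiments on 9 UCI datasets demonstrate the effectiveness of the algorithm: compared with the baseline algorithms, the proposed algorithm achieves the highest classification accuracy on 7 datasets and the second highest on the other 2.
Keywords:	co-training; density peak clustering; siphon balance rule; shared subspace; mutual neighbor degree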
Received:	2020-07-24
Revised:	2020-10-06
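
The clustering step named in the abstract builds on density peak clustering. The sketch below shows only the baseline quantities of that method (a local density `rho` and a distance `delta` to the nearest denser point), as a way to see why cluster centers stand out. It is not the authors' improved variant: the siphon balance rule and the mutual neighbor degree are not specified in the abstract and are not reproduced here. The function names, the cutoff parameter `d_c`, and the `rho * delta` ranking are assumptions made for illustration.

```python
import math


def density_peak_scores(points, d_c):
    """Return (rho, delta) per point using baseline density peak definitions.

    rho[i]   = number of other points closer than the cutoff d_c (assumed kernel).
    delta[i] = distance to the nearest point with strictly higher rho; for a
               point with no denser neighbor, the largest distance to any point.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        denser = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(denser) if denser else max(dist[i]))
    return rho, delta


def pick_centers(rho, delta, k):
    """Pick the k indices with the largest rho * delta as candidate centers."""
    order = sorted(range(len(rho)), key=lambda i: rho[i] * delta[i], reverse=True)
    return order[:k]
```

Points that are both dense and far from any denser point score highest, so they are natural starting points from which to expand outward when selecting unlabeled samples.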
Co-training algorithm combining improved density peak clustering and shared subspace

LYU Jia, XIAN Yan. Co-training algorithm combining improved density peak clustering and shared subspace[J]. Journal of Computer Applications, 2021, 41(3): 686-693.

Authors:	LYU Jia, XIAN Yan

Affiliations:	1. College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China; 2. Chongqing Center of Engineering Technology Research on Digital Agriculture Service, Chongqing Normal University, Chongqing 401331, China

Abstract:	In co-training algorithms, the unlabeled samples added during iteration often carry too little useful information, and the labels assigned to the same sample by multiple classifiers may be inconsistent, which leads to the accumulation of classification errors. To solve these problems, a co-training algorithm combining improved density peak clustering and shared subspace was proposed. Firstly, two base classifiers were obtained from complementary attribute sets. Secondly, an improved density peak clustering was performed based on the siphon balance rule and, starting from the cluster centers, unlabeled samples with high mutual neighbor degrees were selected progressively and labeled by the two base classifiers. Finally, the final categories of the samples with inconsistent labels were determined by the shared subspace obtained with the multi-view non-negative matrix factorization algorithm. In the proposed algorithm, unlabeled samples that better represent the spatial structure were selected by the improved density peak clustering and mutual neighbor degree, and samples given different labels by the two classifiers were revised via the shared subspace, which addresses the low classification accuracy caused by sample misclassification. The algorithm was validated by multiple groups of comparison experiments on 9 UCI datasets; the experimental results show that it achieved the highest classification accuracy on 7 datasets and the second highest on the other 2.

Keywords:	co-training; density peak clustering; siphon balance rule; shared subspace; mutual neighbor degree
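
The overall loop described in the abstract (two base classifiers trained on complementary attribute sets, with a separate rule for samples they label differently) can be sketched roughly as follows. Everything concrete here is an assumption made for illustration: the nearest-centroid classifier, the function names, and the `resolve` callback. In the paper, disagreements are settled with a shared subspace obtained by multi-view non-negative matrix factorization, and the candidate samples are chosen by clustering. Neither step is reproduced here, so `resolve` is only a placeholder hook.

```python
import math


def fit_centroids(X, y, cols):
    """Train a nearest-centroid classifier on the attribute subset `cols`."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for k, c in enumerate(cols):
            acc[k] += row[c]
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, row, cols):
    """Return the label whose centroid is closest in the given view."""
    view = [row[c] for c in cols]
    return min(centroids, key=lambda lab: math.dist(view, centroids[lab]))


def nearest_labeled_resolver(X_lab, y_lab):
    """Placeholder tie-breaker (NOT the paper's shared-subspace rule):
    take the label of the nearest labeled sample over all attributes."""
    def resolve(row, label_a, label_b):
        best = min(range(len(X_lab)), key=lambda i: math.dist(row, X_lab[i]))
        return y_lab[best]
    return resolve


def co_training_round(X_lab, y_lab, X_unl, view_a, view_b, resolve):
    """One round: label candidates with both views, settle disagreements
    through `resolve`, and return the enlarged labeled set."""
    clf_a = fit_centroids(X_lab, y_lab, view_a)
    clf_b = fit_centroids(X_lab, y_lab, view_b)
    new_X, new_y = list(X_lab), list(y_lab)
    for row in X_unl:
        label_a = predict(clf_a, row, view_a)
        label_b = predict(clf_b, row, view_b)
        label = label_a if label_a == label_b else resolve(row, label_a, label_b)
        new_X.append(row)
        new_y.append(label)
    return new_X, new_y
```

Keeping the disagreement rule behind a callback makes it possible to swap the placeholder for a subspace-based rule without touching the rest of the loop.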
This article is indexed in Wanfang Data and other databases.