
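An Oversampling Method Based on SMOTE Improved by Hierarchical Clustering
Cite this article: WANG Yuan-fang. An Oversampling Method Based on SMOTE Improved by Hierarchical Clustering [J]. Software, 2020(2): 201-204.
Author: WANG Yuan-fang
Affiliation: 1. College of Computer Science and Engineering, Shandong University of Science and Technology
Abstract: To address the shortcomings of the SMOTE algorithm when synthesizing new minority class samples, this paper proposes H-SMOTE, a SMOTE oversampling method improved with hierarchical clustering. The algorithm first applies hierarchical clustering to the minority class samples, then computes the density of each cluster according to a proposed cluster density distribution function, and finally oversamples within each cluster using an improved SMOTE algorithm, which increases the diversity of the synthetic samples and yields a new balanced data set. Experiments on UCI data sets show that H-SMOTE clearly improves classification performance.

Keywords: oversampling; minority class; hierarchical clustering; SMOTE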

Hybrid Algorithm of Aggregation Hierarchy Clustering and SMOTE for Oversampling
WANG Yuan-fang. Hybrid Algorithm of Aggregation Hierarchy Clustering and SMOTE for Oversampling [J]. Software, 2020(2): 201-204.
Authors: WANG Yuan-fang
Affiliation: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
Abstract: Conventional oversampling algorithms such as SMOTE have several problems, including ignoring within-class imbalance. Taking within-class imbalance into account, this paper proposes H-SMOTE, an oversampling algorithm that combines agglomerative hierarchical clustering with an improved SMOTE. First, hierarchical clustering is used to cluster the minority class samples. Second, the density of each cluster is calculated according to the proposed cluster density distribution function. Finally, the improved SMOTE oversamples along the lines between distant minority class samples within each cluster, which increases the diversity of the synthetic samples and yields a new data set that is balanced both between and within classes. Experiments on UCI data sets show that H-SMOTE effectively improves classifier performance on minority class samples.
Keywords: Oversampling; Minority class; Aggregation hierarchy clustering; SMOTE
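
The three steps named in the abstract (cluster the minority class, compute a density per cluster, oversample inside each cluster) can be illustrated with a minimal Python sketch. It is not the paper's H-SMOTE: the abstract gives neither the cluster density distribution function nor the details of the improved SMOTE, so everything specific here is an assumption. The names `agglomerative_clusters`, `cluster_density` and `h_smote_sketch` are hypothetical, average linkage is an arbitrary choice, the density (cluster size over mean pairwise distance) is a placeholder, the rule that sparser clusters receive more synthetic samples is a guess, and interpolating between a random pair of cluster members stands in for the paper's interpolation between distant samples.

```python
import math
import random


def agglomerative_clusters(points, n_clusters):
    """Naive average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters (j > i, so index i stays valid).
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters.pop(j)
        clusters[i].extend(merged)
    return clusters


def cluster_density(points, members):
    """Placeholder density (an assumption): cluster size over mean pairwise distance."""
    pairs = [(i, j) for k, i in enumerate(members) for j in members[k + 1:]]
    mean_dist = sum(math.dist(points[i], points[j]) for i, j in pairs) / len(pairs)
    return len(members) / (mean_dist + 1e-12)  # small epsilon guards against zero distance


def h_smote_sketch(minority, n_new, n_clusters=2, seed=0):
    """Generate n_new synthetic minority samples, cluster by cluster."""
    rng = random.Random(seed)
    # Singleton clusters are skipped because interpolation needs two samples.
    clusters = [c for c in agglomerative_clusters(minority, n_clusters) if len(c) >= 2]
    if not clusters:
        raise ValueError("need at least one cluster with two or more samples")
    # Assumption: sparser (lower-density) clusters receive more synthetic samples.
    weights = [1.0 / cluster_density(minority, c) for c in clusters]
    synthetic = []
    for _ in range(n_new):
        members = rng.choices(clusters, weights=weights, k=1)[0]
        i, j = rng.sample(members, 2)  # two distinct samples from the same cluster
        gap = rng.random()             # SMOTE-style interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic
```

Because each synthetic sample is interpolated between two members of the same cluster, no new point falls in the gap between clusters, which is the within-class behaviour the abstract attributes to clustering before oversampling. For real data, a library implementation of hierarchical clustering would be a better choice than the quadratic-memory loop above.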
This article is indexed in VIP (维普) and other databases.