Wind turbine fault sampling algorithm based on improved BSMOTE and sequential characteristics
Cite this article:YANG Xian, ZHAO Jisheng, QIANG Baohua, MI Luzhong, PENG Bo, TANG Chenghua, LI Baolian. Wind turbine fault sampling algorithm based on improved BSMOTE and sequential characteristics[J]. Journal of Computer Applications, 2021, 41(6): 1673-1678.
Authors:YANG Xian  ZHAO Jisheng  QIANG Baohua  MI Luzhong  PENG Bo  TANG Chenghua  LI Baolian
Affiliation:1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing (Guilin University of Electronic Technology), Guilin Guangxi 541004, China; 2. Beijing Huadian Tianren Electric Power Control Technology Company Limited, Beijing 100039, China; 3. The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang Hebei 050081, China
Funding:National Natural Science Foundation of China (61762025, 62062028); Guangxi Key Research and Development Program (AB18126053, AB18126063, AD18281002); project funded by National Energy Technology and Environmental Protection Group Co., Ltd. (IKY.2019.0002); Guangxi Natural Science Foundation (2017GXNSFAA198226, 2019GXNSFDA185007, 2019GXNSFDA185006, 2018GXNSFAA294058); Guangxi Science and Technology Major Project (AA18118031, AA18242028); Development Fund of the 54th Research Institute of China Electronics Technology Group Corporation (SXX18138X017); Innovation Project of Guilin University of Electronic Technology Graduate Education (2019YCXS051, 2020YCXS052); Fund of Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP201603, GIIP1806).
Received:2020-09-07
Revised:2020-12-16

Abstract:To address class imbalance in wind turbine datasets, a Borderline Synthetic Minority Oversampling Technique-Sequence (BSMOTE-Sequence) sampling algorithm was proposed. When synthesizing new samples, the algorithm considers both spatial and temporal characteristics and then cleans the new samples, which effectively reduces the generation of noise points. Firstly, the minority class samples were divided into safe samples, boundary samples and noise samples according to the class proportions among the nearest neighbors of each minority class sample. Secondly, for each boundary sample, the set of minority class samples closest to it in both spatial distance and time span was selected, new samples were synthesized by linear interpolation, and noise samples and between-class overlapping samples were filtered out. Finally, Support Vector Machine (SVM), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network were used as fault detection models for the wind turbine gearbox, with F1-Score, Area Under Curve (AUC) and G-mean as performance evaluation indices, and the proposed algorithm was compared with several commonly used sampling algorithms on real wind turbine datasets. Experimental results show that, compared with the existing algorithms, the samples generated by BSMOTE-Sequence yield better classification performance, with an average increase of 3% in the F1-Score, AUC and G-mean of the detection models. The proposed algorithm is therefore well suited to wind turbine fault detection, where the data are imbalanced and follow temporal patterns.
Keywords:wind turbine fault detection  imbalanced data  sequential characteristic  sampling algorithm  overlapping sample between classes  
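
The first two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the thresholds or how spatial distance and time span are combined, so the sketch assumes the usual Borderline-SMOTE convention (a minority sample is noise when all k nearest neighbors belong to the majority class, boundary when at least half do, safe otherwise) and a hypothetical additive score, distance plus `time_weight` times the timestamp gap. The names `classify_minority`, `synthesize`, `n_partners` and `time_weight` are illustrative, and the cleaning of between-class overlapping samples is omitted because the abstract does not specify it.

```python
import numpy as np

def classify_minority(X, y, k=3, minority=1):
    """Tag each minority sample as 'safe', 'boundary' or 'noise' from the
    number of majority-class points among its k nearest neighbours."""
    tags = {}
    for i in np.flatnonzero(y == minority):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                      # never count the sample itself
        neighbours = np.argsort(dist)[:k]
        m = int(np.sum(y[neighbours] != minority))
        if m == k:
            tags[int(i)] = "noise"            # surrounded by the majority class
        elif 2 * m >= k:
            tags[int(i)] = "boundary"         # at least half the neighbours are majority
        else:
            tags[int(i)] = "safe"
    return tags

def synthesize(X, t, tags, n_partners=2, time_weight=1.0, seed=0):
    """For every boundary sample, pick the non-noise minority samples that are
    closest in feature space and in time (assumed additive score), then create
    one new sample per partner by linear interpolation."""
    rng = np.random.default_rng(seed)
    candidates = np.array([i for i, tag in tags.items() if tag != "noise"], dtype=int)
    new_samples = []
    for i, tag in tags.items():
        if tag != "boundary":
            continue
        others = candidates[candidates != i]
        if others.size == 0:
            continue
        # Assumption: spatial distance and time span are simply added together.
        score = (np.linalg.norm(X[others] - X[i], axis=1)
                 + time_weight * np.abs(t[others] - t[i]))
        for j in others[np.argsort(score)[:n_partners]]:
            gap = rng.random()                # interpolation factor in [0, 1)
            new_samples.append(X[i] + gap * (X[j] - X[i]))
    return np.array(new_samples)

# Toy data: six majority points (label 0) followed by six minority points (label 1).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 3], [3, 2],
              [0.5, 0.5], [3, 3], [4, 4], [5, 5], [5, 4], [4, 5]], dtype=float)
y = np.array([0] * 6 + [1] * 6)
t = np.arange(len(X), dtype=float)            # hypothetical timestamps

tags = classify_minority(X, y, k=3)
new = synthesize(X, t, tags)
print(tags)
print(new)
```

On this toy data, sample 6 sits inside the majority cluster and is tagged as noise, sample 7 is the only boundary sample, and the two synthetic points lie on the segments joining it to its two nearest partners in space and time. A weighted sum is only one way to combine the two criteria; intersecting the spatially nearest and temporally nearest sets would be an equally plausible reading of the abstract.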
This article is indexed in Wanfang Data and other databases.