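Bad Data Identification and Correction Based on Improved k-prototypes Clustering
Cite this article: Wang Xiaoci, Dong Shufeng, Liu Yuquan, Wang Li, Li Junge. Bad data identification and correction based on improved k-prototypes clustering [J]. Electrical Measurement & Instrumentation, 2022, 59(2): 9-15.
Authors: Wang Xiaoci  Dong Shufeng  Liu Yuquan  Wang Li  Li Junge
Affiliation: School of Electrical Engineering, Zhejiang University, Hangzhou 310027; Guangzhou Power Supply Bureau Co., Ltd., Guangzhou 510620
Funding: National Key Research and Development Program of China (2016YFB0901300).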
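Abstract: Many industrial technologies depend on accurate load data, yet the load measurement systems currently used in factories often produce large amounts of bad data because of communication and storage failures. This paper therefore proposes a bad data identification and correction method based on improved k-prototypes clustering. Introducing non-load features into the clustering weakens the influence of bad load data on the clustering result, which makes identification and repair more accurate. The improved k-prototypes algorithm uses random initialization and selects the best of several runs computed in parallel, overcoming the tendency of standard k-prototypes to fall into a local optimum that depends on the initial cluster centers. It also handles the number of clusters adaptively, removing the need to choose that number subjectively. Based on the clustering result, the feasible region of the load data is determined according to the normal distribution principle, bad data are identified, and the identified values are corrected by cluster-center replacement. Experiments show that the method performs better than fuzzy C-means clustering that considers only load data, with significantly higher recall in bad data identification and higher correction accuracy.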

Keywords: k-prototypes clustering  mixed dataset clustering  bad data identification  cluster-center replacement correction  industrial load preprocessing
Received: 2020/1/23
Revised: 2020/2/16
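
The restart strategy named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the usual k-prototypes dissimilarity (squared Euclidean distance on numeric features plus a weight `gamma` times the number of categorical mismatches), runs the restarts one after another rather than in parallel, and takes the number of clusters `k` as given instead of adapting it. The names `mixed_cost`, `k_prototypes_once`, `best_of_restarts`, `gamma`, and `n_restarts` are all hypothetical.

```python
import numpy as np

def mixed_cost(num, cat, c_num, c_cat, gamma):
    """Dissimilarity of every sample to every prototype (assumed form):
    squared Euclidean on numeric features + gamma * categorical mismatches."""
    d_num = ((num[:, None, :] - c_num[None, :, :]) ** 2).sum(axis=2)
    d_cat = (cat[:, None, :] != c_cat[None, :, :]).sum(axis=2)
    return d_num + gamma * d_cat

def k_prototypes_once(num, cat, k, gamma, rng, n_iter=20):
    """One run from a random initialization; returns (labels, total cost)."""
    idx = rng.choice(len(num), size=k, replace=False)
    c_num, c_cat = num[idx].copy(), cat[idx].copy()
    for _ in range(n_iter):
        labels = mixed_cost(num, cat, c_num, c_cat, gamma).argmin(axis=1)
        for j in range(k):
            members = labels == j
            if not members.any():
                continue  # keep the old prototype for an empty cluster
            c_num[j] = num[members].mean(axis=0)  # numeric part: mean
            for f in range(cat.shape[1]):         # categorical part: mode
                vals, counts = np.unique(cat[members, f], return_counts=True)
                c_cat[j, f] = vals[counts.argmax()]
    d = mixed_cost(num, cat, c_num, c_cat, gamma)
    labels = d.argmin(axis=1)
    return labels, d[np.arange(len(num)), labels].sum()

def best_of_restarts(num, cat, k, gamma=1.0, n_restarts=10, seed=0):
    """Run several random initializations and keep the lowest-cost result."""
    rng = np.random.default_rng(seed)
    runs = [k_prototypes_once(num, cat, k, gamma, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])
```

Here `num` is a float array of numeric features and `cat` an integer-coded array of categorical features, one row per sample. Because the restarts are independent, they could be dispatched to separate workers, which is presumably what the parallel computation in the abstract refers to.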

Industrial Load Data Identification and Correction Method with Improved K-prototypes Clustering Algorithm
Wang Xiaoci, Dong Shufeng, Liu Yuquan, Wang Li and Li Junge. Industrial Load Data Identification and Correction Method with Improved K-prototypes Clustering Algorithm [J]. Electrical Measurement & Instrumentation, 2022, 59(2): 9-15.
Authors: Wang Xiaoci  Dong Shufeng  Liu Yuquan  Wang Li and Li Junge
Affiliation: (School of Electrical Engineering, Zhejiang University, Hangzhou 310027, China; Guangzhou Power Supply Bureau Co., Ltd., Guangzhou 510620, China)
Abstract: Many technologies in the industrial field depend on accurate load data, but the measurement systems currently used in factories often produce a large amount of bad data because of communication and storage failures. An industrial load data identification and correction method based on an improved k-prototypes clustering algorithm is therefore proposed. By introducing non-load features into the clustering, the method reduces the impact of bad load data on the clustering results and so makes the identification and repair results more accurate. Through random initialization and selection of the best result among parallel runs, the improved k-prototypes algorithm overcomes the tendency of the standard algorithm to fall into a local optimum. The problem of subjectively choosing the number of clusters is solved by adaptive processing. Based on the clustering results, the feasible region of the load data is determined according to the normal distribution principle, and the bad data are identified. The identified bad data are then corrected by centroid vector replacement. Experiments show that the proposed method outperforms the fuzzy C-means clustering method, which considers only the load data, and that both the recall of bad data identification and the accuracy of correction are significantly improved.
Keywords:k-prototypes clustering  mixed dataset clustering  bad data identification  correction with centroid vector replacing  industrial load data preprocessing
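
The identification and correction step summarized in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the feasible region is taken to be the cluster mean plus or minus `n_sigma` standard deviations at each time point (the abstract says only that it follows the normal distribution principle), and the replacement value is the cluster center recomputed from the points inside that region. The function name `identify_and_correct` and the parameter `n_sigma` are hypothetical.

```python
import numpy as np

def identify_and_correct(load, labels, n_sigma=3.0):
    """Flag load values outside the per-cluster feasible region and replace them.

    load   : array (n_curves, n_points), one load curve per row
    labels : array (n_curves,), cluster index of each curve
    Assumption: feasible region = cluster mean +/- n_sigma * std per time point.
    """
    load = np.asarray(load, dtype=float)
    labels = np.asarray(labels)
    bad = np.zeros(load.shape, dtype=bool)
    corrected = load.copy()
    for c in np.unique(labels):
        rows = labels == c
        block = load[rows]
        center = block.mean(axis=0)
        spread = block.std(axis=0)
        outside = np.abs(block - center) > n_sigma * spread
        # Assumption: recompute the center from in-region values only, so the
        # replacement is not pulled toward the bad data it replaces.
        clean_center = np.nanmean(np.where(outside, np.nan, block), axis=0)
        bad[rows] = outside
        corrected[rows] = np.where(outside, clean_center, block)
    return bad, corrected
```

With a population standard deviation, a single point in a cluster of n curves can deviate from the mean by at most sqrt(n - 1) standard deviations, so a 3-sigma region can only flag anything in clusters of more than ten curves. Small clusters would need a narrower region or a robust spread estimate.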
This article is indexed in databases including VIP and Wanfang Data.