Dynamic data cleaning method of abnormal and missing data in a distribution network based on machine learning
Cite this article:MEI Yujie,LI Yong,ZHOU Wangfeng,GUO Yixiu,DENG Wei,QIAO Xuebo.Dynamic data cleaning method of abnormal and missing data in a distribution network based on machine learning[J].Power System Protection and Control,2023,51(7):158-169.
Authors:MEI Yujie  LI Yong  ZHOU Wangfeng  GUO Yixiu  DENG Wei  QIAO Xuebo
Affiliation:1. School of Electrical Engineering and Information, Hunan University, Changsha 410082, China; 2. State Grid Wenzhou Power Supply Co., Ltd., Wenzhou 325000, China; 3. State Grid Hunan Electric Power Co., Ltd. Research Institute, Changsha 410007, China
Funding:Key Program of the Joint Funds of the National Natural Science Foundation of China (U22B200134); Key Project of Intergovernmental International Science and Technology Innovation Cooperation, National Key Research and Development Program of China (2022YFE0129300); Science and Technology Project of State Grid Hunan Electric Power Co., Ltd. (5216A521001F)
Received:2022/6/29
Revised:2022/9/14

Abstract:Traditional data cleaning for distribution networks requires the threshold for judging abnormal data to be set manually, and it fills missing data inefficiently. To address these limitations, this paper proposes an integrated dynamic cleaning method for abnormal and missing distribution network data based on machine learning. First, an improved dynamic detection algorithm for abnormal data is proposed, based on the local outlier factor (LOF) detection algorithm and a Gaussian mixture model, so that the abnormal-data threshold is selected accurately and automatically. Second, a dynamic filling algorithm for missing data is proposed, based on the random forest algorithm and least squares regression. It adapts the filling algorithm to the duration of the missing data, preserving filling accuracy while reducing computation time. Abnormal data detection and missing data filling are then combined into an integrated dynamic cleaning architecture. The method is validated on distribution network data from a region of Hunan. The results show that it selects the abnormal-data identification threshold accurately and automatically, detects abnormal data in the distribution network effectively, and balances the accuracy and speed of missing data filling, which gives it good engineering application value.
Keywords:distribution network  data cleaning  abnormal data identification  missing data interpolation  Gaussian mixture model  random forest
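
The abstract names the building blocks of the detection step (local outlier factor plus a Gaussian mixture model) but not how they are coupled. The sketch below shows one plausible reading, offered only as an illustration and not as the authors' algorithm: LOF scores are computed for each sample, a two-component one-dimensional Gaussian mixture is fitted to those scores, and a sample is flagged when the component with the higher mean score is the more likely one, so no threshold is set by hand. The function names, the choice of two components, the neighbour count `k=3`, and the toy data are all assumptions made for this example.

```python
import math


def lof_scores(points, k=3):
    """Local outlier factor of 1-D points (brute force, exactly k neighbours)."""
    n = len(points)
    # k nearest (distance, index) pairs of each point, excluding itself.
    neigh = []
    for i in range(n):
        d = sorted((abs(points[i] - points[j]), j) for j in range(n) if j != i)
        neigh.append(d[:k])
    k_dist = [nb[-1][0] for nb in neigh]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / (sum(reach) + 1e-12))
    # LOF: mean neighbour density divided by the point's own density.
    return [sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
            for i in range(n)]


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm2(xs, iters=200):
    """EM for a two-component 1-D Gaussian mixture: (weights, means, variances)."""
    s = sorted(xs)
    mu = [s[len(s) // 4], s[-1]]            # init: lower quartile and maximum
    spread = (s[-1] - s[0]) ** 2 / 4 + 1e-6
    var = [spread, spread]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[c] * normal_pdf(x, mu[c], var[c]) for c in range(2)]
            total = sum(p) or 1e-300
            resp.append([pc / total for pc in p])
        # M-step: re-estimate weights, means and variances (variance floored).
        for c in range(2):
            nc = sum(r[c] for r in resp) + 1e-12
            w[c] = nc / len(xs)
            mu[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var[c] = sum(r[c] * (x - mu[c]) ** 2
                         for r, x in zip(resp, xs)) / nc + 1e-6
    return w, mu, var


def detect_anomalies(points, k=3):
    """Flag points whose LOF score belongs to the high-mean mixture component."""
    scores = lof_scores(points, k)
    w, mu, var = fit_gmm2(scores)
    hi = 0 if mu[0] > mu[1] else 1          # component with the larger mean LOF
    lo = 1 - hi
    flags = []
    for s in scores:
        p = [w[c] * normal_pdf(s, mu[c], var[c]) for c in range(2)]
        flags.append(p[hi] > p[lo] and s > mu[lo])
    return flags, scores


if __name__ == "__main__":
    # Toy data (invented): 20 regular readings plus two gross outliers.
    readings = [10.0 + 0.1 * i for i in range(20)] + [25.0, 40.0]
    flags, _ = detect_anomalies(readings)
    print([x for x, f in zip(readings, flags) if f])
```

The brute-force LOF here is quadratic in the number of samples and works on scalar readings only; a practical implementation would more likely rely on an established library and multivariate features.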