Data recovery algorithm in chemical process based on locally weighted reconstruction
Citation: GUO Jinyu, YUAN Tangming, LI Yuan. Data recovery algorithm in chemical process based on locally weighted reconstruction[J]. Journal of Computer Applications, 2016, 36(1): 282-286.
Authors: GUO Jinyu  YUAN Tangming  LI Yuan
Affiliation: College of Information Engineering, Shenyang University of Chemical Technology, Shenyang, Liaoning 110142, China
Funding: Major Program of the National Natural Science Foundation of China (61490701); National Natural Science Foundation of China (61174119); Liaoning Provincial Department of Education Project (L2013155); Key Laboratory Project of the Liaoning Provincial Department of Education (LZ2015059).
Abstract: To address the problem of missing data in chemical process data, a Locally Weighted Recovery Algorithm (LWRA) that preserves the local structure of the data is proposed. The missing data points are located and marked with the symbol NaN (Not a Number), and the data set is divided into a complete data set and an incomplete data set. The incomplete samples are then handled in order of their completeness: for each one, its k nearest neighbors are found in the complete data set, the weights of those neighbors are computed by minimizing the sum of squared errors, and the missing data points are reconstructed from the k nearest neighbors and their weights. The algorithm was applied to two kinds of chemical process data at different missing rates and compared with two traditional data recovery algorithms, Expectation Maximization Principal Component Analysis (EM-PCA) and the Mean Algorithm (MA). The proposed algorithm gave the smallest recovery error, and its computation speed increased by 2 times on average compared with EM-PCA. The experimental results show that the locally weighted reconstruction algorithm recovers data effectively, improves the utilization rate of the data, and is suitable for recovering missing data in nonlinear chemical processes.
Keywords: data mining  missing data  data recovery  k Nearest Neighbor (kNN) rule  locally weighted reconstruction  chemical process
Received: 2015-07-01
Revised: 2015-09-09
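
The recovery steps summarized in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `lw_recover`, the parameters `k` and `reg`, the sum-to-one constraint on the weights (borrowed from locally linear embedding), the regularization term, the choice to process the most complete samples first, and the choice to add each recovered sample to the neighbor pool are all assumptions that the abstract does not confirm.

```python
import numpy as np


def lw_recover(X, k=5, reg=1e-3):
    """Fill NaN entries of X by locally weighted reconstruction (sketch).

    Assumes at least one row of X is complete. All names and defaults
    here are hypothetical, chosen for illustration only.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    missing = np.isnan(X)                 # NaN marks a missing data point
    incomplete = missing.any(axis=1)
    pool = X[~incomplete]                 # complete data set

    # Assumption: handle the most complete samples first.
    order = np.argsort(missing.sum(axis=1), kind="stable")
    order = order[incomplete[order]]

    for i in order:
        obs = ~missing[i]
        # k nearest neighbors, measured on the observed variables only
        dist = np.linalg.norm(pool[:, obs] - X[i, obs], axis=1)
        nn = pool[np.argsort(dist)[:k]]

        # Weights minimizing the squared reconstruction error of the
        # observed part, subject to sum(w) == 1 (assumed constraint).
        Z = nn[:, obs] - X[i, obs]
        G = Z @ Z.T
        G += reg * (np.trace(G) + 1e-12) * np.eye(len(nn))  # keep G invertible
        w = np.linalg.solve(G, np.ones(len(nn)))
        w /= w.sum()

        # Reconstruct the missing entries from the weighted neighbors.
        out[i, ~obs] = w @ nn[:, ~obs]
        # Assumption: a recovered sample may serve as a neighbor later on.
        pool = np.vstack([pool, out[i]])
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.uniform(size=(300, 2))
    X_true = np.column_stack([A, A[:, 0] + 2 * A[:, 1]])
    X_miss = X_true.copy()
    X_miss[:10, 2] = np.nan               # hide ten values
    X_rec = lw_recover(X_miss, k=5)
    print(np.abs(X_rec[:10, 2] - X_true[:10, 2]).mean())
```

Because the weights are fitted on the observed variables only and then reused on the missing ones, the sketch relies on nearby samples sharing the same local relationship between variables, which corresponds to the local-structure idea the abstract describes.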
This article is indexed in Wanfang Data and other databases.