
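Air quality prediction method based on predictive data features
Cite this article: Minghe GAO, Ying ZHANG, Rongrong ZHANG, Zihao HUANG, Linyan HUANG, Fanyu LI, Xin ZHANG, Yanhao WANG. Air quality prediction method based on predictive data features[J]. Journal of Shandong University (Engineering Science), 2020, 50(2): 91-99.
Authors: Minghe GAO  Ying ZHANG  Rongrong ZHANG  Zihao HUANG  Linyan HUANG  Fanyu LI  Xin ZHANG  Yanhao WANG
Affiliations: 1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206; 2. School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, Jilin 130022
Funding: Fundamental Research Funds for the Central Universities (2018MS024); National Natural Science Foundation of China (61305056); Jilin Province Science and Technology Development Plan Project (20190303133SF)
Abstract: The air quality prediction problem is studied using the LightGBM prediction model, and a prediction method based on predictive features is proposed and designed that effectively predicts the PM2.5 mass concentration, the core indicator of air quality, in urban Beijing over the next 24 h. In building the prediction scheme, the characteristics of the training dataset are analyzed to guide data cleaning, and random forest is combined with linear interpolation to address extensive missing data and noise interference. Predictive data features are introduced, together with related statistical features, to improve the accuracy of the predictions. A sliding window mechanism is used to mine high-dimensional temporal features and raise the number of data features by orders of magnitude. The working performance and results of the prediction model are analyzed in detail and evaluated against baseline models. The experimental results show that the scheme combining predictive features with the LightGBM prediction model achieves higher prediction accuracy.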

Keywords: predictive data fusion  high-dimensional statistical features  air quality prediction  machine learning
Received: 2019-07-18

Air quality prediction approach based on integrating forecasting dataset
Minghe GAO, Ying ZHANG, Rongrong ZHANG, Zihao HUANG, Linyan HUANG, Fanyu LI, Xin ZHANG, Yanhao WANG. Air quality prediction approach based on integrating forecasting dataset[J]. Journal of Shandong University (Engineering Science), 2020, 50(2): 91-99.
Authors:Minghe GAO  Ying ZHANG  Rongrong ZHANG  Zihao HUANG  Linyan HUANG  Fanyu LI  Xin ZHANG  Yanhao WANG
Affiliation: 1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China; 2. School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, Jilin 130022, China
Abstract: To address the air quality prediction problem, LightGBM was employed in a predictive-feature-based prediction approach that effectively predicts the PM2.5 concentration, the key indicator of air quality, in Beijing over the upcoming 24 hours. In constructing the prediction solution, the characteristics of the training dataset were analyzed to guide data cleansing, and random forest and linear interpolation were combined to handle extensive missing data and noise interference. Predictive data features were integrated into the dataset, and corresponding statistical features were designed to improve prediction accuracy. A sliding window mechanism was used to mine high-dimensional temporal features and increase the number of data features. The performance and results of the proposed approach were analyzed in detail and compared with baseline models. The experimental results showed that the proposed LightGBM-based approach with integrated forecasting data achieved higher prediction accuracy than the other models.
Keywords:predictive data fusion  high dimensional statistical features  air quality prediction  machine learning  
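
The abstract names two preprocessing ideas without giving details: linear interpolation of missing readings and sliding-window statistical features. The following is a minimal sketch of what such steps might look like, not the authors' implementation. The function names, the window size of 3, the chosen statistics, and the sample readings are all assumptions made for illustration; the random forest imputation and the LightGBM model itself are not reproduced here.

```python
from statistics import mean, pstdev
from typing import List, Optional


def interpolate_linear(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between known neighbours.

    Leading and trailing gaps are left as None; the paper pairs interpolation
    with a random forest step that this sketch does not attempt.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for k in range(1, gap):
            filled[left + k] = series[left] + (series[right] - series[left]) * k / gap
    return filled


def sliding_window_features(series: List[float], window: int) -> List[dict]:
    """Build one feature row per time step t >= window from the previous `window` values."""
    rows = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        rows.append({
            "t": t,
            "lag_1": past[-1],      # most recent observation before t
            "mean": mean(past),     # window statistics (an assumed feature set)
            "std": pstdev(past),
            "min": min(past),
            "max": max(past),
            "target": series[t],    # value to be predicted at step t
        })
    return rows


# Hypothetical hourly PM2.5 readings with two missing values.
pm25 = [35.0, None, None, 50.0, 48.0, 52.0]
clean = interpolate_linear(pm25)
features = sliding_window_features(clean, window=3)
```

Each row produced this way could serve as one training example for a gradient-boosted model; widening the window or stacking several window sizes is one way the feature count might grow, as the abstract describes.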