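Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters
Cite this article: REN Cairong, XIE Gang. Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters[J]. Computer Engineering and Applications, 2019, 55(2): 213-220.
Authors: REN Cairong, XIE Gang
Affiliations: College of Information Engineering, Taiyuan University of Technology, Taiyuan 030024; College of Information Engineering, Taiyuan University of Technology, Taiyuan 030024; School of Electronic Information Engineering, Taiyuan University of Science and Technology, Taiyuan 030024
Funding: Shanxi Province Research Project for Returned Overseas Scholars; National Natural Science Foundation of China; National Natural Science Foundation of China
Abstract: Air pollution not only harms people's physical and mental health but also constrains urban economic development, and the impact of PM2.5 is especially pronounced. To predict the PM2.5 concentration level in the air conveniently and accurately, a random-forest-based method for predicting the PM2.5 concentration level is proposed. Its feature factors are the meteorological data of Taiyuan from 2013 to 2017, the temporal pattern of PM2.5 concentration changes at the prediction station, and the spatio-temporal correlation with the surrounding stations. The method first clusters the raw meteorological data with the K-Means algorithm to reduce the correlation between different classifiers, then applies undersampling to balance the data and reduce the effect of class imbalance on classifier performance, and finally builds the prediction model with a random forest, which generalizes well. Validation on real data shows that the method achieves good precision, recall, and F1 score in predicting PM2.5 concentration levels.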

Keywords: PM2.5; random forest; meteorological factors; undersampling; prediction
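
The abstract names undersampling as the step that balances the classes but does not say which variant is used. As a minimal sketch, assuming plain random undersampling down to the size of the rarest class, the idea can be illustrated as follows. The function name `random_undersample`, the level labels, and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples so every class is as small as the rarest one.

    Assumption: the paper only says "undersampling"; this random variant
    is one common choice, not necessarily the authors' method.
    """
    rng = random.Random(seed)

    # Group the feature vectors by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for y, items in by_class.items():
        for x in rng.sample(items, target):
            balanced.append((x, y))

    # Shuffle so the classes are not stored in contiguous runs.
    rng.shuffle(balanced)
    xs = [x for x, _ in balanced]
    ys = [y for _, y in balanced]
    return xs, ys


# Hypothetical toy data: 6 "good", 3 "moderate", and 2 "heavy" samples.
labels = ["good"] * 6 + ["moderate"] * 3 + ["heavy"] * 2
samples = [[float(i), float(i % 3)] for i in range(len(labels))]

balanced_x, balanced_y = random_undersample(samples, labels)
class_counts = Counter(balanced_y)
```

After balancing, each of the three toy classes keeps two samples, so a classifier trained on `balanced_x` and `balanced_y` no longer sees the majority class more often than the others.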

Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters
REN Cairong, XIE Gang. Prediction of PM2.5 Concentration Level Based on Random Forest and Meteorological Parameters[J]. Computer Engineering and Applications, 2019, 55(2): 213-220.
Authors:REN Cairong  XIE Gang
Affiliation: 1. College of Information Engineering, Taiyuan University of Technology, Taiyuan 030024, China; 2. School of Electronic Information Engineering, Taiyuan University of Science and Technology, Taiyuan 030024, China
Abstract: Air pollution, and PM2.5 in particular, harms people's physical and mental health and restricts the economic development of cities. To forecast the PM2.5 concentration level conveniently and accurately, a prediction model based on random forest is proposed. The feature factors are the meteorological data of Taiyuan city from 2013 to 2016, the temporal pattern of PM2.5 concentration change at the prediction site, and its temporal and spatial correlation with the surrounding sites. First, the K-Means algorithm is applied to cluster the raw meteorological data in order to reduce the correlation between different classifiers. Second, undersampling is used to balance the dataset so as to reduce the impact of class imbalance on classifier performance. Finally, the predictive model is built with a random forest, which has good generalization ability. Verification on real data shows that the method achieves good recall, precision, and F-score in predicting the PM2.5 concentration level.
Keywords: PM2.5; random forest; meteorological factors; undersampling; prediction
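
The abstract reports recall, precision, and F-score for the predicted concentration levels. A minimal sketch of how these per-class metrics are usually computed is given below, assuming the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean). The function name `per_class_scores` and the toy labels are hypothetical, and the numbers are not results from the paper.

```python
def per_class_scores(y_true, y_pred):
    """Return {class: (precision, recall, f1)} for a multi-class prediction."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = {}
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # correctly predicted c
        fp = sum(1 for t, p in pairs if t != c and p == c)  # wrongly predicted c
        fn = sum(1 for t, p in pairs if t == c and p != c)  # missed c

        # Guard against empty denominators for classes never predicted or seen.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        scores[c] = (precision, recall, f1)
    return scores


# Hypothetical toy example with three concentration levels (1, 2, 3).
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
scores = per_class_scores(y_true, y_pred)
```

Reporting the three values per level, rather than a single overall accuracy, shows whether the rarer heavy-pollution levels are actually being detected.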
This article is indexed in Wanfang Data and other databases.