Sparse episode identification in environmental datasets: The case of air quality assessment
Authors:Fani A Tzima  Pericles A Mitkas  Dimitris Voukantsis  Kostas Karatzas
Affiliation:1. School of Information Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi, Ishikawa 923-1292, Japan;2. University of Lyon (ERIC, Lyon 2), 5 Avenue Pierre Mendes-France, 69676 Bron Cedex, France;1. School of Statistics, Dongbei University of Finance and Economics, Dalian 116025, China;2. State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China;3. School of Software, Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia
Abstract:Sparse episode identification in environmental datasets is not only a multi-faceted and computationally challenging problem for machine learning algorithms, but also a difficult task for human decision-makers: the strict regulatory framework, in combination with the public demand for better information services, creates the need for robust, efficient and, more importantly, understandable forecasting models. Additionally, these models need to provide decision-makers with “summarized” and valuable knowledge that has to be subjected to a thorough evaluation procedure, easily translated to services and/or actions in actual decision-making situations, and integrable with existing Environmental Management Systems (EMSs). On this basis, our current study investigates the potential of various machine learning algorithms as tools for air quality (AQ) episode forecasting and assesses them – given the corresponding domain-specific requirements – using an evaluation procedure tailored to the task at hand. Among the algorithms employed in the experimental phase, our main focus is on ZCS-DM, an evolutionary rule-induction algorithm specifically designed to tackle this class of problems – that is, classification problems with skewed class distributions, where cost-sensitive model building is required. Overall, we consider this investigation successful in terms of its aforementioned goals and constraints: the experimental results reveal the potential of rule-based algorithms for urban AQ forecasting, and point towards ZCS-DM as the most suitable algorithm for the target domain, providing the best trade-off between model performance and understandability.
This article is indexed in ScienceDirect and other databases.