On discovering co-location patterns in datasets: a case study of pollutants and child cancers |
| |
Authors: | Jundong Li, Aibek Adilmagambetov, Mohomed Shazan Mohomed Jabbar, Osmar R Zaïane, Alvaro Osornio-Vargas, Osnat Wine |
| |
Affiliation: | 1. Computer Science and Engineering, Arizona State University, Tempe, USA; 2. Department of Computing Science, University of Alberta, Edmonton, Canada; 3. Department of Pediatrics, University of Alberta, Edmonton, Canada |
| |
Abstract: | We aim to identify relationships between cancer cases and pollutant emissions by proposing a novel co-location mining algorithm. Specifically, we investigate whether the location of a child diagnosed with cancer is related to any combination of chemicals emitted by facilities in that location. Co-location pattern mining detects sets of spatial features that are frequently located in close proximity to each other. Most previous work in this domain relies on transaction-free Apriori-like algorithms that depend on user-defined thresholds and are designed for boolean data points. Because spatial data has no clear notion of transactions, it is nontrivial to apply association rule mining techniques to the co-location mining problem. Our approach is based on a grid-based transactionization of the geographic space and is designed to mine datasets with extended spatial objects. It can also incorporate uncertainty about the existence of features, to model real-world scenarios more accurately. We eliminate the need for a global threshold by introducing a statistical test that validates the significance of candidate co-location patterns and rules. Experiments on both synthetic and real datasets show that our algorithm can detect a considerable number of statistically significant co-location patterns. In addition, we explain the data modelling framework used on real datasets of pollutants (PRTR/NPRI) and childhood cancer cases. |
| |
Keywords: | |
This article has been indexed by SpringerLink and other databases. |
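
The grid-based transactionization mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: extended spatial objects are approximated by circular buffers, every grid point is treated as one transaction, and the names `Instance`, `grid_transactions`, and `support` are hypothetical. The sketch covers only the transactionization step and a plain support count. It does not reproduce the authors' statistical significance test or their handling of uncertain features.

```python
import math
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical input record: (feature_name, x, y, buffer_radius).
# A circular buffer stands in for an "extended spatial object" (assumption).
Instance = Tuple[str, float, float, float]
GridPoint = Tuple[int, int]


def grid_transactions(instances: Iterable[Instance],
                      cell_size: float) -> Dict[GridPoint, Set[str]]:
    """Map each grid point to the set of features whose buffer covers it.

    Grid points lie at (i * cell_size, j * cell_size) for integers i, j.
    Each covered grid point plays the role of one transaction.
    """
    transactions: Dict[GridPoint, Set[str]] = {}
    for feature, x, y, radius in instances:
        # Restrict the scan to grid indices inside the buffer's bounding box.
        i_min = math.ceil((x - radius) / cell_size)
        i_max = math.floor((x + radius) / cell_size)
        j_min = math.ceil((y - radius) / cell_size)
        j_max = math.floor((y + radius) / cell_size)
        for i in range(i_min, i_max + 1):
            for j in range(j_min, j_max + 1):
                gx, gy = i * cell_size, j * cell_size
                # Keep the grid point only if it lies inside the circle.
                if (gx - x) ** 2 + (gy - y) ** 2 <= radius ** 2:
                    transactions.setdefault((i, j), set()).add(feature)
    return transactions


def support(transactions: Dict[GridPoint, Set[str]],
            pattern: Set[str]) -> int:
    """Count transactions (grid points) that contain every feature in pattern."""
    return sum(1 for items in transactions.values() if pattern <= items)


# Toy data (invented for illustration, not taken from the study).
toy_instances: List[Instance] = [
    ("cancer_case", 0.0, 0.0, 1.0),
    ("benzene", 1.0, 0.0, 1.0),
]
toy_transactions = grid_transactions(toy_instances, cell_size=1.0)
both = support(toy_transactions, {"cancer_case", "benzene"})
```

Once the space is transactionized this way, standard itemset counting applies to the grid points. A finer `cell_size` approximates the overlap area of the buffers more closely, at the cost of more transactions.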
|