GAC-GEO: a generic agglomerative clustering framework for geo-referenced datasets |
| |
Authors: | Rachsuda Jiamthapthaksin Christoph F Eick Seungchan Lee |
| |
Affiliation: | (1) Department of Computer Science, University of Massachusetts-Boston, Boston, MA 02125-3393, USA; (2) Department of Computer Science, University of Houston, Houston, TX 77004, USA; (3) Engineering Technology Department, University of Houston, Houston, TX 77004, USA; (4) Bureau of Economic Geology, John A. & Katherine G. Jackson School of Geosciences, The University of Texas at Austin, Austin, TX, USA |
| |
Abstract: | Major challenges of clustering geo-referenced data include identifying arbitrarily shaped clusters, properly utilizing spatial
information, coping with diverse extrinsic characteristics of clusters and supporting region discovery tasks. The goal of region discovery is to identify interesting regions in geo-referenced datasets based on a domain expert’s
notion of interestingness. Almost all agglomerative clustering algorithms focus only on the first challenge. The goal of the
proposed work is to develop agglomerative clustering frameworks that deal with all four challenges. In particular, we propose
a generic agglomerative clustering framework for geo-referenced datasets (GAC-GEO) generalizing agglomerative clustering by
allowing for three plug-in components. GAC-GEO agglomerates neighboring clusters maximizing a plug-in fitness function that captures the notion of interestingness of clusters. It enhances typical agglomerative clustering algorithms in two ways:
fitness functions support task-specific clustering, whereas generic neighboring relationships increase the number of merging candidates. We also demonstrate that existing agglomerative clustering algorithms can be considered
as specific cases of GAC-GEO. We evaluate the proposed framework on an artificial dataset and two real-world applications
involving region discovery. The experimental results show that GAC-GEO is capable of identifying arbitrarily shaped hotspots
for different data mining tasks. |
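
The merging scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `gac_geo_sketch`, the greedy stopping rule (stop when no merge of neighboring clusters improves total fitness), and the two example plug-ins `purity_fitness` and `within_radius` are assumptions made for illustration only. The abstract mentions three plug-in components; only a fitness function and a neighboring relationship are modeled here.

```python
from itertools import combinations
from math import dist


def gac_geo_sketch(points, fitness, are_neighbors):
    """Greedy agglomeration with plug-in fitness and neighbor relation.

    Hypothetical sketch: start from singleton clusters and repeatedly merge
    the neighboring pair whose merge yields the largest fitness gain; stop
    when no merge of neighboring clusters has a positive gain.
    """
    clusters = [frozenset([i]) for i in range(len(points))]

    def members(cluster):
        return [points[i] for i in sorted(cluster)]

    while True:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(clusters, 2):
            # Only neighboring clusters are merge candidates.
            if not are_neighbors(members(a), members(b)):
                continue
            gain = (fitness(members(a | b))
                    - fitness(members(a)) - fitness(members(b)))
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return clusters
        a, b = best_pair
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]


def purity_fitness(cluster):
    """Assumed example fitness: reward large clusters with a single label."""
    labels = {label for _, _, label in cluster}
    return len(cluster) ** 1.5 if len(labels) == 1 else 0.0


def within_radius(c1, c2, radius=1.5):
    """Assumed example neighbor relation: some pair of points is close."""
    return any(dist(p[:2], q[:2]) <= radius for p in c1 for q in c2)


# Toy geo-referenced data: (x, y, label).
points = [(0, 0, "hot"), (1, 0, "hot"), (2, 0, "hot"),
          (3, 0, "cold"), (4, 0, "cold"), (10, 0, "hot")]
clusters = gac_geo_sketch(points, purity_fitness, within_radius)
```

Swapping in a different fitness function changes which regions count as interesting without touching the merging loop, which is the kind of task-specific behavior the abstract attributes to plug-in fitness functions.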
| |
|