GAC-GEO: a generic agglomerative clustering framework for geo-referenced datasets
Authors: Rachsuda Jiamthapthaksin, Christoph F Eick, Seungchan Lee
Affiliation: (1) Department of Computer Science, University of Massachusetts-Boston, Boston, MA 02125-3393, USA; (2) Department of Computer Science, University of Houston, Houston, TX 77004, USA; (3) Engineering Technology Department, University of Houston, Houston, TX 77004, USA; (4) Bureau of Economic Geology, John A. & Katherine G. Jackson School of Geosciences, The University of Texas at Austin, Austin, TX, USA
Abstract: Major challenges of clustering geo-referenced data include identifying arbitrarily shaped clusters, properly utilizing spatial information, coping with diverse extrinsic characteristics of clusters, and supporting region discovery tasks. The goal of region discovery is to identify interesting regions in geo-referenced datasets based on a domain expert’s notion of interestingness. Almost all agglomerative clustering algorithms focus only on the first challenge. The goal of the proposed work is to develop agglomerative clustering frameworks that deal with all four challenges. In particular, we propose a generic agglomerative clustering framework for geo-referenced datasets (GAC-GEO) that generalizes agglomerative clustering by allowing for three plug-in components. GAC-GEO agglomerates neighboring clusters so as to maximize a plug-in fitness function that captures the notion of interestingness of clusters. It enhances typical agglomerative clustering algorithms in two ways: fitness functions support task-specific clustering, whereas generic neighboring relationships increase the number of merging candidates. We also demonstrate that existing agglomerative clustering algorithms can be considered as specific cases of GAC-GEO. We evaluate the proposed framework on an artificial dataset and two real-world applications involving region discovery. The experimental results show that GAC-GEO is capable of identifying arbitrarily shaped hotspots for different data mining tasks.
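
The merging loop the abstract describes (agglomerate neighboring clusters, guided by a plug-in fitness function) could be sketched in Python roughly as follows. This is only an illustration, not the authors' implementation: the names `greedy_agglomerate`, `near`, and `hotspot_fitness`, the toy data in `POINTS`, the particular fitness formula, and the stopping rule (stop when no neighboring merge increases total fitness) are all assumptions made for the example.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List, Sequence, Tuple

Cluster = FrozenSet[int]  # a cluster is a set of indices into the dataset


def greedy_agglomerate(
    n_objects: int,
    neighbors: Callable[[Cluster, Cluster], bool],
    fitness: Callable[[Cluster], float],
) -> List[Cluster]:
    """Repeatedly merge the neighboring pair with the largest fitness gain.

    Assumed stopping rule: stop when no neighboring merge has a positive gain.
    """
    clusters: List[Cluster] = [frozenset([i]) for i in range(n_objects)]
    while True:
        best_gain, best_pair = 0.0, None
        for a, b in combinations(clusters, 2):
            if not neighbors(a, b):  # plug-in neighboring relationship
                continue
            # Change in summed fitness if a and b were merged.
            gain = fitness(a | b) - fitness(a) - fitness(b)
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            return clusters
        a, b = best_pair
        clusters = [c for c in clusters if c != a and c != b] + [a | b]


# Toy data (assumed): (position on a line, measured value).
POINTS: Sequence[Tuple[float, float]] = [
    (0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 0.0), (4.0, 1.0), (5.0, 1.0),
]


def near(a: Cluster, b: Cluster, max_gap: float = 1.0) -> bool:
    """Hypothetical neighbor relation: closest members are within max_gap."""
    return min(abs(POINTS[i][0] - POINTS[j][0]) for i in a for j in b) <= max_gap


def hotspot_fitness(c: Cluster, threshold: float = 0.5, beta: float = 1.5) -> float:
    """Hypothetical fitness: reward large clusters whose mean value is high."""
    mean_value = sum(POINTS[i][1] for i in c) / len(c)
    return max(0.0, mean_value - threshold) * len(c) ** beta


regions = greedy_agglomerate(len(POINTS), near, hotspot_fitness)
print(sorted(sorted(c) for c in regions))
```

Swapping in a different `fitness` callable changes what counts as an interesting region without touching the merge loop, and a looser `neighbors` relation enlarges the set of merge candidates, which is the separation of concerns the abstract attributes to the plug-in design.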
Keywords:
This article is indexed in SpringerLink and other databases.