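Mining Spatial High Utility Core Patterns under the k-Nearest Neighbor Relationship
Cite this article: 罗金 (LUO Jin), 王丽珍 (WANG Li-Zhen), 王晓璇 (WANG Xiao-Xuan), 肖清 (XIAO Qing). Mining Spatial High Utility Core Patterns under the k-Nearest Neighbor Relationship[J]. 计算机学报 (Chinese Journal of Computers), 2022, 45(2): 354-368.
Authors: 罗金  王丽珍  王晓璇  肖清
Affiliation: School of Information Science and Engineering, Yunnan University, Kunming 650504
Funding: National Natural Science Foundation of China (61966036, 61662086)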
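Abstract: Spatial data mining aims to discover and extract valuable latent knowledge from spatial databases. Spatial co-location pattern mining has long been one of the important research directions in spatial data mining; its goal is to find subsets of spatial features whose instances frequently appear near one another, while spatial high utility co-location pattern mining additionally considers the utility attributes of the features. When measuring the neighbor relationship between spatial instances, both generally require a distance threshold to be given in advance...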

Keywords: spatial data mining  spatial co-location pattern  spatial high utility core pattern  k-nearest neighbor

Mining Spatial High Utility Core Patterns under k-Nearest Neighbors
LUO Jin, WANG Li-Zhen, WANG Xiao-Xuan, XIAO Qing. Mining Spatial High Utility Core Patterns under k-Nearest Neighbors[J]. Chinese Journal of Computers, 2022, 45(2): 354-368.
Authors: LUO Jin  WANG Li-Zhen  WANG Xiao-Xuan  XIAO Qing
Affiliation: School of Information Science and Engineering, Yunnan University, Kunming 650504
Abstract: Spatial data mining aims to discover and extract valuable patterns and knowledge from spatial data sets. Spatial co-location pattern mining has long been an important research direction in this field; it seeks subsets of spatial features whose instances frequently appear close together, while spatial high utility co-location pattern mining also takes the utility attributes of the features into account. When measuring the neighbor relationship between spatial instances, both methods usually require the user to set a distance threshold d, and the efficiency and effectiveness of the mining algorithms depend heavily on this threshold. In particular, such algorithms do not work well on unevenly distributed data sets. In addition, when analyzing pattern utility in traditional spatial high utility co-location pattern mining, users may not be interested in the utility values of some features in a pattern, and those values should not be counted toward the pattern utility. For example, when planning commercial projects around 5A-level scenic spots in China to obtain high returns, the expected income of a project should not include the income of the scenic spots themselves. Consequently, the spatial high utility patterns obtained by the traditional mining method are not necessarily reliable. To address these problems, this paper introduces the k-nearest neighbor relationship into spatial high utility co-location pattern mining, which removes the need to set a distance threshold and makes the neighbor relationship between spatial instances more objective and reasonable. The paper then defines the concepts of core elements and core patterns. To measure the utility of a core pattern, it formally defines the core participation instance set of a feature in a core pattern, the core utility participation ratio of a feature in a core pattern, and the core utility index of a core pattern. This efficiently solves the problem that some feature utilities should not be included in spatial high utility co-location pattern mining. A general framework for mining spatial high utility core patterns under the k-nearest neighbor relationship is then proposed, and a basic mining algorithm is designed. In the basic algorithm, a grid-based method replaces the traditional brute-force computation of distances between spatial instances when obtaining the k-nearest neighbor instance set of each instance, and a method called the sequence tree replaces the full-join method of traditional spatial co-location pattern mining to collect candidate patterns quickly. Because the utility of a core pattern does not satisfy the downward closure property, the paper presents four pruning strategies that greatly improve the mining efficiency of the basic algorithm. The time complexity, space complexity, completeness, and correctness of the algorithm are analyzed. Finally, extensive experiments on real and synthetic data sets show that the spatial high utility core patterns mined by the proposed method are useful, and that the algorithm with pruning optimization outperforms the basic algorithm by at least 50% under the same parameter settings.
Keywords:spatial data mining  spatial co-location pattern  spatial high utility core pattern  k-nearest neighbor
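
The abstract states that the basic algorithm uses a grid-based method instead of brute-force distance computation to obtain each instance's k-nearest neighbor set, but it gives no details. The Python sketch below shows one common way such a search can work, assuming two-dimensional point instances and a uniform square grid. The names `build_grid` and `knn_grid` and the `cell` parameter are illustrative assumptions and are not taken from the paper.

```python
import heapq
import math
from collections import defaultdict


def build_grid(points, cell):
    """Hash each point index into the square grid cell that contains it."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(idx)
    return grid


def knn_grid(points, grid, cell, i, k):
    """Return the indices of the k nearest neighbors of points[i], excluding i.

    Hypothetical helper: cells are searched in square rings of growing radius
    around the query cell until no unvisited cell can hold a closer point.
    """
    qx, qy = points[i]
    cx, cy = math.floor(qx / cell), math.floor(qy / cell)
    # Farthest ring that still contains an occupied cell.
    max_ring = max(max(abs(gx - cx), abs(gy - cy)) for gx, gy in grid)
    found = []  # (distance, index) pairs collected so far
    for ring in range(max_ring + 1):
        for gx in range(cx - ring, cx + ring + 1):
            for gy in range(cy - ring, cy + ring + 1):
                # Only the border of the square belongs to the current ring.
                if max(abs(gx - cx), abs(gy - cy)) != ring:
                    continue
                for j in grid.get((gx, gy), ()):
                    if j != i:
                        found.append((math.dist(points[i], points[j]), j))
        if len(found) >= k:
            kth = heapq.nsmallest(k, found)[-1][0]
            # Points in unvisited cells lie more than ring * cell away.
            if kth <= ring * cell:
                break
    return [j for _, j in heapq.nsmallest(k, found)]


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0), (10.0, 10.0), (9.0, 10.0)]
grid = build_grid(points, 2.0)
print(knn_grid(points, grid, 2.0, 0, 2))  # [1, 3]
```

Because the neighbor set is defined by rank rather than by a fixed radius, an isolated instance such as `(10.0, 10.0)` still receives k neighbors, which is the property the abstract relies on for unevenly distributed data. The cell size only affects how many rings are scanned, not the result.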
This article is indexed in databases including 维普 (VIP) and 万方数据 (Wanfang Data).