共查询到20条相似文献,搜索用时 15 毫秒
1.
Muhammad Aamir Cheema Wenjie Zhang Xuemin Lin Ying Zhang 《The VLDB Journal The International Journal on Very Large Data Bases》2012,21(5):703-728
Given a set of objects and a query q, a point p is called the reverse k nearest neighbor (RkNN) of q if q is one of the k closest objects of p. In this paper, we introduce the concept of influence zone that is the area such that every point inside this area is the RkNN of q and every point outside this area is not the RkNN. The influence zone has several applications in location-based services, marketing and decision support systems. It can also be used to efficiently process RkNN queries. First, we present efficient algorithm to compute the influence zone. Then, based on the influence zone, we present efficient algorithms to process RkNN queries that significantly outperform existing best-known techniques for both the snapshot and continuous RkNN queries. We also present a detailed theoretical analysis to analyze the area of the influence zone and IO costs of our RkNN processing algorithms. Our experiments demonstrate the accuracy of our theoretical analysis. This paper is an extended version of our previous work (Cheema et?al. in Proceedings of ICDE, pp. 577–588, 2011). We make the following new contributions in this extended version: (1) we conduct a rigorous complexity analysis and show that the complexity of one of our proposed algorithms in Cheema et?al. (Proceedings of ICDE, pp. 577–588, 2011) can be reduced from O(m 2) to O( km) where m?>?k is the number of objects used to compute the influence zone, (2) we show that our techniques can be applied to dimensionality higher than two, and (3) we present efficient techniques to handle data updates. 相似文献
2.
Skyline queries are used with data extensive applications, such as mobile location-based services, to support multi-criteria
decision-making and to prune the data space by returning the most “interesting” data points. Most interesting data points
are the points, which are not dominated by any other point. Spatial network skyline query is a subset of the skyline query
problem where data points are nodes in a road network and the attributes of the data points are network distance relative
to a set of query points. Spatial network skyline query’s problem is the need to calculate the attributes with an expensive
distance calculation operation. Previous works (Deng et al. Proceedings of the 23th international conference on data engineering,
796–805, 2007), Sharifzadeh et al. Proceedings of the 32nd international conference on very large databases, 751–762, 2009)
that addressed this problem involved extensive network distance calculation between the query points and data points. A new
algorithm that requires a remarkably less number of network distance calculations is proposed in this work. Our approach uses
a progressive nearest neighbor algorithm to minimize the set of candidates then evaluates those candidates by only comparing
them to a subset of discovered skyline points. Experiments showed the effectiveness of our algorithm compared to previous
works. 相似文献
3.
反向最近邻查询已成为空间查询的热点问题,而障碍物在实际应用中是不可避免的,因而在障碍物环境中的反向最近邻查询也成为重要的空间查询。已有的可视反向最近邻查询只考虑了可视性,并没有考虑最小障碍距离。提出一种障碍物环境中新的反向最近邻查询的变体,查找障碍距离最小的反向最近邻,即障碍反向最近邻查询。利用障碍距离的计算和相应的剪枝规则,给出障碍反向最近邻查询的算法及相关定理和证明。 相似文献
4.
在充分认识到k阶Voronoi图在解决连续k个近邻查询优越性和现实不可行性的基础上,用分支限界的思想去界定预创建Voronoi图生成点范围的上界,提出了一种动态地创建局部Voronoi图的办法解决连续近邻查询问题。该方法只是在给定查询段上所有点的k个近邻范围上界内创建一个局部的k阶Voronoi图,这样大大降低了基于Voronoi图的连续k近邻查询的代价。 相似文献
5.
Travel planning and recommendation have received significant attention in recent years. In this light, we study a novel problem of discovering probabilistic nearest neighbors and planning the corresponding travel routes in traffic-aware spatial networks (TANN queries) to avoid potential time delay/traffic congestions. We propose and study four novel probabilistic TANN queries. Thereinto two queries target at minimizing the travel time, including a congestion-probability threshold query, and a time-delay threshold query, while another two travel-time threshold queries target at minimizing the potential time delay/traffic congestion. We believe that TANN queries are useful in many real applications, such as discovering nearby points of interest and planning convenient travel routes for users, and location based services in general. The TANN queries are challenged by two difficulties: (1) how to define probabilistic metrics for nearest neighbor queries in traffic-aware spatial networks, and (2) how to process these TANN queries efficiently under different query settings. To overcome these challenges, we define a series of new probabilistic metrics and develop four efficient algorithms to compute the TANN queries. The performances of TANN queries are verified by extensive experiments on real and synthetic spatial data. 相似文献
6.
7.
k-nearest neighbor (k-NN) queries are well-known and widely used in a plethora of applications. However, in the original definition of k-NN queries there is no concern regarding diversity of the answer set with respect to the user’s interests. For instance, travelers may be looking for touristic sites that are close to where they are, but that would also lead them to see different parts of the city. Likewise, if one is looking for restaurants close by, it may be more interesting to learn about restaurants of different categories or ethnicities which are nonetheless relatively close. The interesting novel aspect of this type of query is that there are two competing criteria to be optimized: closeness and diversity. We propose two approaches that leverage the notion of linear skyline queries in order to find the k diverse nearest neighbors within a radius r from a given query point, or (k, r)-DNNs for short. Our proposed approaches return a relatively small set containing all optimal solutions for any linear combination of the weights a user could give to the two competing criteria, and we consider three different notions of diversity: spatial, categorical and angular. Our experiments, varying a number of parameters and exploring synthetic and real datasets, in both Euclidean space and road networks, respectively, show that our approaches are several orders of magnitude faster than a straightforward approach. 相似文献
8.
Recently, Reverse k Nearest Neighbors (RkNN) queries, returning every answer for which the query is one of its k nearest neighbors, have been extensively studied on the database research community. But the RkNN query cannot retrieve spatio-textual objects which are described by their spatial location and a set of keywords. Therefore, researchers proposed a RSTkNN query to find these objects, taking both spatial and textual similarity into consideration. However, the RSTkNN query cannot control the size of answer set and to be sorted according to the degree of influence on the query. In this paper, we propose a new problem Ranked Reverse Boolean Spatial Keyword Nearest Neighbors query called Ranked-RBSKNN query, which considers both spatial similarity and textual relevance, and returns t answers with most degree of influence. We propose a separate index and a hybrid index to process such queries efficiently. Experimental results on different real-world and synthetic datasets show that our approaches achieve better performance. 相似文献
9.
针对基于道路网络的多用户连续k近邻查询处理,提出了一种可伸缩的多用户连续查询处理(scalable processing of multiple continuous queries,SPMCQ)框架.SPMCQ框架采用流水线处理策略,将连续k近邻查询执行分解为可同时作业的预处理、查询执行和结果分发3个阶段,利用多线程技术提高查询处理的并行性.基于SPMCO框架,分别利用基于内存的哈希表和线性链表结构对移动对象位置和道路网络有向图模型进行存储和管理,提出了多连续k近邻查询处理SCkNN算法.实验结果表明,在处理多用户连续k近邻查询时,该算法性能优于目前的道路网络连续k近邻查询处理算法. 相似文献
10.
针对基于k近邻的协同过滤推荐算法中存在的评分特征数据维度过高、k近邻查找速度慢,以及评分冷启动等问题,提出基于数据降维与精确欧氏局部敏感哈希(E2LSH)的k近邻协同过滤推荐算法。首先,融合评分数据、用户属性数据以及项目类别数据,将融合后的数据作为输入对堆叠降噪自编码(SDA)神经网络进行训练,取神经网络编码部分最后一个隐层的值作为输入数据的特征编码,完成非线性降维。然后,利用精确欧氏局部敏感哈希算法对降维后的数据建立索引,通过检索得到目标用户或目标项目的相似近邻。最后,计算目标与近邻之间的相似度,利用相似度对近邻的评分记录加权得到目标用户对目标项目的预测评分。在标准数据集上的实验结果表明,在冷启动场景下,均方根误差比基于局部敏感哈希的推荐算法(LSH-ICF)平均降低了约7.2%,平均运行时间和LSH-ICF相当。表明该方法在保证推荐效率的前提下,缓解了评分冷启动问题。 相似文献
11.
Continuous aggregate nearest neighbor queries 总被引:1,自引:0,他引:1
This paper addresses the problem of continuous aggregate nearest-neighbor (CANN) queries for moving objects in spatio-temporal data stream management systems. A CANN query specifies a set of landmarks, an integer k, and an aggregate distance function f (e.g., min, max, or sum), where f computes the aggregate distance between a moving object and each of the landmarks. The answer to this continuous query is the set of k moving objects that have the smallest aggregate distance f. A CANN query may also be viewed as a combined set of nearest neighbor queries. We introduce several algorithms to continuously and incrementally answer CANN queries. Extensive experimentation shows that the proposed operators outperform the state-of-the-art algorithms by up to a factor of 3 and incur low memory overhead. 相似文献
12.
赵传钢 《计算机工程与应用》2011,47(28):173-175
已有的关于k近邻测度学习算法的工作主要集中于纯区分模型。在假定隐含的生成模型已知的情况下,提出了一种通过分析样本的k个近邻点的概率密度学习测度的方法。实验表明,这种基于类的生成模型假设学习到的局部测度可以有效改善kNN区分模型的性能。 相似文献
13.
针对批次过程非线性、多模态等特征,提出一种基于判别核主元k近邻(Dis-kPCkNN)的故障检测方法。首先,在核主元分析(kPCA)中,高斯核的窗宽参数依据样本类别标签在类内窗宽和类间窗宽中判别选取,使得核矩阵能有效提取数据的关联特征,保持数据的类别信息;其次,在核主元空间中引用k近邻规则代替传统的T2统计方法,k近邻规则可以有效处理主元空间非线性和多模态等特征的故障检测问题。数值模拟实例和半导体蚀刻工艺过程仿真实验表明:基于判别核主元k近邻方法可以有效地处理具有非线性和多模态结构特征的故障检测问题,提高计算的效率,减少内存的占用,并且故障检测率明显优于传统方法。 相似文献
14.
The use of Voronoi diagram has traditionally been applied to computational geometry and multimedia problems. In this paper,
we will show how Voronoi diagram can be applied to spatial query processing, and in particular to Reverse Nearest Neighbor (RNN) queries. Spatial and geographical query processing, in general, and RNN in particular, are becoming more important,
as online maps are now widely available. In this paper, using the concept of Voronoi diagram, we classify RNN into four types
depending on whether the query point and the interest objects are the generator points of the Voronoi Polygon or not. Our approach is based on manipulating Network Voronoi Diagram properties and applying a progressive incremental network expansion for finding the polygon inner network distances required
to solve RNN queries. Our experimentation results show that our approaches have good response times in answering RNN queries. 相似文献
15.
16.
Aggregate nearest neighbor queries in road networks 总被引:5,自引:0,他引:5
Yiu M.L. Mamoulis N. Papadias D. 《Knowledge and Data Engineering, IEEE Transactions on》2005,17(6):820-833
Aggregate nearest neighbor queries return the object that minimizes an aggregate distance function with respect to a set of query points. Consider, for example, several users at specific locations (query points) that want to find the restaurant (data point), which leads to the minimum sum of distances that they have to travel in order to meet. We study the processing of such queries for the case where the position and accessibility of spatial objects are constrained by spatial (e.g., road) networks. We consider alternative aggregate functions and techniques that utilize Euclidean distance bounds, spatial access methods, and/or network distance materialization structures. Our algorithms are experimentally evaluated with synthetic and real data. The results show that their relative performance depends on the problem characteristics. 相似文献
17.
为了提高FD-kNN针对潜隐变量在非线性和多模态过程中的故障检测能力,提出一种基于方差最大化旋转变换的k近邻故障检测与诊断策略。首先,通过方差最大化方法建立旋转变换将原始数据变换到新的正交空间;接下来在该正交空间中执行FD-kNN方法进行故障检测;最后,结合贡献图方法给出基于贡献图的故障诊断策略。通过一个非线性模拟实例,证明方法对潜隐变量故障诊断是有效的;同时,在典型非线性工业过程田纳西过程进行测试,与PCA、FD-kNN和PC-kNN等方法进行对比,实验结果进一步证明了方法的有效性。 相似文献
18.
董洪伟 《计算机工程与应用》2007,43(21):52-56
给定一个度量空间中的一组数据点集,k邻域问题在于对于某个数据点求出按照该空间的距离度量离数据点最近的k个数据样本。目前主要有2种方法,一种是基于立方体分割形成的三维立方体体素索引数组的体素栅格(CG(Cell Grid)方法,另一种方法是基于树索引结构的方法如kd-Tree等。论文主要研究经典CG方法及解决其内存消耗过多问题的两个改进方法:排序体素栅格(SCG)方法和投影体素栅格(PCG)方法。CG、SCG、PCG算法采用了改进的搜索方法,避免了传统CG算法[2-4]可能得到错误k邻域的问题。对三种算法的时空性能进行了分析比较,给出了相应的实验比较数据。 相似文献
19.
空间数据库中反向最近邻查询在低维查询时一般利用基于R-Tree的改进树作为索引结构,由于树型索引结构本身的限制,R-Tree等索引结构的查询在高维中都会出现维数灾难。针对这个问题,提出了一种基于VARdnn-Tree的索引结构,采用量化压缩的方法存储数据,能够有效地支持高维查询。 相似文献
20.
Nearest and reverse nearest neighbor queries for moving objects 总被引:4,自引:0,他引:4
Rimantas Benetis Christian S. Jensen Gytis Karĉiauskas Simonas Ŝaltenis 《The VLDB Journal The International Journal on Very Large Data Bases》2006,15(3):229-249
With the continued proliferation of wireless communications and advances in positioning technologies, algorithms for efficiently
answering queries about large populations of moving objects are gaining interest. This paper proposes algorithms for k nearest and reverse k nearest neighbor queries on the current and anticipated future positions of points moving continuously in the plane. The
former type of query returns k objects nearest to a query object for each time point during a time interval, while the latter returns the objects that have
a specified query object as one of their k closest neighbors, again for each time point during a time interval. In addition, algorithms for so-called persistent and
continuous variants of these queries are provided. The algorithms are based on the indexing of object positions represented
as linear functions of time. The results of empirical performance experiments are reported. 相似文献