首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
连续最近邻查询方法研究   总被引:3,自引:0,他引:3  
本文分析了目前进行连续最近邻查询的几种方法,并由该问题的几何特征入手,提出了基于R-tree的查询算法,可以避免分割点的丢失和高代价的查询,能够有效地完成移动对象的连续最近邻查询.  相似文献   

2.
Nearest neighbor query is one of the most important operations in spatial databases and their application domains, such as location-based services and advanced traveler information systems. This paper addresses the problem of finding the in-route nearest neighbor (IRNN) for a query object tuple which consists of a given route with a destination and a current location on it. The IRNN is a facility instance via which the detour from the original route on the way to the destination is smallest. This paper addresses four alternative solution methods. Comparisons among them are presented using an experimental framework. Extensive experiments using real road map datasets are conducted to examine the behaviors of the solutions in terms of five parameters affecting the performance. The overall experiments show that our strategy to reduce the expensive path computations to minimize the response time is reasonable. The spatial distance join-based method always shows better performance with fewer path computations compared to the recursive methods. The computation costs for all methods except the precomputed zone-based method increase with increases in the road map size and the query route length but decrease with increases in the facility density. The precomputed zone-based method shows the most efficiency when there are no updates on the road map.  相似文献   

3.
现有基于近邻图的近似最近邻搜索(ANNS)算法通常将数据库中被检索向量组织成近邻图结构,根据用户设定参数搜索查询向量的近似最近邻。为提升基于近邻图的ANNS算法在给定召回率下的搜索效率,提出一种参数自适应方法AdaptNNS。采集数据库中的被检索向量并对采样结果进行聚类,利用聚类中心向量和最近邻分类器提取查询负载特征,同时将查询负载特征与不同的召回率相结合作为输入特征训练梯度提升决策树(GBDT)模型。在查询处理过程中,根据应用程序指定的召回率获取最终输入特征,并通过GBDT模型预测最优搜索参数,提升ANNS算法的吞吐量。在Text-to-Image、DEEP和Turing-ANNS数据集上的实验结果表明,当达到相同的目标召回率时,AdaptNNS方法相比于Baseline方法最多可将DiskANN和HNSW算法的吞吐量提升1.3倍,具有更高的近似最近邻搜索效率。  相似文献   

4.
Given a set of data points P and a query point q in a multidimensional space, reverse nearest neighbor (RNN) query finds data points in P whose nearest neighbors are q. Reverse k-nearest neighbor (RkNN) query (where k ges 1) generalizes RNN query to find data points whose kNNs include q. For RkNN query semantics, q is said to have influence to all those answer data points. The degree of q's influence on a data point p (isin P) is denoted by kappap where q is the kappap-th NN of p. We introduce a new variant of RNN query, namely, ranked reverse nearest neighbor (RRNN) query, that retrieves t data points most influenced by q, i.e., the t data points having the smallest kappa's with respect to q. To answer this RRNN query efficiently, we propose two novel algorithms, kappa-counting and kappa-browsing that are applicable to both monochromatic and bichromatic scenarios and are able to deliver results progressively. Through an extensive performance evaluation, we validate that the two proposed RRNN algorithms are superior to solutions derived from algorithms designed for RkNN query.  相似文献   

5.
多对象的最近邻查询   总被引:7,自引:1,他引:6  
最近邻查询是地理信息系统等领域经常遇到的问题,该文在最近邻查询的基础上提出一种多个对象的最近邻查询,并利用已有的R-树最近邻查询方法实现多个对象的查询,该方法不同于普通的最近邻查询,是一种新的方法,在实际应用中也很有意义。  相似文献   

6.
It is well-known that in order to build a strong ensemble, the component learners should be with high diversity as well as high accuracy. If perturbing the training set can cause significant changes in the component learners constructed,then Bagging can effectively improve accuracy. However, for stable learners such as nearest neighbor classifiers, perturbing the training set can hardly produce diverse component learners, therefore Bagging does not work well. This paper adapts Bagging to nearest neighbor classifiers through injecting randomness to distance metrics. In constructing the component learners, both the training set and the distance metric employed for identifying the neighbors are perturbed. A large scale empirical study reported in this paper shows that the proposed BagInRand algorithm can effectively improve the accuracy of nearest neighbor classifiers.  相似文献   

7.
基于Voronoi图的组最近邻查询   总被引:1,自引:0,他引:1  
组最近邻查询由于涉及多个查询点,因此比传统的最近邻查询更为复杂.充分考虑查询点的分布特征以及它们构成的几何图形的性质和特点,给出组最近邻所应满足的条件及判断组最近邻的理论方法.提出基于Voronoi图的组最近邻查询的VGNN算法,可以精确求解查询点集的最近邻.对于查询点不共线的情况,该算法的查询方式是以一点为中心、向外扩张式的;对于查询点共线的情况,该算法给出搜索范围,限定了参与计算的数据点的个数.给出基于Voronoi图的VTree索引.实验结果表明,基于VTree索引的VGNN算法具有较好的性能,并且当查询点不共线时,其性能具有较高的稳定性.  相似文献   

8.
基于Voronoi图的反向最近邻查询   总被引:1,自引:1,他引:0       下载免费PDF全文
刘润涛  张佳佳 《计算机工程》2009,35(19):81-82,8
为了解决反向最近邻查询问题,利用Voronoi图及数据集中点的凸包进行反向最近邻查询,通过判断查询点与凸包的位置关系,可去除大量的数据点,并且给出在数据点被加入或删除后,对查询点的反向最近邻变化情况的判断方法与算法。为了便于查询,设计相应的空间存储数据结构。比较分析表明,该方法在处理多个查询点的反向最近邻时有一定的优势。  相似文献   

9.
Nearest Neighbor Queries in Shared-Nothing Environments   总被引:2,自引:0,他引:2  
In this paper, we propose an efficient solution to the problem of nearest neighbor query processing in declustered spatial databases. Recently a branch-and-bound nearest neighbor finding (BB-NNF) algorithm has been designed to process nearest neighbor queries in R-trees. However, this algorithm is strictly serial (branch-and-bound oriented) and its performance degrades, during processing of a nearest neighbor query, if applied to a parallel environment, since it does not exploit any kind of parallelization. We develop an efficient query processing strategy for parallel nearest neighbor finding (P-NNF), assuming a shared nothing multi-processor architecture, where the processors communicate via a network. In our method, the relevant sites are activated simultaneously. In order to achieve this goal, statistical information is used. The efficiency measure is the response time of a given query. Experimental results, based on real-life and synthetic datasets, show that the proposed method outperforms the branch-and-bound method by factors.  相似文献   

10.
论文提出一种等和值块扩展最近邻矢量量化码字搜索算法。该算法将码书按和值大小排序分块,并将每一块中间或中间附近的码字的和值作为本码书块的特征和值。编码时,查找与输入矢量和值距离最近的码书块并作为初始匹配码书块。然后在该码书块附近上下扩展搜索相邻码书块中距输入矢量最近的码字。该算法具有无复杂运算的特点,易于VLSI技术实现。仿真结果表明,该算法是一种有效的码字搜索算法。  相似文献   

11.
K-近邻(K-Nearest Neighbors,K-NN)是一种懒惰学习算法,用K-NN对数据分类时,不需要训练分类模型。K-NN算法的优点是思想简单、易于实现;缺点是计算量大,原因是在对测试样例进行分类时,其需要计算测试样例与训练集中每一个训练样例之间的距离。压缩近邻算法(Condensed Nearest Neighbors,CNN)可以克服K-NN算法的不足。但是,在面对大数据集时,由于自身的迭代计算特性,CNN的运算效率会变得非常低。针对这一问题,提出一种名为Spark CNN的压缩近邻算法。在大数据环境下,与基于MapReduce的CNN算法相比,Spark CNN的效率大幅提高,在5个大数据集上的实验证明了这一结论。  相似文献   

12.
针对伪近邻分类算法(LMPNN)对异常点和噪声点仍然敏感的问题,提出了一种基于双向选择的伪近邻算法(BS-PNN)。利用邻近性度量选取[k]个最近邻,让测试样本和近邻样本通过互近邻定义进行双向选择;通过计算每类中互近邻的个数及其局部均值的加权距离,从而得到测试样本到伪近邻的欧氏距离;利用改进的类可信度作为投票度量方式,对测试样本进行分类。BS-PNN算法在处理复杂的分类任务时,具有能够准确识别噪声点,降低近邻个数[k]的敏感性,提高分类精度等优势。在UCI和KEEL的15个实际数据集上进行仿真实验,并与KNN、WKNN、LMKNN、PNN、LMPNN、DNN算法以及P-KNN算法进行比较,实验结果表明,基于双向选择的伪近邻算法的分类性能明显优于其他几种近邻分类算法。  相似文献   

13.
Selective Sampling for Nearest Neighbor Classifiers   总被引:3,自引:0,他引:3  
Most existing inductive learning algorithms work under the assumption that their training examples are already tagged. There are domains, however, where the tagging procedure requires significant computation resources or manual labor. In such cases, it may be beneficial for the learner to be active, intelligently selecting the examples for labeling with the goal of reducing the labeling cost. In this paper we present LSS—a lookahead algorithm for selective sampling of examples for nearest neighbor classifiers. The algorithm is looking for the example with the highest utility, taking its effect on the resulting classifier into account. Computing the expected utility of an example requires estimating the probability of its possible labels. We propose to use the random field model for this estimation. The LSS algorithm was evaluated empirically on seven real and artificial data sets, and its performance was compared to other selective sampling algorithms. The experiments show that the proposed algorithm outperforms other methods in terms of average error rate and stability.  相似文献   

14.
CPM(conceptual partitioning monitoring)是一种较为高效的概念划分网格的思想,用以解决二维空间下的连续最近邻查询问题.在此思想的基础上提出一种采用树形结构来索引概念划分网格的连续最近邻查询算法T-CPM,通过一系列改进步骤,提升了这一算法的查询效率.实验证明,相比经典的算法,T-CPM优化了网格的检索顺序并节省了计算代价.此外,验证了将这一新的方法延伸到基于不确定空间对象的连续最近邻查询问题中,以此给出了一种针对动态不确定空间数据最近邻查询问题的思路和方法.  相似文献   

15.
Li  Xinyu  Hidayat  Arif  Taniar  David  Cheema  Muhammad Aamir 《World Wide Web》2021,24(1):279-296
World Wide Web - Reverse k Nearest Neighbor (RkNN) queries retrieve all objects that consider the query as one of their k most influential objects. Given a set of user U, a set of facilities F and...  相似文献   

16.
连续可见最近邻查询是查询连续空间的最近邻问题,目前的研究基本以二维空间为背景并提出了一些查询算法,但可见性判断方法不能适用于三维或高维空间.以陆地表面的三维数据为研究背景,提出了一种查询地表任意路径的连续可见最近邻方法.该方法以计算步长的方式把整个查询路径分割成若干个连续的查询子路径,循环计算每个子路径的连续可见最近邻直至得到整个路径的查询结果.该方法可以扩展应用于高维空间中的连续最近邻查询.  相似文献   

17.
反向最近邻查询是空间数据库空间查询的研究热点。目前反向最近邻查询的查询粒度都是基于一维的点,在一些空间物体不能抽象为点的情况下将其抽象为点进行反向最近邻查询,查询结果不能达到一定的精度。该文在分析基于平面线段的最近邻查询和R树结构的基础上提出了一种改进的R树—Rcd树,并给出了基于Rcd树的平面线段反向最近邻查询算法,该方法能实现平面线段的反向最近邻查询。  相似文献   

18.
朱庆生  唐汇  冯骥 《计算机科学》2014,41(3):276-278,305
任何涉及k近邻求解问题的算法被应用于处理不同特征的数据集时,参数k值的选择都会明显影响算法的性能和结果。因而,如何选择k近邻算法中敏感参数k值一直是一个研究难点。提出了一种新的近邻关系———自然最近邻,它不需要设置参数k,每个节点的邻居是由算法自适应计算而形成的。针对离群点检测的特殊性,通过确定自然最近邻居搜索算法的终止条件,提出一种基于自然最近邻的新的离群检测算法ODb3N。实验表明,该算法不仅避免了k近邻中参数的选择问题,而且能够更有效地发现离群簇。  相似文献   

19.
将查询点作为Delaunay图的一个生成点,利用Delaunay图的生成点与其邻接生成点之间的关系,在查询点的邻接生成点集(元素个数小于等于6)中计算数据集中给定点的反向最近邻。把伴随Delaunay图增量生成过程产生的Delaunay树作为查询索引结构,该结构能存储Delaunay图,在数据点插入和删除时维护Delaunay图的拓扑结构。  相似文献   

20.
基于数据挖掘的组合近邻模型算法   总被引:3,自引:0,他引:3       下载免费PDF全文
针对数据挖掘的组合模型问题,研究了组合模型的理论和技术,分析了组合理论在近邻法的应用现状,提出了一种通过随机属性子集组合近邻分类器的算法MNN,利用简单的投票方法,通过一个随机的属性子集来组合多重近邻分类器,对多重NN分类器的输出进行组合,MNN方法能有效地改进近邻法的分类精度。MNN方法与NN-E000相比,有两个主要的优点:(1) MNN是一个更简单的方法;(2) MNN不受多类问题的限制。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号