Similar literature
 Found 20 similar articles; search time 15 ms
1.
2.
3.
The Journal of Supercomputing - This study proposes an efficient exact k-flexible aggregate nearest neighbor (k-FANN) search algorithm in road networks using the M-tree. The state-of-the-art...

4.
A simple algorithm for nearest neighbor search in high dimensions   Total citations: 7, self-citations: 0, citations by others: 7
The problem of finding the closest point in high-dimensional spaces is common in pattern recognition. Unfortunately, the complexity of most existing search algorithms, such as k-d tree and R-tree, grows exponentially with dimension, making them impractical for dimensionality above 15. In nearly all applications, the closest point is of interest only if it lies within a user-specified distance ϵ. We present a simple and practical algorithm to efficiently search for the nearest neighbor within Euclidean distance ϵ. The use of projection search combined with a novel data structure dramatically improves performance in high dimensions. A complexity analysis is presented which helps to automatically determine ϵ in structured problems. A comprehensive set of benchmarks clearly shows the superiority of the proposed algorithm for a variety of structured and unstructured search problems. Object recognition is demonstrated as an example application. The simplicity of the algorithm makes it possible to construct an inexpensive hardware search engine which can be 100 times faster than its software equivalent. A C++ implementation of our algorithm is available
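
The idea of searching only within distance ϵ can be illustrated with a minimal Python sketch. This is not the authors' data structure: it assumes points sorted along the first coordinate, trims candidates to the hypercube of side 2ϵ around the query one coordinate at a time, and then checks Euclidean distance on the survivors. The names `build_index` and `nearest_within_eps` are hypothetical.

```python
import bisect
import math

def build_index(points):
    """Preprocessing (assumed): sort point indices along the first coordinate."""
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    keys = [points[i][0] for i in order]
    return order, keys

def nearest_within_eps(points, index, query, eps):
    """Return (point index, distance) of the nearest neighbor within eps, or None."""
    order, keys = index
    # Binary search for the slab [q0 - eps, q0 + eps] along the first coordinate.
    lo = bisect.bisect_left(keys, query[0] - eps)
    hi = bisect.bisect_right(keys, query[0] + eps)
    candidates = order[lo:hi]
    # Trim with the remaining coordinates: keep points inside the hypercube.
    for d in range(1, len(query)):
        candidates = [i for i in candidates if abs(points[i][d] - query[d]) <= eps]
    # Exhaustive Euclidean check on the (hopefully short) candidate list.
    best, best_dist = None, eps
    for i in candidates:
        dist = math.dist(points[i], query)
        if dist <= best_dist:
            best, best_dist = i, dist
    if best is None:
        return None
    return best, best_dist

points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (1.2, 0.9)]
index = build_index(points)
hit = nearest_within_eps(points, index, (1.1, 1.0), 0.5)
miss = nearest_within_eps(points, index, (1.1, 1.0), 0.05)
```

Here `hit` identifies point 1 and `miss` is `None`, since no point lies within 0.05 of the query. The cost depends on how many points fall inside the hypercube, which is why the choice of ϵ matters.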

5.
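To address the clustering of interval-valued data, the concept of extension distance is introduced, based on the definition of the dependent function in extenics, to measure the distance between data items. Following the K-nearest-neighbor idea, the target attribute of the data set is classified by voting according to extension distance, yielding the Extension K Nearest Neighbor (EKNN) algorithm. The algorithm is validated on two UCI benchmark data sets, the Iris plant sample data and the diabetes database PIDD: an immune network reduction algorithm first computes minimal attribute reductions of the condition attributes, and EKNN is then used to analyze and compare the classification accuracy obtained with the different minimal reduced attribute sets.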

6.
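The traditional probabilistic roadmap (PRM) algorithm is very time-consuming when a mobile robot's working environment is large and complex; to address this, an improved PRM algorithm is proposed. The most time-consuming part of PRM is building the undirected roadmap graph, and the key to building that graph is nearest neighbor search. By replacing the original nearest neighbor search with locality-sensitive hashing, an approximate nearest neighbor search method, construction of the undirected roadmap is accelerated and the running time of PRM is reduced without degrading the quality of the generated roadmap. Simulation results show that, compared with the traditional PRM algorithm, the improved algorithm reduces roadmap construction time by 27.36% to 33.27%, greatly improving the efficiency of PRM.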
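
As a rough illustration of how locality-sensitive hashing can stand in for exact neighbor search when connecting roadmap nodes, the sketch below hashes points with random projections and only compares a query against points in the same bucket. The hash family, the single hash table, and all names (`RandomProjectionLSH`, `bucket_width`, and so on) are assumptions for illustration; the abstract does not specify the LSH scheme used.

```python
import math
import random
from collections import defaultdict

class RandomProjectionLSH:
    """Single-table LSH with hashes of the form floor((a . v + b) / w)."""

    def __init__(self, dim, num_hashes=4, bucket_width=1.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # One random Gaussian direction and one random offset per hash function.
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)]
        self.b = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)
        self.points = []

    def _key(self, p):
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, p)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def add(self, p):
        self.points.append(p)
        self.buckets[self._key(p)].append(len(self.points) - 1)

    def near(self, q, radius):
        """Indices of stored points in q's bucket that lie within radius of q."""
        candidates = self.buckets.get(self._key(q), [])
        return [i for i in candidates if math.dist(self.points[i], q) <= radius]

lsh = RandomProjectionLSH(dim=2, seed=1)
samples = [(0.0, 0.0), (0.05, 0.0), (9.0, 9.0)]
for s in samples:
    lsh.add(s)
neighbors = lsh.near((0.0, 0.0), 0.1)
```

The result is approximate: a true neighbor that hashes to a different bucket is missed. Practical LSH indexes usually query several independent tables to reduce such misses.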

7.
8.
With the recent surge in the use of location-based services (LBSs), the importance of spatial database queries has increased. The reverse nearest neighbor (RNN) search is one of the most popular spatial database queries. In most previous studies, the spatial distance is used to measure the distance between objects. However, as the demands of LBS users become more complex, considering only the spatial factor as a distance measure is not sufficient. For example, users of a hotel finding service want to choose a hotel by considering not only the spatial distance but also non-spatial aspects of the hotel, such as its quality, which can be represented by the number of stars. Services that consider both spatial and non-spatial factors in measuring distance are therefore more useful for users. In such cases, the techniques proposed in previous studies cannot be used because the distance measure is different. In this paper, we propose an efficient method for the RNN search in which the distance measure involves both the spatial distance and the non-spatial aspect of an object. We conduct extensive experiments on a large dataset to evaluate the efficiency of the proposed method. The experimental results show that the proposed method is efficient and scalable.
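
For readers unfamiliar with the query itself, here is a brute-force Python sketch of a (monochromatic) reverse nearest neighbor search under plain Euclidean distance: it returns the data points that are at least as close to the query as to any other data point. It is a baseline for illustration only, not the proposed method, and it does not model the combined spatial and non-spatial distance, which the abstract leaves unspecified. The function name is hypothetical.

```python
import math

def reverse_nearest_neighbors(points, query):
    """Indices of points whose nearest neighbor (among points and query) is query."""
    result = []
    for i, p in enumerate(points):
        d_query = math.dist(p, query)
        # p is an RNN of query if no other data point is strictly closer to p.
        if all(d_query <= math.dist(p, other)
               for j, other in enumerate(points) if j != i):
            result.append(i)
    return result

hotels = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
rnn = reverse_nearest_neighbors(hotels, (1.8, 0.0))
```

In this example only the point at (1.0, 0.0) has the query as its nearest neighbor. The brute-force version costs quadratic time in the number of points, which is what index-based RNN methods aim to avoid.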

9.
The k nearest neighbor (kNN) algorithm is a lazy learning algorithm that is inefficient in the classification phase because it must compare the query sample with all training samples. A recently proposed template reduction method uses only samples near the decision boundary for classification and removes those far from it. However, when class distributions overlap, more border samples are retained, which makes the classification phase inefficient. Because the number of reduced samples is limited, using an appropriate feature reduction method seems a logical choice to improve classification time. This paper proposes a new prototype reduction method for the k nearest neighbor algorithm, based on template reduction and ViSOM. ViSOM can display the topology of the data on a two-dimensional feature map, which provides an intuitive way for users to observe and analyze data. An efficient classification framework is then presented, which combines the feature reduction method and the prototype selection algorithm. It needs only a very small data set for classification while preserving the recognition rate. In the experiments, both synthetic and real datasets are used to evaluate performance. Experimental results demonstrate that the proposed method obtains above 70% speedup ratio and 90% compression ratio while maintaining performance similar to kNN.

10.
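Image matching algorithm based on SURF and fast approximate nearest neighbor search   Total citations: 3, self-citations: 1, citations by others: 3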
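To address the low accuracy of nearest neighbor matching for high-dimensional feature vectors, an image matching algorithm based on SURF and fast approximate nearest neighbor search is proposed. First, the Fast-Hessian detector is used to detect feature points, and SURF descriptor vectors are generated. Next, a fast approximate nearest neighbor search algorithm produces the initial matched point pairs, and the resulting one-way matches are then verified by bidirectional matching. Finally, the robust PROSAC algorithm is used to further remove mismatched point pairs. Experiments show that the algorithm improves the matching accuracy of SURF while preserving real-time performance.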
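
The bidirectional matching step can be sketched in a few lines of Python. For simplicity the sketch uses exact brute-force nearest neighbor search on toy descriptors as a stand-in for the fast approximate search; the function names and the toy data are hypothetical, and descriptor extraction and PROSAC are not shown.

```python
import math

def nearest(descriptor, pool):
    """Index of the descriptor in pool closest to the given descriptor."""
    return min(range(len(pool)), key=lambda j: math.dist(descriptor, pool[j]))

def cross_check_matches(desc_a, desc_b):
    """Keep pair (i, j) only if j is i's best match AND i is j's best match."""
    forward = [nearest(d, desc_b) for d in desc_a]    # image A -> image B
    backward = [nearest(d, desc_a) for d in desc_b]   # image B -> image A
    return [(i, j) for i, j in enumerate(forward) if backward[j] == i]

desc_a = [(0.0, 0.0), (10.0, 10.0), (0.4, 0.0)]
desc_b = [(0.1, 0.0), (10.0, 10.2)]
matches = cross_check_matches(desc_a, desc_b)
```

Descriptor 2 of image A also picks descriptor 0 of image B as its best match, but that match is rejected because descriptor 0 of image B prefers descriptor 0 of image A; only the mutually consistent pairs (0, 0) and (1, 1) remain.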

11.
Cluster analysis plays an important role in identifying the natural structure of a target dataset. It has been widely used in many fields, such as pattern recognition, machine learning, image segmentation, and document clustering. There are many different methods for cluster analysis. However, most real datasets are non-spherical and have complex shapes, and although these methods are widely used for clustering tasks, they are susceptible to noise and arbitrary shapes. We therefore propose a novel clustering algorithm (called RNN-NSDC), which is based on the natural reverse nearest neighbor structure. First, we apply reverse nearest neighbors to extract core objects. Second, the algorithm uses the neighbor structure information of the core objects to cluster. Because noise effects are excluded, the core sets represent the structure of the clusters well. RNN-NSDC can therefore obtain the optimal number of clusters for datasets that contain outliers and clusters of arbitrary shape. To verify the efficiency and accuracy of RNN-NSDC, synthetic and real datasets are used for experiments. The results indicate the superiority of RNN-NSDC compared with K-means, DBSCAN, DPC, SNNDPC, DCore and NaNLORE.

12.
In this paper, we present a fast and versatile algorithm which can rapidly perform a variety of nearest neighbor searches. Efficiency improvement is achieved by utilizing the distance lower bound to avoid the calculation of the distance itself if the lower bound is already larger than the global minimum distance. At the preprocessing stage, the proposed algorithm constructs a lower bound tree (LB-tree) by agglomeratively clustering all the sample points to be searched. Given a query point, the lower bound of its distance to each sample point can be calculated by using the internal nodes of the LB-tree. To reduce the number of lower bounds actually calculated, the winner-update search strategy is used for traversing the tree. For further efficiency improvement, data transformation can be applied to the sample and the query points. In addition to finding the nearest neighbor, the proposed algorithm can also (i) provide the k-nearest neighbors progressively; (ii) find the nearest neighbors within a specified distance threshold; and (iii) identify neighbors whose distances to the query are sufficiently close to the minimum distance of the nearest neighbor. Our experiments have shown that the proposed algorithm can save substantial computation, particularly when the distance of the query point to its nearest neighbor is relatively small compared with its distance to most other samples (which is the case for many object recognition problems).
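
The lower-bound principle can be shown with a much simpler bound than the LB-tree. The Python sketch below assumes the reverse triangle inequality, | ‖q‖ − ‖p‖ | ≤ ‖q − p‖, as the cheap lower bound and skips the full distance computation whenever that bound already reaches the best distance found so far. It is not the LB-tree or the winner-update strategy; the function name and the `computed` counter are added for illustration.

```python
import math

def nn_with_lower_bound(points, query):
    """Nearest neighbor search that skips points ruled out by a norm-based bound.

    Returns (index, distance, number of full distance computations).
    """
    # In practice the norms would be precomputed once at preprocessing time.
    norms = [math.hypot(*p) for p in points]
    q_norm = math.hypot(*query)
    best_i, best_d, computed = -1, math.inf, 0
    for i, p in enumerate(points):
        lower_bound = abs(q_norm - norms[i])
        if lower_bound >= best_d:
            continue  # true distance >= lower bound >= best, so p cannot win
        computed += 1
        d = math.dist(p, query)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, computed

samples = [(1.0, 0.0), (10.0, 0.0), (0.0, 20.0), (1.5, 0.0)]
best_i, best_d, computed = nn_with_lower_bound(samples, (1.2, 0.0))
```

In this example the first point is already very close to the query, so the bound eliminates the other three points and only one full distance is computed. This matches the abstract's observation that savings are largest when the nearest neighbor is much closer than most other samples.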

13.
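Objective: Large-scale image retrieval is a research hotspot in computer vision. A basic approach is to extract features from all images in the database, define a feature similarity measure, and perform nearest neighbor retrieval. The key is to design a nearest neighbor retrieval algorithm that meets storage and efficiency requirements. To improve the approximation accuracy of image visual features and reduce their storage requirements, a multi-index additive quantization method is proposed. Methods: Linear search has high complexity, and meeting real-time retrieval requirements means keeping the image descriptors in memory, which cannot satisfy the needs of a large-scale retrieval system. Given the advantages of non-linear retrieval, this paper explores a non-exhaustive multi-index structure combined with quantization coding. The multi-index structure partitions the original data space into multiple subspaces and assigns the data items of each subspace to different inverted lists. Additive quantization, a compression coding method, then encodes the residual data items in the inverted lists, further reducing the quantization loss with respect to the original space. During nearest neighbor retrieval, a non-exhaustive search strategy looks for neighbors in only a few inverted lists, which greatly reduces retrieval time. The original data need not be stored during retrieval; only the codeword indices of each data item in the additive quantization codebooks are stored, which greatly reduces memory consumption. Results: To verify the effectiveness of the algorithm, tests were run on three data sets, SIFT, GIST, and MNIST. Recall improved by 4% to 15% over algorithms from recent years, mean precision improved by about 12%, and retrieval time was on par with the fastest algorithm. Conclusion: The proposed multi-index additive quantization coding algorithm effectively improves the approximation accuracy and storage requirements of image visual features, and improves retrieval precision and recall on large-scale data sets. The algorithm targets nearest neighbor retrieval over features and is applicable to nearest neighbor retrieval for massive image collections and other multimedia data.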

14.
As one of the most important techniques in data mining, cluster analysis has attracted more and more attention in the big data era. Most clustering algorithms face challenges including difficulty in determining cluster centers, low clustering accuracy, uneven clustering efficiency across data sets, and sensitive parameter dependence. To address the difficulty of determining cluster centers and the parameter dependence, a novel clustering algorithm with fast cluster center determination is proposed in this paper. It is assumed that cluster centers are those data points with higher density and larger distance from other data points of higher density. Normal distribution curves are designed to fit the distribution curve of the density-distance product. The singular points that fall outside a chosen confidence interval are shown to be cluster centers by theoretical analysis and simulations. Finally, according to these cluster centers, a time scan clustering is designed for the rest of the points by density to complete the clustering. Density radius is a sensitive parameter in calculating the density of each data point, so a mountain climbing algorithm is used to realize a self-adaptive density radius. Many typical benchmark data sets are used to evaluate the performance of the proposed algorithms against other clustering algorithms in terms of both clustering quality and time complexity.
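
The assumption about cluster centers (high density, and far from any point of higher density) can be made concrete with a small Python sketch. It computes a neighbor-count density and the distance to the nearest higher-density point, then ranks points by their product. The fixed `radius`, the count-based density, and the function name are assumptions for illustration; the normal-curve fitting, confidence interval, and adaptive radius from the abstract are not shown.

```python
import math

def density_and_delta(points, radius):
    """Per-point density (neighbors within radius) and distance to a denser point."""
    n = len(points)
    rho = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) < radius)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [math.dist(points[i], points[j])
                  for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            # Densest point(s): use the largest distance to any point by convention.
            delta.append(max(math.dist(points[i], points[j]) for j in range(n)))
    return rho, delta

data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
rho, delta = density_and_delta(data, radius=0.12)
ranked = sorted(range(len(data)), key=lambda i: rho[i] * delta[i], reverse=True)
centers = ranked[:2]
```

With these two small clusters, the two largest density-distance products belong to the points at (0, 0) and (5, 5), which are the natural centers. Taking the top two is a shortcut that assumes the number of clusters is known; the abstract instead selects centers as outliers of a fitted distribution.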

15.
The problem of k nearest neighbors (kNN) is to find the nearest k neighbors for a query point from a given data set. In this paper, a novel fast kNN search method using an orthogonal search tree is proposed. The proposed method creates an orthogonal search tree for a data set using an orthonormal basis evaluated from the data set. To find the kNN for a query point from the data set, projection values of the query point onto orthogonal vectors in the orthonormal basis and a node elimination inequality are applied for pruning unlikely nodes. For a node that cannot be deleted, a point elimination inequality is further used to reject impossible data points. Experimental results show that the proposed method performs well in finding kNN for query points and always requires less computation time than available kNN search algorithms, especially for a data set with a large number of data points or a large standard deviation.
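
A projection-based elimination inequality can be sketched as follows. For any unit vector v, |v·q − v·p| ≤ ‖q − p‖, so a point whose projection gap already reaches the current k-th best distance cannot be among the k nearest neighbors. The Python sketch below uses a single projection axis and a sorted scan instead of the paper's orthogonal search tree; the exact inequalities used in the paper are not given in the abstract, and the function name is hypothetical.

```python
import heapq
import math

def knn_with_projection(points, query, k, axis):
    """k nearest neighbors of query; axis must be a unit vector."""
    def proj(p):
        return sum(a * x for a, x in zip(axis, p))

    q_proj = proj(query)
    # Visit points in order of increasing projection gap to the query.
    order = sorted(range(len(points)), key=lambda i: abs(proj(points[i]) - q_proj))
    heap = []  # max-heap of the k best so far, stored as (-distance, index)
    for i in order:
        gap = abs(proj(points[i]) - q_proj)
        if len(heap) == k and gap >= -heap[0][0]:
            break  # all remaining points have an even larger gap
        d = math.dist(points[i], query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return sorted((-neg_d, i) for neg_d, i in heap)

cloud = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (0.0, 3.0)]
result = knn_with_projection(cloud, (0.9, 0.0), k=2, axis=(1.0, 0.0))
neighbor_ids = [i for _, i in result]
```

Here the scan stops after two full distance computations, and `neighbor_ids` lists point 1 followed by point 0. Using several orthonormal axes, as the abstract describes, tightens the bound because each axis can eliminate points the others cannot.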

16.
Maximizing bichromatic reverse nearest neighbor (MaxBRNN) is a variant of bichromatic reverse nearest neighbor (BRNN). The purpose of the MaxBRNN problem is to find an optimal region that maximizes the size of BRNNs. This problem has many real applications, such as location planning and profile-based marketing. The best-known algorithm for the MaxBRNN problem is called MaxOverlap. In this paper, we study the MaxBRNN problem and propose a new approach called MaxSegment for a two-dimensional space when the $L_2$-norm is used. Then, we extend our algorithm to other variations of the MaxBRNN problem, such as the MaxBRNN problem with other metric spaces and with a three-dimensional space. Finally, we conduct experiments on real and synthetic datasets to compare our proposed algorithm with existing algorithms. The experimental results verify the efficiency of our proposed approach.

17.
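To further improve prediction accuracy, the original Pareto dominance relation between candidate solutions is modified, and a d-Pareto dominance nearest neighbor prediction method is proposed. Taking the characteristics of multi-objective optimization into account, a d-Pareto dominance nearest neighbor prediction framework is presented, and it is shown that d-Pareto dominance prediction has a lower average prediction error rate than Pareto dominance prediction. The interaction between d-Pareto dominance prediction and multi-objective evolutionary algorithms is also studied in a preliminary way. Experiments on several classic multi-objective optimization problems show that d-Pareto dominance prediction is feasible and effective to a certain extent.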

18.
Pattern Analysis and Applications - Nearest neighbor search is a powerful abstraction for data access; however, data indexing is troublesome even for approximate indexes. For intrinsically...

19.
20.
We propose a new scheme to implement gate operations in a one-dimensional linear nearest neighbor array by using a dynamic learning algorithm. This is accomplished by training the quantum system with a back propagation technique to find the system parameters that implement gate operations directly. The key feature of our scheme is that we can reduce the computational overhead of a quantum circuit by finding the parameters to implement the desired gate operation directly, without decomposing it into a sequence of elementary gate operations. We show how the training algorithm can be used as a tool for finding the parameters for implementing controlled-NOT (CNOT) and Toffoli gates between next-to-nearest neighbor qubits in an Ising-coupled linear nearest neighbor system. We then show how the scheme can be used to find parameters for realizing swap gates, first between two adjacent qubits and then between two next-to-nearest-neighbor qubits, in each case without decomposing the gate into 3 CNOT gates. Finally, we show how the scheme can be extended to systems with non-diagonal interactions. To demonstrate, we train a quantum system with Heisenberg interactions to find the parameters to realize a swap operation.
