共查询到20条相似文献,搜索用时 15 毫秒
1.
Yong-Sheng Chen Author Vitae Yi-Ping Hung Author Vitae Ting-Fang Yen Author Vitae 《Pattern recognition》2007,40(2):360-375
In this paper, we present a fast and versatile algorithm which can rapidly perform a variety of nearest neighbor searches. Efficiency improvement is achieved by utilizing the distance lower bound to avoid the calculation of the distance itself if the lower bound is already larger than the global minimum distance. At the preprocessing stage, the proposed algorithm constructs a lower bound tree (LB-tree) by agglomeratively clustering all the sample points to be searched. Given a query point, the lower bound of its distance to each sample point can be calculated by using the internal node of the LB-tree. To reduce the amount of lower bounds actually calculated, the winner-update search strategy is used for traversing the tree. For further efficiency improvement, data transformation can be applied to the sample and the query points. In addition to finding the nearest neighbor, the proposed algorithm can also (i) provide the k-nearest neighbors progressively; (ii) find the nearest neighbors within a specified distance threshold; and (iii) identify neighbors whose distances to the query are sufficiently close to the minimum distance of the nearest neighbor. Our experiments have shown that the proposed algorithm can save substantial computation, particularly when the distance of the query point to its nearest neighbor is relatively small compared with its distance to most other samples (which is the case for many object recognition problems). 相似文献
2.
Alok Aggarwal 《Information Processing Letters》2004,92(3):121-126
We present an O(nlogn) time divide-and-conquer algorithm for solving the symmetric angle-restricted nearest neighbor (SARNN) problem for a set of n points in the plane under any Lp metric, 1?p?∞. This algorithm is asymptotically optimal (within a multiplicative constant) for any constant p?1. 相似文献
3.
In this paper, a novel center-based nearest neighbor (CNN) classifier is proposed to deal with the pattern classification problems. Unlike nearest feature line (NFL) method, CNN considers the line passing through a sample point with known label and the center of the sample class. This line is called the center-based line (CL). These lines seem to have more capacity of representation for sample classes than the original samples and thus can capture more information. Similar to NFL, CNN is based on the nearest distance from an unknown sample point to a certain CL for classification. As a result, the computation time of CNN can be shortened dramatically with less accuracy decrease when compared with NFL. The performance of CNN is demonstrated in one simulation experiment from computational biology and high classification accuracy has been achieved in the leave-one-out test. The comparisons with nearest neighbor (NN) classifier and NFL classifier indicate that this novel classifier achieves competitive performance. 相似文献
4.
As databases increasingly integrate different types of information such as time-series, multimedia and scientific data, it becomes necessary to support efficient retrieval of multi-dimensional data. Both the dimensionality and the amount of data that needs to be processed are increasing rapidly. As a result of the scale and high dimensional nature, the traditional techniques have proven inadequate. In this paper, we propose search techniques that are effective especially for large high dimensional data sets. We first propose VA+-file technique which is based on scalar quantization of the data. VA+-file is especially useful for searching exact nearest neighbors (NN) in non-uniform high dimensional data sets. We then discuss how to improve the search and make it progressive by allowing some approximations in the query result. We develop a general framework for approximate NN queries, discuss various approaches for progressive processing of similarity queries, and develop a metric for evaluation of such techniques. Finally, a new technique based on clustering is proposed, which merges the benefits of various approaches for progressive similarity searching. Extensive experimental evaluation is performed on several real-life data sets. The evaluation establishes the superiority of the proposed techniques over the existing techniques for high dimensional similarity searching. The techniques proposed in this paper are effective for real-life data sets, which are typically non-uniform, and they are scalable with respect to both dimensionality and size of the data set. 相似文献
5.
6.
A parametrically-directed nearest neighbor procedure is developed to reduce the error between asymptotic and finite sample risk. Two examples are given. 相似文献
7.
Xiaobin MaAuthor Vitae Chengyang ZhangAuthor Vitae 《Data & Knowledge Engineering》2011,70(11):955-983
This paper presents a study of the Multi-Type Reverse Nearest Neighbor (MTRNN) query problem. Traditionally, a reverse nearest neighbor (RNN) query finds all the objects that have the query point as their nearest neighbor. In contrast, an MTRNN query finds all the objects that have the query point in their multi-type nearest neighbors. Existing RNN queries find an influence set by considering only one feature type. However, the influence from multiple feature types is often critical for strategic decision making in many business scenarios, such as site selection for a new shopping center. To that end, we first formalize the notion of the MTRNN query by considering the influence of multiple feature types. We also propose R-tree based algorithms to find the influence set for a given query point and multiple feature types. Finally, experimental results are provided to show the strength of the proposed algorithms as well as design decisions related to performance tuning. 相似文献
8.
Debajyoti Bera 《Information Processing Letters》2011,111(15):723-726
Quantum circuits, which are shallow, limited in the number of gates and additional workspace qubits, are popular for quantum computation because they form the simplest possible model similar to the classical model of a network of Boolean gates and capable of performing non-trivial computation. We give a new lower bound technique for such circuits and use it to give another proof that deterministic computation of the parity function cannot be performed by such circuits. 相似文献
9.
10.
11.
With the recent surge in the use of the location-based service (LBS), the importance of spatial database queries has increased. The reverse nearest neighbor (RNN) search is one of the most popular spatial database queries. In most previous studies, the spatial distance is used for measuring the distance between objects. However, as the demands of users of the LBSs are becoming more complex, considering only the spatial factor as a distance measure is not sufficient. For example, through a hotel finding service, users want to choose a hotel considering not only the spatial distance, but also the non-spatial aspect of the hotel such as the quality which can be represented by the number of stars. Therefore, services that consider both spatial and non-spatial factors in measuring the distance are more useful for users. In such a case, techniques proposed in the previous studies cannot be used since the distance measure is different. In this paper, we propose an efficient method for the RNN search in which a distance measure involves both the spatial distance and the non-spatial aspect of an object. We conduct extensive experiments on a large dataset to evaluate the efficiency of the proposed method. The experimental results show that the proposed method is significantly efficient and scalable. 相似文献
12.
13.
14.
Let S={s1,…,sn} be a set of points in the plane. The Oja depth of a query point θ with respect to S is the sum of the areas of all triangles (θ,si,sj). This depth may be computed in O(nlogn) time in the RAM model of computation. We show that a matching lower bound holds in the algebraic decision tree model. This bound also applies to the computation of the Oja gradient, the Oja sign test, and to the problem of computing the sum of pairwise distances among points on a line. 相似文献
15.
In the graph exploration problem, a searcher explores the whole set of nodes of an unknown graph. We assume that all the unknown graphs are undirected and connected. The searcher is not aware of the existence of an edge until he/she visits one of its endpoints. The searcher's task is to visit all the nodes and go back to the starting node by traveling a tour as short as possible. One of the simplest strategies is the nearest neighbor algorithm (NN), which always chooses the unvisited node nearest to the searcher's current position. The weighted NN (WNN) is an extension of NN, which chooses the next node to visit by using the weighted distance. It is known that WNN with weight 3 is 16-competitive for planar graphs. In this paper we prove that NN achieves the competitive ratio of 1.5 for cycles. In addition, we show that the analysis for the competitive ratio of NN is tight by providing an instance for which the bound of 1.5 is attained, and NN is the best for cycles among WNN with all possible weights. Furthermore, we prove that no online algorithm to explore cycles is better than 1.25-competitive. 相似文献
16.
17.
Palaiahnakote Shivakumara Author Vitae Anjan Dutta Author Vitae 《Pattern recognition》2011,44(8):1671-1683
In the field of multimedia retrieval in video, text frame classification is essential for text detection, event detection, event boundary detection, etc. We propose a new text frame classification method that introduces a combination of wavelet and median moment with k-means clustering to select probable text blocks among 16 equally sized blocks of a video frame. The same feature combination is used with a new Max-Min clustering at the pixel level to choose probable dominant text pixels in the selected probable text blocks. For the probable text pixels, a so-called mutual nearest neighbor based symmetry is explored with a four-quadrant formation centered at the centroid of the probable dominant text pixels to know whether a block is a true text block or not. If a frame produces at least one true text block then it is considered as a text frame otherwise it is a non-text frame. Experimental results on different text and non-text datasets including two public datasets and our own created data show that the proposed method gives promising results in terms of recall and precision at the block and frame levels. Further, we also show how existing text detection methods tend to misclassify non-text frames as text frames in term of recall and precision at both the block and frame levels. 相似文献
18.
Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data 总被引:2,自引:0,他引:2
Xiang Lian Lei Chen 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(3):787-808
Reverse nearest neighbor (RNN) search is very crucial in many real applications. In particular, given a database and a query
object, an RNN query retrieves all the data objects in the database that have the query object as their nearest neighbors.
Often, due to limitation of measurement devices, environmental disturbance, or characteristics of applications (for example,
monitoring moving objects), data obtained from the real world are uncertain (imprecise). Therefore, previous approaches proposed
for answering an RNN query over exact (precise) database cannot be directly applied to the uncertain scenario. In this paper,
we re-define the RNN query in the context of uncertain databases, namely probabilistic reverse nearest neighbor (PRNN) query,
which obtains data objects with probabilities of being RNNs greater than or equal to a user-specified threshold. Since the
retrieval of a PRNN query requires accessing all the objects in the database, which is quite costly, we also propose an effective
pruning method, called geometric pruning (GP), that significantly reduces the PRNN search space yet without introducing any
false dismissals. Furthermore, we present an efficient PRNN query procedure that seamlessly integrates our pruning method.
Extensive experiments have demonstrated the efficiency and effectiveness of our proposed GP-based PRNN query processing approach,
under various experimental settings. 相似文献
19.
20.
This paper presents quasi-optimal upper bounds for simplex range searching. The problem is to preprocess a setP ofn points in d so that, given any query simplexq, the points inP q can be counted or reported efficiently. Ifm units of storage are available (n <m <n
d
), then we show that it is possible to answer any query inO(n
1+/m
1/d
) query time afterO(m
1+) preprocessing. This bound, which holds on a RAM or a pointer machine, is almost tight. We also show how to achieveO(logn) query time at the expense ofO(n
d+) storage for any fixed > 0. To fine-tune our results in the reporting case we also establish new zone theorems for arrangements and merged arrangements of planes in 3-space, which are of independent interest.A preliminary version of this paper has appeared in theProceedings of the Sixth Annual ACM Symposium on Computational Geometry, June 1990, pp. 23–33. Work on this paper by Bernard Chazelle has been supported by NSF Grant CCR-87-00917 and NSF Grant CCR-90-02352. Work on this paper by Micha Sharir has been supported by Office of Naval Research Grant N00014-87-K-0129, by National Science Foundation Grants DCR-83-20085 and CCR-8901484, and by grants from the U.S.-Israeli Binational Science Foundation, the NCRD—the Israeli National Council for Research and Development, and the Fund for Basic Research administered by the Israeli Academy of Sciences. Work by Emo Welzl has been supported by Deutsche Forschungsgemeinschaft Grant We 1265/1–2. Micha Sharir and Emo Welzl have also been supported by a grant from the German-Israeli Binational Science Foundation. Last but not least, all authors thank DIMACS, an NSF Science and Technology Center, for additional support under Grant STC-88-09648. 相似文献