共查询到20条相似文献,搜索用时 0 毫秒
1.
Dong-Ho Lee Hyoung-Joo Kim 《Knowledge and Data Engineering, IEEE Transactions on》2003,15(6):1472-1486
The SPY-TEC (spherical pyramid-technique) was proposed as a new indexing method for high-dimensional data spaces using a special partitioning strategy that divides a d-dimensional data space into 2d spherical pyramids. In the SPY-TEC, an efficient algorithm for processing hyperspherical range queries was introduced with a special partitioning strategy. However, the technique for processing k-nearest-neighbor queries, which are frequently used in similarity search, was not proposed. In this paper, we propose an efficient algorithm for processing nearest-neighbor queries on the SPY-TEC by extending the incremental nearest-neighbor algorithm. We also introduce a metric that can be used to guide an ordered best-first traversal when finding nearest neighbors on the SPY-TEC. Finally, we show that our technique significantly outperforms the related techniques in processing k-nearest-neighbor queries by comparing it to the R*-tree, the X-tree, and the sequential scan through extensive experiments. 相似文献
2.
Moutafis Panagiotis García-García Francisco Mavrommatis George Vassilakopoulos Michael Corral Antonio Iribarne Luis 《Distributed and Parallel Databases》2021,39(3):733-784
Distributed and Parallel Databases - Given two datasets of points (called Query and Training), the Group (K) Nearest-Neighbor (GKNN) query retrieves (K) points of the Training with the smallest sum... 相似文献
3.
Ricardo J. Barrientos José I. Gómez Christian Tenllado Manuel Prieto Matias Mauricio Marin 《Computers & Electrical Engineering》2013
Metric-space similarity search has proven suitable in a number of application domains such as multimedia retrieval and computational biology to name a few. These applications usually work on very large databases that are often indexed to speed-up on-line searches. To achieve efficient throughput, it is essential to exploit the intrinsic parallelism in the respective search query processing algorithms. Many strategies have been proposed in the literature to parallelize these algorithms either on shared or distributed memory multiprocessor systems. Lately, GPUs have been used to implement brute-force parallel search strategies instead of using index data structures. Indexing poses difficulties when it comes to achieve efficient exploitation of GPU resources. In this paper we propose single and multi GPU metric space techniques that efficiently exploit GPU tailored index data structures for parallel similarity search in large databases. The experimental results show that our proposal outperforms previous index-based sequential and OpenMP search strategies. 相似文献
4.
Salzberg S. Delcher A.L. Heath D. Kasif S. 《IEEE transactions on pattern analysis and machine intelligence》1995,17(6):599-608
Proposes a theoretical model for analysis of classification methods, in which the teacher knows the classification algorithm and chooses examples in the best way possible. The authors apply this model using the nearest-neighbor learning algorithm, and develop upper and lower bounds on sample complexity for several different concept classes. For some concept classes, the sample complexity turns out to be exponential even using this best-case model, which implies that the concept class is inherently difficult for the NN algorithm. The authors identify several geometric properties that make learning certain concepts relatively easy. Finally the authors discuss the relation of their work to helpful teacher models, its application to decision tree learning algorithms, and some of its implications for experimental work 相似文献
5.
Locally adaptive metric nearest-neighbor classification 总被引:4,自引:0,他引:4
Domeniconi C. Jing Peng Gunopulos D. 《IEEE transactions on pattern analysis and machine intelligence》2002,24(9):1281-1285
Nearest-neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest-neighbor rule. We propose a locally adaptive nearest-neighbor classification method to try to minimize bias. We use a chi-squared distance analysis to compute a flexible metric for producing neighborhoods that are highly adaptive to query locations. Neighborhoods are elongated along less relevant feature dimensions and constricted along most influential ones. As a result, the class conditional probabilities are smoother in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other techniques using both simulated and real-world data 相似文献
6.
The nearest-neighbor multilayer perceptron (NN-MLP) is a single-hidden-layer network suitable for pattern recognition. To design an NN-MLP efficiently, this paper proposes a new evolutionary algorithm consisting of four basic operations: recognition, remembrance, reduction, and review. Experimental results show that this algorithm can produce the smallest or nearly smallest networks from random initial ones. 相似文献
7.
A novel neuralnet-based method of constructing optimized prototypes for nearest-neighbor classifiers is proposed. Based on an effective classification oriented error function containing class classification and class separation components, the corresponding prototype and feature weight update rules are derived. The proposed method consists of several distinguished properties. First, not only prototypes but also feature weights are constructed during the optimization process. Second, several instead of one prototype not belonging to the genuine class of input sample x are updated when x is classified incorrectly. Third, it intrinsically distinguishes different learning contribution from training samples, which enables a large amount of learning from constructive samples, and limited learning from outliers. Experiments have shown the superiority of this method compared with LVQ2 and other previous works. 相似文献
8.
基于最近邻原则的半监督聚类算法 总被引:1,自引:0,他引:1
基于最近邻原则的半监督聚类算法是以基于最近邻的聚类中心求解算法为基础的。在基于最近邻的聚类中心求解算法中,用相似度矩阵记录数据点间的相似程度,由目标函数最小值求得聚类的类中心点。在基于最近邻原则的半监督聚类算法中,根据约束信息来调整相似度矩阵G,数据点间相似度的变化引起了数据点间加权欧式距离的变化,由此更新加权欧式距离矩阵M,最后执行聚类中心求解算法完成聚类。大量实验结果表明,该算法能获得较好的聚类结果。 相似文献
9.
Robert F. Sproull 《Algorithmica》1991,6(1-6):579-589
This note presents a simplification and generalization of an algorithm for searchingk-dimensional trees for nearest neighbors reported by Friedmanet al [3]. If the distance between records is measured usingL 2 , the Euclidean norm, the data structure used by the algorithm to determine the bounds of the search space can be simplified to a single number. Moreover, because distance measurements inL 2 are rotationally invariant, the algorithm can be generalized to allow a partition plane to have an arbitrary orientation, rather than insisting that it be perpendicular to a coordinate axis, as in the original algorithm. When ak-dimensional tree is built, this plane can be found from the principal eigenvector of the covariance matrix of the records to be partitioned. These techniques and others yield variants ofk-dimensional trees customized for specific applications. It is wrong to assume thatk-dimensional trees guarantee that a nearest-neighbor query completes in logarithmic expected time. For smallk, logarithmic behavior is observed on all but tiny trees. However, for largerk, logarithmic behavior is achievable only with extremely large numbers of records. Fork = 16, a search of ak-dimensional tree of 76,000 records examines almost every record. 相似文献
10.
Although multilayer perceptrons have been shown to be adept at providing good solutions to many problems, they have a major drawback in the very large amount of time needed for training (for example, on the order of CPU days for some of the author's experiments). The paper describes a method of producing a reasonable starting point by using a nearest-neighbor classifier. The method is further expanded to provide a method of ;programming' the upper layer of any network assuming the lower layers already exist. 相似文献
11.
Fast nearest-neighbor search in dissimilarity spaces 总被引:3,自引:0,他引:3
Farago A. Linder T. Lugosi G. 《IEEE transactions on pattern analysis and machine intelligence》1993,15(9):957-962
A fast nearest-neighbor algorithm is presented. It works in general spaces in which the known cell techniques cannot be implemented for various reasons, such as the absence of coordinate structure or high dimensionality. The central idea has already appeared several times in the literature with extensive computer simulation results. An exact probabilistic analysis of this family of algorithms that proves its O (1) asymptotic average complexity measured in the number of dissimilarity calculations is presented 相似文献
12.
Stereo matching as a nearest-neighbor problem 总被引:5,自引:0,他引:5
Tomasi C. Manduchi R. 《IEEE transactions on pattern analysis and machine intelligence》1998,20(3):333-340
13.
14.
Adaptive query processing generally involves a feedback loop comprising monitoring, assessment and response. So far, individual proposals have tended to group together an approach to monitoring, a means of assessment, and a form of response. However, there are many benefits in decoupling these three phases, and in constructing generic frameworks for each of them. To this end, this paper discusses monitoring of query plan execution as a topic in its own right, and advocates an approach based on self-monitoring algebraic operators. This approach is shown to be generic and independent of any specific adaptation mechanism, easily implementable and portable, sufficiently comprehensive, appropriate for heterogeneous distributed environments, and more importantly, capable of driving on-the-fly adaptations of query plan execution. An experimental evaluation of the overheads and of the quality of the results obtained by monitoring is also presented. 相似文献
15.
16.
Vivek S. Borkar Mrinal K. Ghosh 《Mathematics of Control, Signals, and Systems (MCSS)》1991,4(1):81-98
The self-tuning approach to adaptive control is applied to a class of Markov chains called nearest-neighbor motions. These
have a countable state space and move from any state to at most finitely many neighboring states. For compact parameter and
control spaces, the almost-sure optimality of the self-tuner for an ergodic cost criterion is established under two sets of
assumptions. 相似文献
17.
18.
The nearest-neighbor rule and the potential-function classifier are nonparametric discrimination methods that require the storage of a set of sample patterns. Here, a relationship between the two methods in terms of subclasses and superclasses is developed. Considering an exponential potential function, necessary and sufficient conditions for identity of their decision surfaces are obtained. Based on these conditions, an algorithm for establishing identity is introduced. 相似文献
19.
A fast algorithm for the nearest-neighbor classifier 总被引:3,自引:0,他引:3
Djouadi A. Bouktache E. 《IEEE transactions on pattern analysis and machine intelligence》1997,19(3):277-282
A fast algorithm that finds the nearest neighbor (NN) of an unknown sample from a design set of labeled samples is proposed. This algorithm requires a quite moderate preprocessing effort and a rather excessive storage, but it accomplishes substantial computational savings during classification. The performance of the algorithm is described and compared to the performance of the conventional one. Results on simulated data are provided to illustrate the computational savings that may be achieved using this fast algorithm 相似文献
20.
There is a great interest in dimensionality reduction techniques for tackling the problem of high-dimensional pattern classification. This paper addresses the topic of supervised learning of a linear dimension reduction mapping suitable for classification problems. The proposed optimization procedure is based on minimizing an estimation of the nearest neighbor classifier error probability, and it learns a linear projection and a small set of prototypes that support the class boundaries. The learned classifier has the property of being very computationally efficient, making the classification much faster than state-of-the-art classifiers, such as SVMs, while having competitive recognition accuracy. The approach has been assessed through a series of experiments, showing a uniformly good behavior, and competitive compared with some recently proposed supervised dimensionality reduction techniques. 相似文献