首页 | 本学科首页   官方微博 | 高级检索  
     

Composite Distance Transformation for Indexing and k-Nearest-Neighbor Searching in High-Dimensional Spaces
作者姓名:Yi Zhuang  Yue-Ting Zhuang  and Fei Wu
作者单位:College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
基金项目:Partially supported by the National Natural Science Foundation of China (Grant No. 60533090), National Science Fund for Distinguished Young Scholars (Grant No. 60525108), the National Grand Fundamental Research 973 Program of China (Grant No. 2002CB312101), Science and Technology Project of Zhejiang Province (Grant Nos. 2005C13032, 2005C11001-05) and China-America Academic Digital Library Project (see www.cadal.zju.edu.cn).
摘    要:Due to the famous dimensionality curse problem, search in a high-dimensional space is considered as a "hard" problem. In this paper, a novel composite distance transformation method, which is called CDT, is proposed to support a fast κ-nearest-neighbor (κ-NN) search in high-dimensional spaces. In CDT, all (n) data points are first grouped into some clusters by a κ-Means clustering algorithm. Then a composite distance key of each data point is computed. Finally, these index keys of such n data points are inserted by a partition-based B^+-tree. Thus, given a query point, its κ-NN search in high-dimensional spaces is transformed into the search in the single dimensional space with the aid of CDT index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of the proposed scheme. Our results show that this method outperforms the state-of-the-art high-dimensional search techniques, such as the X-Tree, VA-file, iDistance and NB-Tree.

关 键 词:计算机网络  高维空间  数据搜索  最近邻域搜索方法
收稿时间:1 May 2006
修稿时间:2006-05-012007-01-04

Composite Distance Transformation for Indexing and <Emphasis Type="Italic">k</Emphasis>-Nearest-Neighbor Searching in High-Dimensional Spaces
Yi Zhuang,Yue-Ting Zhuang,and Fei Wu.Composite Distance Transformation for Indexing and k-Nearest-Neighbor Searching in High-Dimensional Spaces[J].Journal of Computer Science and Technology,2007,22(2):208-217.
Authors:Yi Zhuang  Yue-Ting Zhuang  Fei Wu
Affiliation:College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
Abstract:Due to the famous dimensionality curse problem, search in a high-dimensional space is considered as a “hard” problem. In this paper, a novel composite distance transformation method, which is called CDT, is proposed to support a fast k-nearest-neighbor (k-NN) search in high-dimensional spaces. In CDT, all (n) data points are first grouped into some clusters by a k-Means clustering algorithm. Then a composite distance key of each data point is computed. Finally, these index keys of such n data points are inserted by a partition-based B+-tree. Thus, given a query point, its k-NN search in high-dimensional spaces is transformed into the search in the single dimensional space with the aid of CDT index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of the proposed scheme. Our results show that this method outperforms the state-of-the-art high-dimensional search techniques, such as the X-Tree, VA-file, iDistance and NB-Tree. Electronic supplementary material Electronic supplementary material is available for this article at and accessible for authorised users. Partially supported by the National Natural Science Foundation of China (Grant No. 60533090), National Science Fund for Distinguished Young Scholars (Grant No. 60525108), the National Grand Fundamental Research 973 Program of China (Grant No. 2002CB312101), Science and Technology Project of Zhejiang Province (Grant Nos. 2005C13032, 2005C11001-05) and China-America Academic Digital Library Project (see ).
Keywords:centroid distance            k-nearest-neighbor search  start distance
本文献已被 维普 SpringerLink 等数据库收录!
点击此处可从《计算机科学技术学报》浏览原始摘要信息
点击此处可从《计算机科学技术学报》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号