DDR: an index method for large time-series datasets
Affiliation: 1. School of Information Technology, Faculty of Science and Technology, Deakin University, Melbourne Campus, Burwood, Victoria 3125, Australia; 2. Australian Research Council Centre in Bioinformatics, Melbourne, Australia; 3. Institute of Information Sciences and Electronics, University of Tsukuba, 1-1-1 Tennodai, Tsukuba-shi, Ibaraki-ken, Japan
Abstract: Tree index structures are a traditional method for similarity search in large datasets. They rest on the presupposition that most sub-trees are pruned during the search, so that the number of page accesses is reduced. However, time-series datasets generally have very high dimensionality. Because of the so-called curse of dimensionality, pruning becomes less effective as dimensionality grows. Consequently, a tree index structure is not a suitable method for time-series datasets. In this paper, we propose a two-phase (filtering and refinement) method for searching time-series datasets. In the filtering step, quantized time-series are used to construct a compact file, which is scanned to filter out irrelevant time-series. The small set of remaining candidates is passed to the second step for refinement. In this step, we introduce an effective index compression method named grid-based datawise dimensionality reduction (DDR), which attempts to preserve the characteristics of the time-series. An experimental comparison with existing techniques demonstrates the utility of our approach.
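The two-phase idea in the abstract can be illustrated with a minimal sketch. It is not the DDR method itself, whose details the abstract does not give. It assumes a uniform scalar quantizer over the global value range, Euclidean distance, and a range query with radius `eps`; all function names and parameters are hypothetical. The filtering pass scans only the compact quantized codes and keeps every series whose lower-bound distance to the query is within `eps`; the refinement pass computes exact distances for those candidates only.

```python
import math

def build_approximation(dataset, cells=16):
    """Quantize every value into one of `cells` equal-width cells (assumed scheme)."""
    lo = min(min(s) for s in dataset)
    hi = max(max(s) for s in dataset)
    width = (hi - lo) / cells or 1.0  # avoid zero width for constant data
    codes = [[min(int((v - lo) / width), cells - 1) for v in s] for s in dataset]
    return codes, lo, width

def lower_bound(query, code, lo, width):
    """Distance from the query to the grid cell holding a series.

    The true series lies inside the cell, so this never exceeds the
    exact Euclidean distance (up to floating-point rounding).
    """
    total = 0.0
    for q, c in zip(query, code):
        left = lo + c * width
        right = left + width
        gap = max(left - q, q - right, 0.0)
        total += gap * gap
    return math.sqrt(total)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search(query, dataset, codes, lo, width, eps):
    # Phase 1 (filtering): scan the compact codes only.
    candidates = [i for i, code in enumerate(codes)
                  if lower_bound(query, code, lo, width) <= eps]
    # Phase 2 (refinement): exact distances on the surviving candidates.
    hits = [i for i in candidates if euclidean(query, dataset[i]) <= eps]
    return candidates, hits
```

Because the filter uses a lower bound, it can admit false positives but does not discard true answers; refinement then removes the false positives at the cost of a few exact distance computations.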
Keywords:
This article is indexed in ScienceDirect and other databases.