首页 | 本学科首页   官方微博 | 高级检索  
     


Cache-oblivious index for approximate string matching
Authors:Wing-Kai Hon  Tak-Wah Lam
Affiliation:
  • a Department of Computer Science, National Tsing Hua University, Taiwan
  • b Department of Computer Science, The University of Hong Kong, Hong Kong
  • c Department of Computer Science, Louisiana State University, LA, USA
  • d Department of Electrical Engineering and Computer Science, Kansas University, KS, USA
  • Abstract:This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-memory solutions, which are also cache-oblivious. Our first index occupies O((nlogkn)/B) disk pages and finds all k-error matches with O((|P|+occ)/B+logknloglogBn) I/Os, where B denotes the number of words in a disk page. To the best of our knowledge, this index is the first external-memory data structure that does not require Ω(|P|+occ+poly(logn)) I/Os. The second index reduces the space to O((nlogn)/B) disk pages, and the I/O complexity is O((|P|+occ)/B+logk(k+1)nloglogn).
    Keywords:String matching  Indexing  Approximate queries  Cache-oblivious  I/O model
    本文献已被 ScienceDirect 等数据库收录!
    设为首页 | 免责声明 | 关于勤云 | 加入收藏

    Copyright©北京勤云科技发展有限公司  京ICP备09084417号