A Bloom filter based semi‐index on q‐grams |
| |
Authors: | Szymon Grabowski Robert Susik Marcin Raniszewski |
| |
Affiliation: | Lodz University of Technology, Institute of Applied Computer Science, Poland |
| |
Abstract: | We present a simple q‐gram based semi‐index, which allows to look for a pattern typically only in a small fraction of text blocks. Several space‐time tradeoffs are presented. Experiments on Pizza & Chili datasets show that our solution is up to three orders of magnitude faster than the Claude et al. (Journal of Discrete Algorithms 2012; 11 :37) semi‐index at a comparable space usage. Moreover, the construction of our data structure is fast and easily parallelizable. Copyright © 2016 John Wiley & Sons, Ltd. |
| |
Keywords: | text indexing q‐grams Bloom filter minimizers |
|