共查询到20条相似文献,搜索用时 15 毫秒
1.
Y. Nekrich 《Algorithmica》2007,49(2):94-108
In this paper we present new space efficient dynamic data structures for orthogonal range reporting. The described data structures
support planar range reporting queries in time O(log n+klog log (4n/(k+1))) and space O(nlog log n), or in time O(log n+k) and space O(nlog
ε
n) for any ε>0. Both data structures can be constructed in O(nlog n) time and support insert and delete operations in amortized time O(log 2
n) and O(log nlog log n) respectively. These results match the corresponding upper space bounds of Chazelle (SIAM J. Comput. 17, 427–462, 1988) for the static case.
We also present a dynamic data structure for d-dimensional range reporting with search time O(log
d−1
n+k), update time O(log
d
n), and space O(nlog
d−2+ε
n) for any ε>0.
The model of computation used in our paper is a unit cost RAM with word size log n.
A preliminary version of this paper appeared in the Proceedings of the 21st Annual ACM Symposium on Computational Geometry
2005.
Work partially supported by IST grant 14036 (RAND-APX). 相似文献
2.
3.
The “Common Substring Alignment” problem is defined as follows. The input consists of a set of strings S1,S2…,Sc, with a common substring appearing at least once in each of them, and a target string T. The goal is to compute similarity of all strings Si with T, without computing the part of the common substring over and over again.In this paper we consider the Common Substring Alignment problem for the LCS (Longest Common Subsequence) similarity metric. Our algorithm gains its efficiency by exploiting the sparsity inherent to the LCS problem. Let Y be the common substring, n be the size of the compared sequences, Ly be the length of the LCS of T and Y, denoted |LCS[T,Y]|, and L be max{|LCS[T,Si]|}. Our algorithm consists of an O(nLy) time encoding stage that is executed once per common substring, and an O(L) time alignment stage that is executed once for each appearance of the common substring in each source string. The additional running time depends only on the length of the parts of the strings that are not in any common substring. 相似文献
4.
A central question in computational biology is the design of genetic
markers to distinguish between two given sets of (DNA) sequences. This question is formalized as
the NP-complete Distinguishing Substring Selection problem (DSSS for short) which asks, given a set of "good" strings and
a set of "bad" strings, for a solution string which is, with respect to the Hamming metric, "away" from the good strings and
"close" to the bad strings. More precisely, given integers dg, db, and L, we ask for a length-L string s such that s has Hamming distance at least dg to every length-L substring of the good strings and such that every bad string has a length-L substring with Hamming distance
at most db to s. Studying the parameterized complexity of DSSS, we show that, already for binary alphabet, DSSS is W[1]-hard with respect
to its natural parameters. This, in particular, implies that a recently given polynomial-time approximation scheme (PTAS)
by Deng et al. cannot be replaced by a so-called efficient polynomial-time approximation scheme (EPTAS) unless an unlikely
collapse in parameterized complexity theory occurs. This is seemingly the first computational biology problem for which such
a border between PTAS (which exists) and EPTAS (which is unlikely to exist) could be established. By way of contrast, for
a special case of DSSS, we present an exact fixed-parameter algorithm solving the problem efficiently. In this way we also
exhibit a sharp border between fixed-parameter tractability and intractability results. 相似文献
5.
《IEEE transactions on pattern analysis and machine intelligence》1983,(3):365-370
Let T(U) be the set of words in the dictionary H which contains U as a substring. The problem considered here is the estimation of the set T(U) when U is not known, but Y, a noisy version of U is available. The suggested set estimate S*(Y) of T(U) is a proper subset of H such that its every element contains at least one substring which resembles Y most according to the Levenshtein metric. The proposed algorithm for-the computation of S*(Y) requires cubic time. The algorithm uses the recursively computable dissimilarity measure Dk(X, Y), termed as the kth distance between two strings X and Y which is a dissimilarity measure between Y and a certain subset of the set of contiguous substrings of X. Another estimate of T(U), namely SM(Y) is also suggested. The accuracy of SM(Y) is only slightly less than that of S*(Y), but the computation time of SM(Y) is substantially less than that of S*(Y). Experimental results involving 1900 noisy substrings and dictionaries which are subsets of 1023 most common English words [11] indicate that the accuracy of the estimate S*(Y) is around 99 percent and that of SM(Y) is about 98 percent. 相似文献
6.
7.
Publish-Subscribe Information Delivery with Substring Predicates 总被引:1,自引:0,他引:1
8.
Ryoichi Kato 《Information Processing Letters》2005,93(6):269-274
We present some simple but useful properties of factor oracles, and propose fast algorithms for indexed full-text search and finding repeated substrings. Some experiments are given to demonstrate the performance of our algorithms. 相似文献
9.
EH*S是可扩展分布式数据结构EH*的一个改进,增加了子串检索功能。通过对子串和关键字计算描述符向量,为EH*文件中的每个桶添加一个桶描述符向量,然后把子串描述符向量分别与关键字和桶的描述符向量进行比较,得到包含子串的关键字集。 相似文献
10.
在字符串的运算中,求两个字符串的最长公共子串是一个重要的算法,有着广泛的应用价值。一般认为一共有两大类解法,之所以叫两大类,是因为每一类都可以再细致划分。前一类易理解,占用内存单元大,时间复杂度低,后一类复杂,最好和KMP算法结合。 相似文献
11.
赵福生 《电脑与微电子技术》2011,(20):30-31,36
在字符串的运算中,求两个字符串的最长公共子串是一个重要的算法,有着广泛的应用价值。一般认为一共有两大类解法,之所以叫两大类,是因为每一类都可以再细致划分。前一类易理解,占用内存单元大,时间复杂度低,后一类复杂,最好和KMP算法结合。 相似文献
12.
The Closest Substring problem (the CSP problem) is a basic NP-hard problem in the study of computational biology. It is known that the problem has polynomial time approximation schemes. In this paper, we prove that unless the Exponential Time Hypothesis fails, the CSP problem has no polynomial time approximation schemes of running time f(1/ε)no(1/ε) for any function f. This essentially excludes the possibility that the CSP problem has a practical polynomial time approximation scheme even for moderate values of the error bound ε. As a consequence, it is unlikely that the study of approximation schemes for the CSP problem in the literature would lead to practical approximation algorithms for the problem for small error bound ε. 相似文献
13.
14.
15.
Benjamin A. Burton 《Algorithmica》2011,61(3):555-579
Given an arbitrary bitstream, we consider the problem of finding the longest substring whose ratio of ones to zeroes equals a given value. The central result of this paper is an algorithm that solves this problem in linear time. The method involves (i)?reformulating the problem as a constrained walk through a sparse matrix, and then (ii)?developing a data structure for this sparse matrix that allows us to perform each step of the walk in amortised constant time. We also give a linear time algorithm to find the longest substring whose ratio of ones to zeroes is bounded below by a given value. Both problems have practical relevance to cryptography and bioinformatics. 相似文献
16.
17.
18.
19.
针对民用GPS应用的局限性,设计了一种能自动地将实时位置状态信息发布出去,并在接收端的计算机上图形化的表示这些数据的位置自动报告系统.系统将GPS接收机输出的数据调制成音频信号,经无线电台发射,接收端利用解调软件将信号还原为位置状态数据来确定发射部分的位置及状态. 相似文献
20.
Range nearest-neighbor query 总被引:6,自引:0,他引:6
A range nearest-neighbor (RNN) query retrieves the nearest neighbor (NN) for every point in a range. It is a natural generalization of point and continuous nearest-neighbor queries and has many applications. In this paper, we consider the ranges as (hyper)rectangles and propose efficient in-memory processing and secondary memory pruning techniques for RNN queries in both 2D and high-dimensional spaces. These techniques are generalized for kRNN queries, which return the k nearest neighbors for every point in the range. In addition, we devise an auxiliary solution-based index EXO-tree to speed up any type of NN query. EXO-tree is orthogonal to any existing NN processing algorithm and, thus, can be transparently integrated. An extensive empirical study was conducted to evaluate the CPU and I/O performance of these techniques, and the study showed that they are efficient and robust under various data sets, query ranges, numbers of nearest neighbors, dimensions, and cache sizes. 相似文献