期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A fast algorithm for computing a longest common increasing subsequence

I-Hsuan Yang Kun-Mao Chao 《Information Processing Letters》2005,93(5):249-253

Let A=〈a₁,a₂,…,am〉 and B=〈b₁,b₂,…,bn〉 be two sequences, where each pair of elements in the sequences is comparable. A common increasing subsequence of A and B is a subsequence 〈ai₁=bj₁,ai₂=bj₂,…,ail=bjl〉, where i₁<i₂<?<il and j₁<j₂<?<jl, such that for all 1?k<l, we have aik<aik+1. A longest common increasing subsequence of A and B is a common increasing subsequence of the maximum length. This paper presents an algorithm for delivering a longest common increasing subsequence in O(mn) time and O(mn) space. 相似文献

2.

A New Efficient Algorithm for Computing the Longest Common Subsequence 总被引：1，自引：0，他引：1

Costas S. Iliopoulos M. Sohel Rahman 《Theory of Computing Systems》2009,45(2):355-371

The Longest Common Subsequence (LCS) problem is a classic and well-studied problem in computer science. The LCS problem is a common task in DNA sequence analysis with many applications to genetics and molecular biology. In this paper, we present a new and efficient algorithm for solving the LCS problem for two strings. Our algorithm runs in O(ℛlog log n+n) time, where ℛ is the total number of ordered pairs of positions at which the two strings match. Preliminary version appeared in [24]. C.S. Iliopoulos is supported by EPSRC and Royal Society grants. M.S. Rahman is supported by the Commonwealth Scholarship Commission in the UK under the Commonwealth Scholarship and Fellowship Plan (CSFP) and is on Leave from Department of CSE, BUET, Dhaka-1000, Bangladesh. 相似文献

3.

New efficient algorithms for the LCS and constrained LCS problems 总被引：1，自引：0，他引：1

Costas S. Iliopoulos 《Information Processing Letters》2008,106(1):13-18

In this paper, we study the classic and well-studied longest common subsequence (LCS) problem and a recent variant of it, namely the constrained LCS (CLCS) problem. In the CLCS problem, the computed LCS must also be a supersequence of a third given string. In this paper, we first present an efficient algorithm for the traditional LCS problem that runs in O(Rloglogn+n) time, where R is the total number of ordered pairs of positions at which the two strings match and n is the length of the two given strings. Then, using this algorithm, we devise an algorithm for the CLCS problem having time complexity O(pRloglogn+n) in the worst case, where p is the length of the third string. 相似文献

4.

An almost-linear time and linear space algorithm for the longest common subsequence problem

J.Y. Guo F.K. Hwang 《Information Processing Letters》2005,94(3):131-135

There are two general approaches to the longest common subsequence problem. The dynamic programming approach takes quadratic time but linear space, while the nondynamic-programming approach takes less time but more space. We propose a new implementation of the latter approach which seems to get the best for both time and space for the DNA application. 相似文献

5.

Variants of constrained longest common subsequence

Paola Bonizzoni Yuri Pirola 《Information Processing Letters》2010,110(20):877-881

We consider a variant of the classical Longest Common Subsequence problem called Doubly-Constrained Longest Common Subsequence (DC-LCS). Given two strings s₁ and s₂ over an alphabet Σ, a set Cs of strings, and a function Co:Σ→N, the DC-LCS problem consists of finding the longest subsequence s of s₁ and s₂ such that s is a supersequence of all the strings in Cs and such that the number of occurrences in s of each symbol σ∈Σ is upper bounded by Co(σ). The DC-LCS problem provides a clear mathematical formulation of a sequence comparison problem in Computational Biology and generalizes two other constrained variants of the LCS problem that have been introduced previously in the literature: the Constrained LCS and the Repetition-Free LCS. We present two results for the DC-LCS problem. First, we illustrate a fixed-parameter algorithm where the parameter is the length of the solution which is also applicable to the more specialized problems. Second, we prove a parameterized hardness result for the Constrained LCS problem when the parameter is the number of the constraint strings (|Cs|) and the size of the alphabet Σ. This hardness result also implies the parameterized hardness of the DC-LCS problem (with the same parameters) and its NP-hardness when the size of the alphabet is constant. 相似文献

6.

Doubly-Constrained LCS and Hybrid-Constrained LCS problems revisited

Effat Farhana M. Sohel Rahman 《Information Processing Letters》2012,112(13):562-565

We revisit two recently studied variants of the classic Longest Common Subsequence (LCS) problem, namely, the Doubly-Constrained LCS (DC-LCS) and Hybrid-Constrained LCS (HC-LCS) problems. We present finite automata based algorithms for both problems. 相似文献

7.

An algorithm for solving the longest increasing circular subsequence problem

Sebastian Deorowicz 《Information Processing Letters》2009,109(12):630-634

相似文献

8.

Classes of cost functions for string edit distance

S. V. Rice H. Bunke T. A. Nartker 《Algorithmica》1997,18(2):271-280

Finding a sequence of edit operations that transforms one string of symbols into another with the minimum cost is a well-known problem. The minimum cost, or edit distance, is a widely used measure of the similarity of two strings. An important parameter of this problem is the cost function, which specifies the cost of each insertion, deletion, and substitution. We show that cost functions having the same ratio of the sum of the insertion and deletion costs divided by the substitution cost yield the same minimum cost sequences of edit operations. This leads to a partitioning of the universe of cost functions into equivalence classes. Also, we show the relationship between a particular set of cost functions and the longest common subsequence of the input strings. This work was supported in part by the U.S. Department of Defense and the U.S. Department of Energy. 相似文献

9.

Parallel comparison of run-length-encoded strings on a linear systolic array

Alessandro Bogliolo Valerio Freschi 《Information Sciences》2007,177(1):231-238

The length of the longest common subsequence (LCS) between two strings of M and N characters can be computed by an O(M × N) dynamic programming algorithm, that can be executed in O(M + N) steps by a linear systolic array. It has been recently shown that the LCS between run-length-encoded (RLE) strings of m and n runs can be computed by an O(nM + Nm − nm) algorithm that could be executed in O(m + n) steps by a parallel hardware. However, the algorithm cannot be directly mapped on a linear systolic array because of its irregular structure.In this paper, we propose a modified algorithm that exhibits a more regular structure at the cost of a marginal reduction of the efficiency of RLE. We outline the algorithm and we discuss its mapping on a linear systolic array. 相似文献

10.

Valerio Freschi 《Information Processing Letters》2004,90(4):167-173

Data compression can be used to simultaneously reduce memory, communication and computation requirements of string comparison. In this paper we address the problem of computing the length of the longest common subsequence (LCS) between run-length-encoded (RLE) strings. We exploit RLE both to reduce the complexity of LCS computation from O(M×N) to O(mN+Mn−mn), where M and N are the lengths of the original strings and m and n the number of runs in their RLE representation, and to improve the inherent parallelism of the proposed algorithm, so that it may execute in O(m+n) steps on a systolic array of M+N units.We also discuss the application of the proposed algorithm to the related problem of edit distance (ED) computation. 相似文献

11.

Applying VSM and LCS to develop an integrated text retrieval mechanism 总被引：1，自引：0，他引：1

Cheng-Shiun Tasi 《Expert systems with applications》2012,39(4):3974-3982

Text retrieval has received a lot of attention in computer science. In the text retrieval field, the most widely-adopted similarity technique is using vector space models (VSM) to evaluate the weight of terms and using Cosine, Jaccard or Dice to measure the similarity between the query and the texts. However, these similarity techniques do not consider the effect of the sequence of the information. In this paper, we propose an integrated text retrieval (ITR) mechanism that takes the advantage of both VSM and longest common subsequence (LCS) algorithm. The key idea of the ITR mechanism is to use LCS to re-evaluate the weight of terms, so that the sequence and weight relationships between the query and the texts can be considered simultaneously. The results of mathematical analysis show that the ITR mechanism can increase the similarity on Jaccard and Dice similarity measurements when a sequential relationship exists between the query and the texts. 相似文献

12.

An improved algorithm for the longest common subsequence problem

Sayyed Rasoul Mousavi Farzaneh Tabataba 《Computers & Operations Research》2012,39(3):512-520

The Longest Common Subsequence problem seeks a longest subsequence of every member of a given set of strings. It has applications, among others, in data compression, FPGA circuit minimization, and bioinformatics. The problem is NP-hard for more than two input strings, and the existing exact solutions are impractical for large input sizes. Therefore, several approximation and (meta) heuristic algorithms have been proposed which aim at finding good, but not necessarily optimal, solutions to the problem. In this paper, we propose a new algorithm based on the constructive beam search method. We have devised a novel heuristic, inspired by the probability theory, intended for domains where the input strings are assumed to be independent. Special data structures and dynamic programming methods are developed to reduce the time complexity of the algorithm. The proposed algorithm is compared with the state-of-the-art over several standard benchmarks including random and real biological sequences. Extensive experimental results show that the proposed algorithm outperforms the state-of-the-art by giving higher quality solutions with less computation time for most of the experimental cases. 相似文献

13.

Beam search for the longest common subsequence problem

Christian Blum Maria J. Blesa Manuel Lpez-Ibez 《Computers & Operations Research》2009,36(12):3178

The longest common subsequence problem is a classical string problem that concerns finding the common part of a set of strings. It has several important applications, for example, pattern recognition or computational biology. Most research efforts up to now have focused on solving this problem optimally. In comparison, only few works exist dealing with heuristic approaches. In this work we present a deterministic beam search algorithm. The results show that our algorithm outperforms the current state-of-the-art approaches not only in solution quality but often also in computation time. 相似文献

14.

《International Journal of Parallel, Emergent and Distributed Systems》2012,27(1):1-18

The longest common subsequence (LCS) problem is to find an LCS of two given sequences and the length of the LCS. In this paper, an efficient systolic algorithm for the LCS problem is derived. For two sequences of length m and n, where m ≥ n, the problem can be solved with only [n/2] processors in m + 2[n/2] − 1 time steps. Compared with other systolic algorithms that solve the LCS problem, our algorithm not only takes fewer time steps but also uses fewer processors. Our algorithm is better suited to implementation on multicomputers than other systolic algorithms. 相似文献

15.

Shift-or string matching with super-alphabets

Kimmo Fredriksson 《Information Processing Letters》2003,87(4):201-204

Given a text T[1…n] and a pattern P[1…m] over some alphabet Σ of size σ, we want to find all the (exact) occurrences of P in T. The well-known shift-or algorithm solves this problem in time O(n⌈m/w⌉), where w is the number of bits in machine word, using bit-parallelism. We show how to extend the bit-parallelism in another direction, using super-alphabets. This gives a speed-up by a factor s, where s is the number of characters processed simultaneously. The algorithm is implemented, and we show that it works well in practice too. The result is the fastest known algorithm for exact string matching for short patterns and small alphabets. 相似文献

16.

Efficient parameterized string matching

Kimmo Fredriksson Maxim Mozgovoy 《Information Processing Letters》2006,100(3):91-96

In parameterized string matching the pattern P matches a substring t of the text T if there exist a bijective mapping from the symbols of P to the symbols of t. We give simple and practical algorithms for finding all such pattern occurrences in sublinear time on average. The algorithms work for a single and multiple patterns. 相似文献

17.

On-line approximate string matching with bounded errors

Marcos Kiwi 《Theoretical computer science》2011,412(45):6359-6370

相似文献

18.

Scaled and permuted string matching

Ayelet Butman Gad M. Landau 《Information Processing Letters》2004,92(6):293-297

The goal of scaled permuted string matching is to find all occurrences of a pattern in a text, in all possible scales and permutations. Given a text of length n and a pattern of length m we present an O(n) algorithm. 相似文献

19.

Improving practical exact string matching

Branislav ?urian 《Information Processing Letters》2010,110(4):148-152

We present improved variations of the BNDM algorithm for exact string matching. At each alignment our bit-parallel algorithms process a q-gram before testing the state variable. In addition we apply reading a 2-gram in one instruction. Our point of view is practical efficiency of algorithms. Our experiments show that the new variations are faster than earlier algorithms in many cases. 相似文献

20.

Most discriminating segment – Longest common subsequence (MDSLCS) algorithm for dynamic hand gesture classification

Helman Stern Merav Shmueli Sigal Berman 《Pattern recognition letters》2013

In this work, we consider the recognition of dynamic gestures based on representative sub-segments of a gesture, which are denoted as most discriminating segments (MDSs). The automatic extraction and recognition of such small representative segments, rather than extracting and recognizing the full gestures themselves, allows for a more discriminative classifier. A MDS is a sub-segment of a gesture that is most dissimilar to all other gesture sub-segments. Gestures are classified using a MDSLCS algorithm, which recognizes the MDSs using a modified longest common subsequence (LCS) measure. The extraction of MDSs from a data stream uses adaptive window parameters, which are driven by the successive results of multiple calls to the LCS classifier. In a preprocessing stage, gestures that have large motion variations are replaced by several forms of lesser variation. We learn these forms by adaptive clustering of a training set of gestures, where we reemploy the LCS to determine similarity between gesture trajectories. The MDSLCS classifier achieved a gesture recognition rate of 92.6% when tested using a set of pre-cut free hand digit (0–9) gestures, while hidden Markov models (HMMs) achieved an accuracy of 89.5%. When the MDSLCS was tested against a set of streamed digit gestures, an accuracy of 89.6% was obtained. At present the HMMs method is considered the state-of-the-art method for classifying motion trajectories. The MDSLCS algorithm had a higher accuracy rate for pre-cut gestures, and is also more suitable for streamed gestures. MDSLCS provides a significant advantage over HMMs by not requiring data re-sampling during run-time and performing well with small training sets. 相似文献