The longest common subsequence (LCS) is a widely used measure for comparing strings, particularly in computational biology. Recently, several variants of the longest common subsequence have been introduced to tackle the comparison of genomes. In particular, the Repetition Free Longest Common Subsequence (RFLCS) problem is a variant of the LCS problem that asks for a longest common subsequence of two input strings with no repetition of symbols. In this paper, we investigate the parameterized complexity of RFLCS. First, we show that the problem does not admit a polynomial kernel. Then, we present a randomized FPT algorithm for the RFLCS problem, improving the time complexity of the existing FPT algorithm.

In the longest common subsequence problem, the task is to find the longest sequence of letters that can be found as a subsequence in all members of a given finite set of sequences. The problem is one of the fundamental problems in computer science, with the task of finding a given pattern in a text as an important special case. It has applications in bioinformatics; problem-specific algorithms and facts about its complexity are known. Motivated by reports about good performance of evolutionary algorithms for some instances of this problem, a theoretical analysis of a generic evolutionary algorithm is performed. The general algorithmic framework encompasses EAs as different as steady-state GAs with uniform crossover and randomized hill-climbers. For all these algorithms it is proved that even rather simple special cases of the longest common subsequence problem can neither be solved to optimality nor be solved approximately up to an approximation factor arbitrarily close to 2.

归泳昆 《计算机科学》2008,35(3):264-266
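The longest common subsequence (LCS) and the longest increasing subsequence (LIS) are two classic, fundamental algorithmic problems with important applications in bioinformatics. In 2006, Brodal et al. introduced the longest common weakly increasing subsequence problem (LCWIS) and gave a linear-time algorithm for a 2-letter alphabet and an O(n log n)-time algorithm for a 3-letter alphabet. In this paper, we propose a new algorithm for finding a longest common weakly increasing subsequence (LCWIS) over a 3-letter alphabet. The algorithm uses two well-established data structures: the bounded heap and the van Emde Boas tree. Its time complexity is O(n log log n) and its space complexity is O(n), both of which are the best known so far.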

It is difficult to establish feature correspondences between distant viewpoints for panoramic images. For reliable navigation and for developing a human-like capability of interaction with the surrounding environment, we need a method for reducing the uncertainty in feature tracking. To obtain such a method, we propose to use an algorithm for the longest common subsequence problem for a set of circular strings. We consider an explicit reduction from this problem to the satisfiability problem. This reduction allows us to obtain an efficient algorithm for finding the longest common subsequence of a set of circular strings. We present a general scheme of the method for reducing the uncertainty in feature tracking, and we use the visual homing task to demonstrate the capabilities of our approach. We present experimental results for the method and for novel robot visual homing methods.

The problems of finding a longest common subsequence and a shortest common supersequence of a set of strings are well known. They can be solved in polynomial time for two strings (in fact the problems are dual in this case), or for any fixed number of strings, by dynamic programming. But both problems are NP-hard in general for an arbitrary number k of strings. Here we study the related problems of finding a shortest maximal common subsequence and a longest minimal common supersequence. We describe dynamic programming algorithms for the case of two strings (for which case the problems are no longer dual), which can be extended to any fixed number of strings. We also show that both problems are NP-hard in general for k strings, although the latter problem, unlike shortest common supersequence, is solvable in polynomial time for strings of length 2. Finally, we prove a strong negative approximability result for the shortest maximal common subsequence problem.
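
The duality mentioned above for two strings can be made concrete: a shortest common supersequence of strings of lengths m and n has length m + n minus the LCS length. The following is a minimal sketch of that relation only; the function names `lcs_length` and `scs_length` are illustrative and do not come from the paper, and the sketch does not cover the maximal/minimal variants the paper studies.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence, via the classic two-row DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def scs_length(a, b):
    """Length of a shortest common supersequence of two strings.

    Every symbol of the LCS can be shared, so it is counted once
    instead of twice.
    """
    return len(a) + len(b) - lcs_length(a, b)
```

For example, `scs_length("abac", "cab")` evaluates to 5, matching the supersequence "cabac".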


We design linear-time systolic parallel algorithms that run on two-dimensional arrays, both for computing the length of and for recovering a longest common subsequence of three given sequences, and that are suitable for very large-scale integration (VLSI) implementation. These problems were characterized in [14] as difficult to solve in linear time; our approach, which generalizes the methods used for determining a longest common subsequence of two strings [28,38] to the case of three strings, solves both in linear time. Given three sequences A, B and C of lengths n, m and l (n ≤ m ≤ l), we provide an algorithm that computes the length p of their longest common subsequence on a modular, I/O-bounded, one-way two-dimensional array of nm processors in n + 2m + l − 1 time-steps. To compute a longest common subsequence of the three given strings, we show that each processor of the above array requires O(min{n, p}) local storage to solve the problem in In + 1m + 1 + p − 2 time-steps.

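Application of the longest common subsequence algorithm to text-entry testing   Total citations: 1 (self-citations: 0, citations by others: 1)
Checking the text typed into a Word document is a key technical problem in developing examination software for Office and similar tests. The usual approach is keyword detection: by checking for keywords in the Word file, one can get a rough picture of what the candidate did, but the result is quite inaccurate. Using the longest common subsequence algorithm for text-entry testing solves this problem well.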
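
As a rough illustration of how an LCS-based check can grade typed text more finely than keyword detection, the sketch below scores a candidate's input against the reference text. The function name `typing_accuracy` and the choice to normalize by the longer of the two texts are assumptions made for this example; the abstract does not specify the scoring formula.

```python
def lcs_length(a, b):
    """Length of a longest common subsequence of two strings."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def typing_accuracy(reference, typed):
    """Hypothetical score in [0, 1]: LCS length over the longer text.

    Dividing by the longer text penalizes both omitted and
    superfluous characters (an assumption of this sketch).
    """
    longest = max(len(reference), len(typed))
    if longest == 0:
        return 1.0  # nothing to type and nothing typed
    return lcs_length(reference, typed) / longest
```

With this normalization, a single wrong character in a four-character text, as in `typing_accuracy("abcd", "abxd")`, gives 0.75.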

Finding the longest common subsequence of a given set of input strings is a relevant problem arising in various practical settings. One of these problems is the so-called longest arc-preserving common subsequence problem. This NP-hard combinatorial optimization problem was introduced for the comparison of arc-annotated ribonucleic acid (RNA) sequences. In this work we present an integer linear programming (ILP) formulation of the problem. As the application of a general-purpose ILP solver is not viable even for rather small problem instances, due to the size of the model, we study alternative ways, based on model reduction, to benefit from this ILP model. First, we present a heuristic way of reducing the model, followed by the application of an ILP solver. Second, we propose an iterative hybrid algorithm that uses an ILP solver to generate high-quality solutions at each iteration. Experimental results on artificial and real problem instances show that the proposed techniques outperform an available technique from the literature.

孙焘  朱晓明 《计算机科学》2017,44(2):270-274
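The longest common subsequence of multiple sequences can represent the information those sequences share, and it has important applications in many fields, such as information retrieval and gene sequence matching. Finding the longest common subsequence of multiple sequences is a well-known NP-hard problem and is inherently a problem with multiple solutions. Some approximation algorithms have low time complexity but can produce only a single solution; for sequence sets that have multiple solutions, the result loses a considerable amount of information. A new approximation algorithm is therefore proposed for the longest common subsequence problem. The algorithm introduces the algebraic structure "lattice": it computes the common lattice of two sequences by dynamic programming and then recursively computes the common lattice of the current lattice and the current sequence. The paths in the common lattice store multiple common subsequences, so the final result contains multiple longest common subsequences. The relevant theorems are proved, and experiments verify the correctness of the algorithm.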

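An efficient and effective longest common subsequence (EELCS) algorithm is proposed for video sequence matching. First, vector quantization (VQ) is used to reduce the actual distance computation performed when matching element pairs in the multi-dimensional LCS (MLCS) algorithm to a comparison operation. Compared with the original LCS (OLCS) matching algorithm, this step inherits the advantage of MLCS, namely its applicability to real multi-dimensional time-series matching problems, while greatly reducing the complexity of matching. Next, the algorithm further distinguishes the differences in the sequences being matched that arise from the temporal continuity of matched and unmatched subsequences. Finally, the algorithm is applied to the matching of video clips. Experimental results show that, compared with the representative time-warped LCS (T-WLCS) and continuous LCS (CLCS), the algorithm is better suited to video sequence matching.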

The constrained longest common subsequence problem   Total citations: 1 (self-citations: 0, citations by others: 1)
This paper considers a constrained version of the longest common subsequence problem for two strings. Given strings S1, S2 and P, the constrained longest common subsequence problem for S1 and S2 with respect to P is to find a longest common subsequence lcs of S1 and S2 such that P is a subsequence of this lcs. An O(rn²m²) time algorithm based on the dynamic programming technique is proposed for this new problem, where n, m and r are the lengths of S1, S2 and P, respectively.
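
To make the problem statement concrete, here is a minimal sketch that computes the length of a constrained LCS. It is not the algorithm from the paper: it uses a simpler three-index dynamic program that runs in O(rnm) time, and the function name `constrained_lcs_length` is chosen for this example. The entry `f[k][i][j]` holds the length of a longest common subsequence of the first i symbols of S1 and the first j symbols of S2 that contains the first k symbols of P as a subsequence.

```python
def constrained_lcs_length(s1, s2, p):
    """Length of a longest common subsequence of s1 and s2 that has p
    as a subsequence, or None if no such common subsequence exists."""
    NEG = float("-inf")  # marks infeasible states
    n, m, r = len(s1), len(s2), len(p)
    f = [[[NEG] * (m + 1) for _ in range(n + 1)] for _ in range(r + 1)]
    for k in range(r + 1):
        for i in range(n + 1):
            for j in range(m + 1):
                if i == 0 or j == 0:
                    # An empty prefix can only satisfy an empty constraint.
                    f[k][i][j] = 0 if k == 0 else NEG
                    continue
                # Skip the last symbol of one of the two prefixes.
                best = max(f[k][i - 1][j], f[k][i][j - 1])
                if s1[i - 1] == s2[j - 1]:
                    # Match the symbol without consuming a symbol of p.
                    best = max(best, f[k][i - 1][j - 1] + 1)
                    if k > 0 and s1[i - 1] == p[k - 1]:
                        # Match the symbol and consume p[k - 1] as well.
                        best = max(best, f[k - 1][i - 1][j - 1] + 1)
                f[k][i][j] = best
    result = f[r][n][m]
    return None if result == NEG else result
```

With an empty P the function reduces to the ordinary LCS length, and it returns None when P is not a common subsequence of S1 and S2.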

A longest common subsequence (LCS) of two strings is a common subsequence of the two strings of maximal length. The LCS problem is to find an LCS of two given strings and the length of the LCS (LLCS). In this paper, we present a new linear processor array for solving the LCS problem. The array is based on the parallelization of a recent LCS algorithm that consists of two phases: preprocessing and computation. The computation phase is based on a bit-level dynamic programming approach. Implementations of the preprocessing and computation phases are discussed on the same processor array architecture for the LCS problem. Further, we propose a block processor array architecture that reduces the overall communication and time requirements. Finally, we develop a performance model for estimating the performance of the processor array architecture on Pentium processors.
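
The abstract does not say which bit-level algorithm is parallelized, so the following is only a sketch of one well-known bit-parallel scheme for the LLCS, with the same two-phase shape: a preprocessing phase builds one match bit-vector per symbol, and a computation phase updates a single bit-vector per input symbol. Python integers stand in for machine words here; the function name `llcs_bit_parallel` is illustrative.

```python
def llcs_bit_parallel(a, b):
    """Length of the LCS of a and b using bit-vector dynamic programming."""
    m = len(a)
    mask = (1 << m) - 1

    # Preprocessing: bit i of match[ch] is set when a[i] == ch.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    # Computation: v starts as all ones; each zero bit that appears
    # records a position where the LCS length increased.
    v = mask
    for ch in b:
        u = v & match.get(ch, 0)
        v = ((v + u) | (v - u)) & mask

    # The LLCS is the number of zero bits among the low m bits of v.
    return m - bin(v).count("1")
```

Each symbol of `b` costs a constant number of word operations per machine word of `a`, which is what makes the bit-level formulation attractive for hardware arrays.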

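To address the low recognition rate of convolutional neural networks in speech recognition, this work draws on the theory of the maximum subsequence of sequences: the ground-truth data and the predicted data are treated as two sequences, their maximum subsequence is computed, and the MSLoss loss function is then computed using the Euclidean distance. Using the Minkowski distance and the parameters from the network's backward update, an adaptive convolution kernel algorithm, ACKS, is proposed. It dynamically changes the convolution kernel size according to how the network propagates, improving the model's extraction of data characteristics at different stages...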

The longest common subsequence problem is a classical string problem that concerns finding the common part of a set of strings. It has several important applications, for example, in pattern recognition and computational biology. Most research efforts up to now have focused on solving this problem optimally. In comparison, only a few works deal with heuristic approaches. In this work we present a deterministic beam search algorithm. The results show that our algorithm outperforms the current state-of-the-art approaches not only in solution quality but often also in computation time.
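
The general shape of a beam search for this problem can be sketched as follows. This is a generic illustration under assumptions of its own, not the algorithm evaluated in the paper: the state representation, the ranking heuristic (the shortest remaining suffix, which bounds how far a partial solution can still be extended) and the names `beam_search_lcs` and `beam_width` are all chosen for this example.

```python
def is_subsequence(sub, s):
    """True if sub can be obtained from s by deleting symbols."""
    it = iter(s)
    return all(ch in it for ch in sub)


def beam_search_lcs(strings, beam_width=10):
    """Heuristically build a long common subsequence of all strings."""
    # Only symbols present in every string can ever be appended.
    alphabet = sorted(set(strings[0]).intersection(*map(set, strings[1:])))
    # A state is (next unread position in each string, subsequence so far).
    beam = [(tuple(0 for _ in strings), "")]
    best = ""
    while beam:
        candidates = {}
        for positions, seq in beam:
            for ch in alphabet:
                nxt = []
                for s, pos in zip(strings, positions):
                    idx = s.find(ch, pos)
                    if idx < 0:
                        break  # ch no longer occurs in this string
                    nxt.append(idx + 1)
                else:
                    # Keep one representative per position vector.
                    candidates.setdefault(tuple(nxt), seq + ch)
        if not candidates:
            break
        # Prefer states whose shortest remaining suffix is longest.
        ranked = sorted(
            candidates.items(),
            key=lambda kv: min(len(s) - p for s, p in zip(strings, kv[0])),
            reverse=True,
        )
        beam = ranked[:beam_width]
        best = beam[0][1]
    return best
```

The result is always a common subsequence of the inputs, but it is only guaranteed to be a longest one when the beam is wide enough to keep every state.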

There are two general approaches to the longest common subsequence problem. The dynamic programming approach takes quadratic time but linear space, while the non-dynamic-programming approach takes less time but more space. We propose a new implementation of the latter approach which appears to achieve the best of both time and space for the DNA application.
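
The quadratic-time, linear-space behaviour of the dynamic programming approach can be seen in a short sketch that keeps only two rows of the DP table at a time. This illustrates the first approach only, for computing the LCS length; it is not the implementation proposed in the paper, and the function name is chosen for this example.

```python
def lcs_length_linear_space(a, b):
    """LCS length in O(len(a) * len(b)) time and O(min(len(a), len(b))) space."""
    if len(b) > len(a):
        a, b = b, a  # keep the rows as short as possible
    prev = [0] * (len(b) + 1)  # DP row for the previous symbol of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr  # the older row is no longer needed
    return prev[-1]
```

Recovering an actual subsequence, rather than only its length, in linear space needs an additional divide-and-conquer step that this sketch omits.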

A subsequence of a given string is any string obtained by deleting zero or more symbols from the given string. A longest common subsequence (LCS) of two strings is a common subsequence of both that is at least as long as any other common subsequence. The problem is to find an LCS of two given strings. The bound on the complexity of this problem under the decision tree model is known to be mn if the number of distinct symbols that can appear in the strings is infinite, where m and n are the lengths of the two strings, respectively, and m ≤ n. In this paper, we propose two parallel algorithms for this problem on the CREW-PRAM model. One takes O(log² m + log n) time with mn/log m processors, which is faster than all the existing algorithms on the same model. The other takes O(log² m log log m) time with mn/(log² m log log m) processors when log² m log log m > log n, or otherwise O(log n) time with mn/log n processors, which is optimal in the sense that the time × processors bound matches the complexity bound of the problem. Both algorithms exploit properties of the LCS problem that are discovered in this paper.

The longest common subsequence problem (LCS) and the closest substring problem (CSP) are two models for finding common patterns in strings, and have been studied extensively. Though both LCS and CSP are NP-hard, they exhibit very different behavior with respect to polynomial-time approximation algorithms. While LCS is hard to approximate within n^δ for some δ > 0, CSP admits a polynomial-time approximation scheme. In this paper, we study the longest common rigid subsequence problem (LCRS). This problem shares similarity with both LCS and CSP and has an important application in motif finding in biological sequences. We show that it is NP-hard to approximate LCRS within ratio n^δ, for some constant δ > 0, where n is the maximum string length. We also show that it is NP-hard to approximate LCRS within ratio Ω(m), where m is the number of strings.

《Parallel Computing》1986,3(3):217-229
We consider the case of a 2-dimensional wavefront array processor where only one wavefront appears at any time. We show that in such a situation, this 2-dimensional wavefront processor can be mapped to a linear array processor if the wavefronts never backtrack. The mapping will not increase the number of registers in each processor element. Two examples, the spoken word recognition problem and the longest common subsequence problem, are given to demonstrate the feasibility of this method.
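
For the longest common subsequence example, the wavefront structure can be seen by filling the DP table one anti-diagonal at a time: every cell on the anti-diagonal i + j = d depends only on cells from the two preceding anti-diagonals, so all cells of one wavefront could be computed simultaneously. The sketch below is a sequential simulation of that evaluation order, with an illustrative function name; it does not model the paper's mapping onto a linear array.

```python
def lcs_length_wavefront(a, b):
    """LCS length computed in anti-diagonal (wavefront) order."""
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    # d = i + j indexes the wavefront; cells with i == 0 or j == 0 stay 0.
    for d in range(2, n + m + 1):
        # All cells visited by this inner loop are mutually independent.
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]
```

Because each wavefront moves strictly forward through the table, it never revisits earlier cells, which is the "no backtracking" situation the abstract refers to.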

A longest common subsequence (LCS) of two strings is a common subsequence of the two strings of maximal length. The LCS problem is to find an LCS of two given strings and the length of the LCS (LLCS). In this paper, a fast linear systolic algorithm that improves on previous systolic algorithms for solving the LCS problem is presented. For two given strings of length m and n, where m ≥ n, the LLCS and an LCS can be found in m + 2n − 1 time steps. This algorithm achieves the tight lower bound on the time complexity in the setting where symbols are input sequentially to a linear array of n processors. The systolic algorithm can be modified to take only m + n steps on multicomputers by using the scatter operation.

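Research and implementation of program code similarity measurement   Total citations: 1 (self-citations: 1, citations by others: 0)
To address the problem of measuring the similarity of program code, a method that combines attribute counting with structure metrics is proposed. By counting the operators and operands in the program source code, the Halstead length, Halstead vocabulary, and Halstead volume are derived as the program's feature vector. Attribute similarity is computed as the cosine of the angle between the vectors, and structural similarity is obtained with the longest common subsequence algorithm, which together measure how similar a pair of programs is. Experimental results show that the method can effectively detect similar program code in student assignments.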
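
The two measures described above can be sketched as follows. The Halstead definitions used are the standard ones (length N = N1 + N2, vocabulary n = n1 + n2, volume V = N log2 n). Everything else is an assumption of this example rather than a detail from the paper: the function names, the representation of a program as a token list, and the normalization of the LCS length by the longer token list.

```python
import math


def halstead_vector(n1, n2, N1, N2):
    """Feature vector (length, vocabulary, volume) from operator/operand counts.

    n1, n2: numbers of distinct operators and operands.
    N1, N2: total numbers of operators and operands.
    """
    length = N1 + N2
    vocabulary = n1 + n2
    volume = length * math.log2(vocabulary)
    return (length, vocabulary, volume)


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def lcs_length(a, b):
    """Length of a longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def structural_similarity(tokens_a, tokens_b):
    """Hypothetical structural score: LCS length over the longer token list."""
    longest = max(len(tokens_a), len(tokens_b))
    return lcs_length(tokens_a, tokens_b) / longest if longest else 1.0
```

How the attribute and structural scores are combined into one verdict is not stated in the abstract, so the sketch leaves them as two separate values.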
