期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Approximation algorithms for tree alignment with a given phylogeny 总被引：3，自引：0，他引：3

Lusheng Wang Tao Jiang E. L. Lawler 《Algorithmica》1996,16(3):302-315

We study the following fundamental problem in computational molecular biology: Given a set of DNA sequences representing some species and a phylogenetic tree depicting the ancestral relationship among these species, compute an optimal alignment of the sequences by the means of constructing a minimum-cost evolutionary tree. The problem is an important variant of multiple sequence alignment, and is widely known astree alignment. We design an efficient approximation algorithm with performance ratio 2 for tree alignment. The algorithm is then extended to a polynomial-time approximation scheme. The construction actually works for Steiner trees in any metric space, and thus implies a polynomial-time approximation scheme for planar Steiner trees under a given topology (with any constant degree). To our knowledge, this is the first polynomial-time approximation scheme in the fields of computational biology and Steiner trees. The approximation algorithms may be useful in evolutionary genetics practice as they can provide a good initial alignment for the iterative method in [23].Supported in part by NSERC Operating Grant OGP0046613.Supported in part by NSERC Operating Grant OGP0046613 and a Canadian Genome Analysis and Technology Research Grant.Supported in part by US Department of Energy Grant DE-FG03-90ER6099. 相似文献

2.

Non-approximability of weighted multiple sequence alignment for arbitrary metrics

Bodo Manthey 《Information Processing Letters》2005,95(3):389-395

We prove that the multiple sequence alignment problem with weighted sum-of-pairs score is APX-hard for arbitrary metric scoring functions over the binary alphabet. This holds even when the weights are restricted to zero and one. 相似文献

3.

On exact solutions to the Euclidean bottleneck Steiner tree problem

Sang Won Bae Chunseok Lee 《Information Processing Letters》2010,110(16):672-678

We study the Euclidean bottleneck Steiner tree problem: given a set P of n points in the Euclidean plane and a positive integer k, find a Steiner tree with at most k Steiner points such that the length of the longest edge in the tree is minimized. This problem is known to be NP-hard even to approximate within ratio and there was no known exact algorithm even for k=1 prior to this work. In this paper, we focus on finding exact solutions to the problem for a small constant k. Based on geometric properties of optimal location of Steiner points, we present an optimal -time exact algorithm for k=1 and an O(n²)-time algorithm for k=2. Also, we present an optimal -time exact algorithm for any constant k for a special case where there is no edge between Steiner points. 相似文献

4.

A high performance multiple sequence alignment system for pyrosequencing reads from multiple reference genomes

Fahad SaeedAuthor VitaeAlan Perez-RathkeAuthor Vitae Jaroslaw GwarnickiAuthor VitaeTanya Berger-WolfAuthor Vitae Ashfaq KhokharAuthor Vitae 《Journal of Parallel and Distributed Computing》2012,72(1):83-93

Genome resequencing with short reads generated from pyrosequencing generally relies on mapping the short reads against a single reference genome. However, mapping of reads from multiple reference genomes is not possible using a pairwise mapping algorithm. In order to align the reads w.r.t each other and the reference genomes, existing multiple sequence alignment(MSA) methods cannot be used because they do not take into account the position of these short reads with respect to the genome, and are highly inefficient for a large number of sequences. In this paper, we develop a highly scalable parallel algorithm based on domain decomposition, referred to as P-Pyro-Align, to align such a large number of reads from single or multiple reference genomes. The proposed alignment algorithm accurately aligns the erroneous reads, and has been implemented on a cluster of workstations using MPI library. Experimental results for different problem sizes are analyzed in terms of execution time, quality of the alignments, and the ability of the algorithm to handle reads from multiple haplotypes. We report high quality multiple alignment of up to 0.5 million reads. The algorithm is shown to be highly scalable and exhibits super-linear speedups with increasing number of processors. 相似文献

5.

Large scale multiple sequence alignment with simultaneous phylogeny inference

Gilles Denis Jaroslaw 《Journal of Parallel and Distributed Computing》2006,66(12):1534-1545

Multiple sequence alignment (MSA) and phylogenetic tree reconstruction are one of the most important problems in the computational biology. While both these problems are of great practical significance, in most cases they are very computationally demanding. In this paper we propose a new approach to the MSA problem which simultaneously infers an underlying phylogenetic tree. To process large data sets we provide parallel implementation of our method, which is based on the distributed caching of intermediate results. Finally, we show a parallel server designed for grid environments, and we report results of experiments performed with actual biological data, e.g. 1000 ribosomal RNA sequences. 相似文献

6.

On the partial terminal Steiner tree problem 总被引：1，自引：0，他引：1

Sun-Yuan Hsieh Huang-Ming Gao 《The Journal of supercomputing》2007,41(1):41-52

We investigate a practical variant of the well-known graph Steiner tree problem. For a complete graph G = ( V,E ) with length function l:E → R ⁺ and two vertex subsets R ⊂ V and R ′ ⊆ R, a partial terminal Steiner tree is a Steiner tree which contains all vertices in R such that all vertices in R ∖ R ′ belong to the leaves of this Steiner tree. The partial terminal Steiner tree problem is to find a partial terminal Steiner tree T whose total lengths ∑ _{(u,v)
∈T} l ( u,v ) is minimum. In this paper, we show that the problem is both NP-complete and MAX SNP-hard when the lengths of edges are restricted to either 1 or 2. We also provide an approximation algorithm for the problem.

Sun-Yuan HsiehEmail:

相似文献

7.

On approximation algorithms for the terminal Steiner tree problem 总被引：1，自引：0，他引：1

Doratha E. Drake 《Information Processing Letters》2004,89(1):15-18

The terminal Steiner tree problem is a special version of the Steiner tree problem, where a Steiner minimum tree has to be found in which all terminals are leaves. We prove that no polynomial time approximation algorithm for the terminal Steiner tree problem can achieve an approximation ratio less than (1−o(1))lnn unless NP has slightly superpolynomial time algorithms. Moreover, we present a polynomial time approximation algorithm for the metric version of this problem with a performance ratio of 2ρ, where ρ denotes the best known approximation ratio for the Steiner tree problem. This improves the previously best known approximation ratio for the metric terminal Steiner tree problem of ρ+2. 相似文献

8.

On the approximability of two tree drawing conventions

Paolo Penna 《Information Processing Letters》2002,82(5):237-242

We consider two aesthetic criteria for the visualization of rooted trees: inclusion and tip-over. Finding the minimum area layout according to either of these two standards is an NP-hard task, even when we restrict ourselves to binary trees.We provide a fully polynomial time approximation scheme for this problem. This result applies to any tree for tip-over layouts and to bounded degree trees in the case of the inclusion convention. We also prove that such restriction is necessary since, for unbounded degree trees, the inclusion problem is strongly NP-hard. Hence, neither a fully polynomial time approximation scheme nor a pseudopolynomial time algorithm exists, unless P=NP. Our technique, combined with the parallel algorithm by Metaxas et al. [Comput. Geom. 9 (1998) 145-158], also yields an NC fully parallel approximation scheme. This latter result holds for inclusion of binary trees and for the slicing floorplanning problem. Although this problem is in P, it is unknown whether it belongs to NC or not. All the above results also apply to other size functions of the drawing (e.g., the perimeter). 相似文献

9.

On the terminal Steiner tree problem

Guohui Lin Guoliang Xue 《Information Processing Letters》2002,84(2):103-107

We investigate a practical variant of the well-known graph Steiner tree problem. In this variant, every target vertex is required to be a leaf vertex in the solution Steiner tree. We present hardness results for this variant as well as a polynomial time approximation algorithm with performance ratio ρ+2, where ρ is the best-known approximation ratio for the graph Steiner tree problem. 相似文献

10.

Approximation Algorithms for Bounded Degree Phylogenetic Roots

Zhi-Zhong Chen 《Algorithmica》2008,51(1):1-23

The Degree- Δ Closest Phylogenetic k th Root Problem (ΔCPR_k) is the problem of finding a (phylogenetic) tree T from a given graph G=(V,E) such that (1) the degree of each internal node in T is at least 3 and at most Δ, (2) the external nodes (i.e. leaves) of T are exactly the elements of V, and (3) the number of disagreements, i.e., |E ⊕{{u,v} : u,v are leaves of T and d _T(u,v)≤k}|, is minimized, where d _T(u,v) denotes the distance between u and v in tree T. This problem arises from theoretical studies in evolutionary biology and generalizes several important combinatorial optimization problems such as the maximum matching problem. Unfortunately, it is known to be NP-hard for all fixed constants Δ,k such that either both Δ≥3 and k≥3, or Δ>3 and k=2. This paper presents a polynomial-time 8-approximation algorithm for Δ CPR ₂ for any fixed Δ>3, a quadratic-time 12-approximation algorithm for 3CPR ₃, and a polynomial-time approximation scheme for the maximization version of Δ CPR _k for any fixed Δ and k. 相似文献

11.

On the approximation of protein threading

Tatsuya Akutsu Satoru Miyano 《Theoretical computer science》1999,210(2):945-275

In this paper, we study the protein threading problem, which was proposed for predicting a folded 3D protein structure from an amino acid sequence. Since this problem was already proved to be NP-hard, we study polynomial time approximation algorithms. We show several hardness results for the approximation, which includes a MAX SNP-hardness result. We also show approximation algorithms for a special case and a general case, where a graph representing interactions between amino acid residues is restricted to be planar in a special case. For this special case, we obtain a constant approximation ratio. 相似文献

12.

Minimum-link watchman tours

Esther M Arkin Joseph S.B Mitchell Christine D Piatko 《Information Processing Letters》2003,86(4):203-207

We consider the problem of computing a watchman route in a polygon with holes. We show that the problem of finding a minimum-link watchman route is NP-complete, even if the holes are all convex. The proof is based on showing that the related problem of finding a minimum-link tour on a set of points in the plane is NP-complete. We provide a provably good approximation algorithm that achieves an approximation factor of O(logn). 相似文献

13.

On the hardness of approximating Max-Satisfy

Uriel Feige Daniel Reichman 《Information Processing Letters》2006,97(1):31-35

Max-Satisfy is the problem of finding an assignment that satisfies the maximum number of equations in a system of linear equations over Q. We prove that unless NP⊂BPP Max-Satisfy cannot be efficiently approximated within an approximation ratio of 1/n1−?, if we consider systems of n linear equations with at most n variables and ?>0 is an arbitrarily small constant. Previously, it was known that the problem is NP-hard to approximate within a ratio of 1/nα, but 0<α<1 was some specific constant that could not be taken to be arbitrarily close to 1. 相似文献

14.

Elementary approximation algorithms for prize collecting Steiner tree problems

Shai Gutner 《Information Processing Letters》2008,107(1):39-44

This paper deals with approximation algorithms for the prize collecting generalized Steiner forest problem, defined as follows. The input is an undirected graph G=(V,E), a collection T={T₁,…,T_k}, each a subset of V of size at least 2, a weight function , and a penalty function . The goal is to find a forest F that minimizes the cost of the edges of F plus the penalties paid for subsets T_i whose vertices are not all connected by F. Our main result is a -approximation for the prize collecting generalized Steiner forest problem, where n2 is the number of vertices in the graph. This obviously implies the same approximation for the special case called the prize collecting Steiner forest problem (all subsets T_i are of size 2). The approximation algorithm is obtained by applying the local ratio method, and is much simpler than the best known combinatorial algorithm for this problem.Our approach gives a -approximation for the prize collecting Steiner tree problem (all subsets T_i are of size 2 and there is some root vertex r that belongs to all of them). This latter algorithm is in fact the local ratio version of the primal-dual algorithm of Goemans and Williamson [M.X. Goemans, D.P. Williamson, A general approximation technique for constrained forest problems, SIAM Journal on Computing 24 (2) (April 1995) 296–317]. Another special case of our main algorithm is Bar-Yehuda's local ratio -approximation for the generalized Steiner forest problem (all the penalties are infinity) [R. Bar-Yehuda, One for the price of two: a unified approach for approximating covering problems, Algorithmica 27 (2) (June 2000) 131–144]. Thus, an important contribution of this paper is in providing a natural generalization of the framework presented by Goemans and Williamson, and later by Bar-Yehuda. 相似文献

15.

Improved approximation of the largest common subtree of two unordered trees of bounded height

Tatsuya Akutsu Daiji Fukagawa Atsuhiro Takasu 《Information Processing Letters》2008,109(2):165-170

We present a polynomial time 1.5h-approximation algorithm for the problem of finding the largest common subtree between two rooted, labeled, and unordered trees of height at most h, where a tree S is called a subtree of a tree T if S is obtained from T by deletion of some nodes in T. This result improves the previous 2h-approximation algorithm. 相似文献

16.

A note on the terminal Steiner tree problem

Bernhard Fuchs 《Information Processing Letters》2003,87(4):219-220

In 2002, Lin and Xue [Inform. Process. Lett. 84 (2002) 103-107] introduced a variant of the graph Steiner tree problem, in which each terminal vertex is required to be a leaf in the solution Steiner tree. They presented a ρ+2 approximation algorithm, where ρ is the approximation ratio of the best known efficient algorithm for the regular graph Steiner tree problem. In this note, we derive a simplified algorithm with an improved approximation ratio of 2ρ (currently ρ≈1.550, see [SODA 2000, 2000, pp. 770-790]). 相似文献

17.

Exact and approximation algorithms for sorting by reversals,with application to genome rearrangement 总被引：6，自引：0，他引：6

J. Kececioglu D. Sankoff 《Algorithmica》1995,13(1-2):180-210

Motivated by the problem in computational biology of reconstructing the series of chromosome inversions by which one organism evolved from another, we consider the problem of computing the shortest series of reversals that transform one permutation to another. The permutations describe the order of genes on corresponding chromosomes, and areversal takes an arbitrary substring of elements, and reverses their order.For this problem, we develop two algorithms: a greedy approximation algorithm, that finds a solution provably close to optimal inO(n ²) time and0(n) space forn-element permutations, and a branch- and-bound exact algorithm, that finds an optimal solution in0(mL(n, n)) time and0(n ²) space, wherem is the size of the branch- and-bound search tree, andL(n, n) is the time to solve a linear program ofn variables andn constraints. The greedy algorithm is the first to come within a constant factor of the optimum; it guarantees a solution that uses no more than twice the minimum number of reversals. The lower and upper bounds of the branch- and-bound algorithm are a novel application of maximum-weight matchings, shortest paths, and linear programming.In a series of experiments, we study the performance of an implementation on random permutations, and permutations generated by random reversals. For permutations differing byk random reversals, we find that the average upper bound on reversal distance estimatesk to within one reversal fork<1/2n andn<100. For the difficult case of random permutations, we find that the average difference between the upper and lower bounds is less than three reversals forn<50. Due to the tightness of these bounds, we can solve, to optimality, problems on 30 elements in a few minutes of computer time. This approaches the scale of mitochondrial genomes.This research was supported by a postdoctoral fellowship from the Program in Mathematics and Molecular Biology of the University of California at Berkeley under National Science Foundation Grant DMS-8720208, and by a fellowship from the Centre de recherches mathématiques of the Université de Montréal.This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada, and the Fonds pour la formation de chercheurs et l'aide à la recherche (Québec). The author is a fellow of the Canadian Institute for Advanced Research. 相似文献

18.

A note on the minimum label spanning tree

Yingyu Wan Guoliang ChenYinlong Xu 《Information Processing Letters》2002,84(2):99-101

We give a tight analysis of the greedy algorithm introduced by Krumke and Wirth for the minimum label spanning tree problem. The algorithm is shown to be a (ln(n−1)+1)-approximation for any graph with n nodes (n>1), which improves the known performance guarantee 2lnn+1. 相似文献

19.

基于最小生成树的多序列联配算法

胡桂武郑启伦彭宏邓伟林《计算机科学》2005,32(4):59-61

多序列联配(MAS)是现代生物信息学中的重要工具之一,MAS问题是NP-难的,因此需要一些启发式方法在合理的时间内联配大的数据集。本文提出了一个基于最小生成树的多序列联配算法,并使用BALiBASE标准数据集合,对我们的算法进行了性能评价,结果表明算法较之ClustalX类的算法其精确度更高。相似文献

20.

Sublinear approximate string matching and biological applications

W. I. Chang E. L. Lawler 《Algorithmica》1994,12(4-5):327-344

Given a text string of lengthn and a pattern string of lengthm over ab-letter alphabet, thek differences approximate string matching problem asks for all locations in the text where the pattern occurs with at mostk differences (substitutions, insertions, deletions). We treatk not as a constant but as a fraction ofm (not necessarily constant-fraction). Previous algorithms require at leastO(kn) time (or exponential space). We give an algorithm that is sublinear time0((n/m)k log _b m) when the text is random andk is bounded by the threshold m/(log_b m + O(1)). In particular, whenk=o(m/log_b m) the expected running time iso(n). In the worst case our algorithm is O(kn), but is still an improvement in that it is practical and uses0(m) space compared with0(n) or0(m ²). We define three problems motivated by molecular biology and describe efficient algorithms based on our techniques: (1) approximate substring matching, (2) approximate-overlap detection, and (3) approximate codon matching. Respectively, applications to biology are local similarity search, sequence assembly, and DNA-protein matching.This work was supported in part by NSF Grants CCR-87-04184 and FD-89-02813; by the Human Genome Center, Lawrence Berkeley Laboratory, supported by the Director, Office of Health and Environmental Research, of the U.S. Department of Energy under Contract DE-AC03-76SF00098; and by Department of Energy Grants DE-FG03-90ER60999 and DE-FG02-91ER61190. Earlier versions of this paper appeared as [8] and part of [5]. 相似文献