首页 | 本学科首页   官方微博 | 高级检索  
     


Exact and approximation algorithms for sorting by reversals,with application to genome rearrangement
Authors:J Kececioglu  D Sankoff
Affiliation:(1) Department of Computer Science, The University of Georgia, 30602 Athens, GA, USA;(2) Centre de recherches mathématiques, Université de Montréal, succursale A, Case postal 6128, H3C 3J7 Montréal, Québec, Canada
Abstract:Motivated by the problem in computational biology of reconstructing the series of chromosome inversions by which one organism evolved from another, we consider the problem of computing the shortest series of reversals that transform one permutation to another. The permutations describe the order of genes on corresponding chromosomes, and areversal takes an arbitrary substring of elements, and reverses their order.For this problem, we develop two algorithms: a greedy approximation algorithm, that finds a solution provably close to optimal inO(n 2) time and0(n) space forn-element permutations, and a branch- and-bound exact algorithm, that finds an optimal solution in0(mL(n, n)) time and0(n 2) space, wherem is the size of the branch- and-bound search tree, andL(n, n) is the time to solve a linear program ofn variables andn constraints. The greedy algorithm is the first to come within a constant factor of the optimum; it guarantees a solution that uses no more than twice the minimum number of reversals. The lower and upper bounds of the branch- and-bound algorithm are a novel application of maximum-weight matchings, shortest paths, and linear programming.In a series of experiments, we study the performance of an implementation on random permutations, and permutations generated by random reversals. For permutations differing byk random reversals, we find that the average upper bound on reversal distance estimatesk to within one reversal fork<1/2n andn<100. For the difficult case of random permutations, we find that the average difference between the upper and lower bounds is less than three reversals forn<50. Due to the tightness of these bounds, we can solve, to optimality, problems on 30 elements in a few minutes of computer time. This approaches the scale of mitochondrial genomes.This research was supported by a postdoctoral fellowship from the Program in Mathematics and Molecular Biology of the University of California at Berkeley under National Science Foundation Grant DMS-8720208, and by a fellowship from the Centre de recherches mathématiques of the Université de Montréal.This research was supported by grants from the Natural Sciences and Engineering Research Council of Canada, and the Fonds pour la formation de chercheurs et l'aide à la recherche (Québec). The author is a fellow of the Canadian Institute for Advanced Research.
Keywords:Computational biology  Approximation algorithms  Branch- and-bound algorithms  Experimental analysis of algorithms  Edit distance  Permutations  Sorting by reversals  Chromosome inversions  Genome rearrangements
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号