Evolution of simple sequence repeats
Affiliation:1. Department of Cardiovascular and Thoracic Surgery, Cardiovascular Innovation Institute, University of Louisville School of Medicine, Louisville, KY, United States;2. Department of Surgery, Artificial Organs Laboratory, University of Maryland School of Medicine, Baltimore, MD, United States;3. Department of Clinical Engineering, University of Maryland Medical Center, Baltimore, MD, United States;1. Department of Life Sciences and Systems Biology, University of Torino, Torino, Italy;2. Department of Chemistry, University of Oxford, Oxford, UK;3. Department of Molecular Biotechnology and Health, University of Torino, Torino, Italy;4. Agroinnova, Centre of Competence for the Innovation in the Agro-Environmental Sector, University of Torino, Largo Paolo Braccini 2, Grugliasco, Torino, Italy
Abstract: Simple Sequence Repeats (SSRs) are common and frequently polymorphic in eukaryote DNA. Many are subject to high rates of length mutation, in which a gain or loss of one repeat unit is most often observed. Can the observed abundances and their length distributions be explained as the result of an unbiased random walk, starting from some initial repeat length? To address this question, we have considered two models for an unbiased random walk on the integers, n (n ≥ n0). The first is a continuous time process (Birth and Death Model or BDM) in which the probability of a transition to n + 1 or n − 1 is λk per unit time, with k = n − n0 + 1. The second is a discrete time model (Random Walk Model or RWM), in which a transition is made at each time step, either to n − 1 or to n + 1. In each case the walks start at length n0, with new walks being generated at a steady rate, S, the source rate, determined by a base substitution rate of mutation from neighboring sequences. Each walk terminates whenever n reaches n0 − 1 or at some time, T, which reflects the contamination of pure repeat sequences by other mutations that remove them from consideration, either because they fail to satisfy the criteria for repeat selection from some database or because they can no longer undergo efficient length mutations. For infinite T, the results are particularly simple for N(k), the expected number of repeats of length n = k + n0 − 1, being, for BDM, N(k) = S/(λk), and for RWM, N(k) = 2S. In each case, there is a cut-off value of k for finite T, namely k = λT ln 2 for BDM and k = 0.57√T for RWM; for larger values of k, N(k) becomes rapidly smaller than the infinite time limit. We argue that these results may be compared with SSR length distributions averaged over many loci, but not with the distribution at a particular locus, for which founder effects are important.
For the data of [Beckmann & Weber (1992), Genomics 12, 627] on GT·AC repeats in the human, each model gives a reasonable fit to the data, with the source at two repeat units (n0 = 2). Both the absolute number of loci and their length distribution are well represented.
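
The RWM limit N(k) = 2S quoted in the abstract can be checked numerically. What follows is a minimal Python sketch, not the author's code: the function name `rwm_length_distribution` and its parameters are hypothetical, and it assumes that S new walks enter at k = 1 (n = n0) on every time step and that a walk is counted at ages 0 through T inclusive. Rather than sampling, it propagates the exact probability that a walk of a given age sits at each k, with k = 0 absorbing.

```python
def rwm_length_distribution(source_rate, max_age, k_max):
    """Expected number of repeats N(k) for k = 0..k_max under the discrete
    random walk (RWM), with k = n - n0 + 1.

    Assumptions (not taken from the source): `source_rate` new walks start at
    k = 1 on every time step, and each walk is counted at ages 0..max_age.
    """
    size = max_age + 3            # a walk of age t cannot be above k = 1 + t
    prob = [0.0] * size           # prob[k]: chance a walk of this age is at k
    prob[1] = 1.0                 # every walk starts at k = 1
    visits = [0.0] * size         # sum over ages of prob[k]

    for age in range(max_age + 1):
        top = age + 1             # highest k reachable at this age
        nxt = [0.0] * size
        for k in range(1, top + 1):
            p = prob[k]
            if p == 0.0:
                continue
            visits[k] += p
            nxt[k + 1] += 0.5 * p         # gain of one repeat unit
            if k > 1:
                nxt[k - 1] += 0.5 * p     # loss of one repeat unit
            # a loss from k = 1 reaches k = 0 and ends the walk
        prob = nxt

    # In steady state, N(k) is the source rate times the expected number of
    # time steps a single walk spends at k before it terminates.
    return [source_rate * (visits[k] if k < size else 0.0)
            for k in range(k_max + 1)]


if __name__ == "__main__":
    S, T = 1.0, 1000
    counts = rwm_length_distribution(S, T, 100)
    for k in (1, 5, 10, 20, 40, 80):
        print(f"k = {k:3d}   N(k)/S = {counts[k] / S:.3f}")
```

For small k the printed ratios should sit just below 2, and they fall away as k grows past a value of order √T, which is the qualitative behaviour the abstract describes. As a hand-checkable case, `rwm_length_distribution(1.0, 2, 3)` returns `[0.0, 1.25, 0.5, 0.25]`.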