21 results found; search time 453 ms
1.
The Individual Haplotyping MFR problem is a computational problem that, given a set of DNA sequence fragment data of an individual,
induces the corresponding haplotypes by dropping the minimum number of fragments. Bafna, Istrail, Lancia, and Rizzi proposed
an algorithm of time $O(2^{2k} m^2 n + 2^{3k} m^3)$ for the problem, where $m$ is the number of fragments, $n$ is the number of SNP sites,
and $k$ is the maximum number of holes in a fragment. When there are mate-pairs in the input data, the parameter $k$ can be as large
as 100, which would make the Bafna-Istrail-Lancia-Rizzi algorithm impractical. The current paper introduces a new algorithm,
PM-MFR, whose running time is expressed using the parameters $k_1$ and $k_2$, where $k_1$ is the maximum number of SNP sites that a
fragment covers ($k_1$ is smaller than $n$), and $k_2$ is the maximum number of fragments that cover an SNP site ($k_2$ is usually
about 10). Since the time complexity of the algorithm PM-MFR is not directly related to the parameter $k$, the algorithm solves the
Individual Haplotyping MFR problem with mate-pairs more efficiently and is more practical in real biological applications.
This research was supported in part by the National Natural Science Foundation of China under Grant Nos. 60433020 and 60773111,
the Program for New Century Excellent Talents in University No. NCET-05-0683, the Program for Changjiang Scholars and Innovative
Research Team in University No. IRT0661, and the Scientific Research Fund of Hunan Provincial Education Department under Grant
No. 06C526.
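As a rough illustration of the MFR objective described in this abstract (not of PM-MFR or the Bafna-Istrail-Lancia-Rizzi algorithm), the sketch below brute-forces the minimum number of fragments to drop on a tiny instance. It assumes fragments are equal-length strings over `0`, `1` and `-` (a hole), and that two fragments conflict when they differ at a site both cover; the function names are hypothetical. The remaining fragments are accepted when their conflict graph is 2-colorable, i.e. they split into two conflict-free groups.

```python
from itertools import combinations

HOLE = "-"  # assumed symbol for an uncovered SNP site


def conflict(f, g):
    """True if two fragments disagree at an SNP site that both cover."""
    return any(a != b for a, b in zip(f, g) if a != HOLE and b != HOLE)


def is_bipartite(frags):
    """2-color the conflict graph; success means two conflict-free groups exist."""
    n = len(frags)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if v != u and conflict(frags[u], frags[v]):
                    if color[v] is None:
                        color[v] = 1 - color[u]
                        stack.append(v)
                    elif color[v] == color[u]:
                        return False
    return True


def mfr_brute_force(frags):
    """Smallest number of fragments to drop (exponential time, tiny inputs only)."""
    n = len(frags)
    for dropped in range(n + 1):
        for keep in combinations(range(n), n - dropped):
            if is_bipartite([frags[i] for i in keep]):
                return dropped


# Three mutually conflicting fragments: one of them has to go.
print(mfr_brute_force(["00", "11", "01"]))
```

The exhaustive search is only meant to make the problem statement concrete; its cost grows exponentially with the number of fragments.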
2.
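Haplotype detection has very broad application prospects in locating genes for hereditary diseases, studying pharmacological responses, and identifying individuals. The haplotype assembly problem is the computational problem of determining an individual's haplotypes from that individual's DNA sequencing fragment data according to different optimization criteria. The haplotype assembly models MSR, MFR, MEC, WMLF, and MEC/GI are analyzed and compared in detail, leading to the following conclusions: when no sequencing errors are introduced, the reconstruction accuracy of these models is essentially the same. As the sequencing error increases, the MEC/GI model shows the best fault tolerance and the highest reconstruction accuracy, whereas the MSR model is affected most by sequencing errors and is suitable only when the sequencing error is extremely small.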
3.
Ling-Yun Wu, Zhenping Li, Rui-Sheng Wang, Xiang-Sun Zhang, Luonan Chen 《Mathematics and computers in simulation》2009
Haplotype assembly aims to reconstruct a pair of haplotypes from SNP values observed in a set of individual DNA fragments. In this paper, we focus on the minimum error correction (MEC) model for the haplotype assembly problem and explore self-organizing map (SOM) methods for it. Specifically, haplotype assembly by MEC is formulated as an integer linear programming model. Since the MEC problem is NP-hard and thus cannot be solved exactly within acceptable running time for large-scale instances, we investigate the ability of classical SOMs to solve the haplotype assembly problem under the MEC model. Then, aiming to overcome the limits of classical SOMs, a novel SOM approach is proposed for the problem. Extensive computational experiments on both synthesized and real datasets show that the new SOM-based algorithm can efficiently reconstruct haplotype pairs with very high accuracy under realistic parameter settings. Comparison with previous methods also confirms the superior performance of the new SOM approach.
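To make the MEC objective concrete, here is a minimal sketch of how the MEC score of a candidate haplotype pair could be evaluated. It is not the SOM method or the integer linear program from the paper. The encoding (strings over `0`, `1`, with `-` for uncovered sites) and the function names are assumptions made for illustration: each fragment is charged the number of corrections needed to match its closer haplotype, and the charges are summed.

```python
HOLE = "-"  # assumed symbol for a site the fragment does not cover


def mismatches(fragment, haplotype):
    """Number of covered sites where the fragment differs from the haplotype."""
    return sum(1 for a, h in zip(fragment, haplotype) if a != HOLE and a != h)


def mec_score(fragments, h1, h2):
    """Total corrections needed when each fragment goes to its closer haplotype."""
    return sum(min(mismatches(f, h1), mismatches(f, h2)) for f in fragments)


fragments = ["010-", "-101", "1011"]
# The first two fragments match h1 exactly; the third is one flip away from h2.
print(mec_score(fragments, "0101", "1010"))  # prints 1
```

The MEC problem then asks for the haplotype pair minimizing this score, which is the part that is NP-hard.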
4.
The Eurasian genotype of common reed (Phragmites australis) is one of the most aggressive plants invading North American wetlands. There is, however, little published evidence on establishment patterns of populations along lakes of the St. Lawrence River–Great Lakes watershed. We tested the hypothesis that the recent invasion of Great Lake Saint-François (Québec, Canada) by common reed was facilitated by a dense road system and by intense residential construction activity along lakeshores. A total of 345 and 2914 reed stands were mapped along lakeshores and along the road system of the study area, respectively. The probability of finding a reed stand on a lakeshore increases with proximity to the lake's outlet and to a paved road, but decreases with proximity to a residence built since 1990. It is likely that common reed first spread along the road system, and that wind dispersal of seeds then favored the establishment of populations on lakeshores. Our model does not support the hypothesis that residential construction facilitated the establishment of reed stands, probably because the recent residential construction boom occurred essentially in the southern part of the lake, where the number of roadside reed populations is much lower than in the northern part (lower seed rain). The invasion of Great Lake Saint-François shows that the spread of the plant is not restricted to major river or road systems. Large or small lakes, if subjected to intense diaspore pressure, can also be at risk.
5.
《Journal of Great Lakes research》2020,46(1):225-229
Corbicula fluminea and Dreissena polymorpha are two non-indigenous species (NIS) known to cause biodiversity loss in existing native communities and alterations in ecosystem functioning and structure. Both of these NIS have founded well-established populations in Lakes Maggiore and Garda (Northern Italy). Here, we evaluated the mitochondrial COI genetic diversity of C. fluminea and D. polymorpha populations from these lakes. The COI gene analysis revealed one C. fluminea haplotype, belonging to the FW5 androgenetic invasive lineage. Two D. polymorpha haplotypes, LM1 and LM2, were detected in Lake Maggiore. The comparative phylogeographical haplotype analysis of D. polymorpha between Lake Maggiore and the retrieved COI data available from Lake Garda revealed that LM1 is the dominant haplotype in both populations, whereas LM2, a rare haplotype, was detected only in Lake Maggiore. These findings contribute to a better understanding of the demographic history of these highly invasive species in these Italian lakes, suggesting that C. fluminea and D. polymorpha populations present a similar genetic pattern. The low genetic diversity detected in both of these bivalve populations seems to reflect a pre-existing low genetic pool originating from the introductory source(s).
6.
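Determining the haplotypes of triploid individuals plays an important role in advancing research on the genetic characteristics and phenotypic differences of triploid species. For the minimum error correction with genotype information (MEC/GI) model, an enumeration-based algorithm, EHTR, is proposed for reconstructing the haplotypes of a triploid individual. The algorithm reconstructs, in turn, the value at each single nucleotide polymorphism (SNP) site of the three haplotypes: for a given site, it first enumerates the three possible haplotype value assignments at that site according to its genotype, and then selects the assignment with the highest fragment support as the reconstructed value for that site. The total time complexity of the algorithm is O(mn + m log m + cnl). Experimental test data were generated with two sequencing-fragment simulators, CELSIM and MetaSim, and the reconstruction rate and running time of the algorithms EHTR, GTIHR, W-GA, and Q-PSO were compared under different settings of fragment coverage, error rate, single-fragment length, haplotype length, and Hamming distance between haplotypes. The experimental results show that EHTR achieves a higher reconstruction rate in a shorter running time under the different parameter settings.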
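The per-site step described above (enumerate the assignments allowed by the genotype, keep the best-supported one) can be sketched as follows. This is only an illustration of the idea, not the EHTR implementation: the abstract does not say how fragment support is computed, so the sketch assumes the genotype at a site is given as the number of `1` alleles among the three haplotypes, and that each observation is a hypothetical `(haplotype_index, observed_allele)` pair from a fragment already attributed to one haplotype.

```python
from itertools import permutations


def candidate_assignments(ones):
    """Distinct ways to place `ones` copies of allele 1 on three haplotypes."""
    base = (1,) * ones + (0,) * (3 - ones)
    return sorted(set(permutations(base)))


def pick_site_values(ones, observations):
    """Return the candidate assignment agreeing with the most observations.

    observations: list of (haplotype_index, observed_allele) pairs (assumed input).
    """
    def support(candidate):
        return sum(1 for h, allele in observations if candidate[h] == allele)

    return max(candidate_assignments(ones), key=support)


# Genotype with one copy of allele 1: three candidate assignments exist.
print(candidate_assignments(1))
observations = [(0, 1), (0, 1), (1, 0), (2, 0), (2, 1)]
print(pick_site_values(1, observations))
```

For a heterozygous site (one or two copies of allele 1) the enumeration yields exactly three candidates, which matches the three cases mentioned in the abstract; homozygous sites have a single candidate.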
7.
8.
Calpastatin (CAST) is a specific inhibitor of the ubiquitous calcium-dependent proteases μ-calpain and m-calpain. This proteolytic system plays a key role in the tenderization process that occurs during postmortem storage of meat under refrigerated conditions. In the present study, comparative sequencing revealed seven novel polymorphisms located within the P3 promoter region for exon 1u of the bovine CAST gene: −357 (C/G), −556 (G/T), −557 (A/G), −580 (G/C), −750 (T/C), and two InDels, (A/−) at position −890 and (GTT/−) at position −353/−351. This region directs the expression of type III calpastatin mRNA, encoding the prototypical calpastatin. The genotype frequencies and haplotype distribution were studied in 191 bulls belonging to six cattle breeds. All genotypes were distributed according to the HWE test, and two major combined haplotypes were identified. The frequency of haplotype 1 varied from 0.45 in Aberdeen Angus to 0.82 in Simmental.
9.
A region of a DNA sequence in which recombination is rare and linkage disequilibrium is maintained is known as a “haplotype block” or “LD block”. Many methods are available to identify LD blocks using disequilibrium parameters, such as the well-known Gabriel's method and Zhu's method. We considered that Echelon analysis can be applied to identify LD blocks, and report herein a new method that identifies LD blocks by applying Echelon analysis. The results of numerical examples are also provided.
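As background for the "disequilibrium parameters" mentioned above, the following sketch computes the standard pairwise measures D, |D'| and r² between two biallelic sites from a list of phased haplotypes. It does not implement Echelon analysis, Gabriel's method, or Zhu's method; the haplotype encoding (strings over `0` and `1`) and the function name are assumptions for illustration.

```python
def ld_stats(haplotypes, i, j):
    """Return (D, |D'|, r^2) between biallelic sites i and j, alleles coded '0'/'1'."""
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n   # frequency of allele 1 at site i
    p_b = sum(h[j] == "1" for h in haplotypes) / n   # frequency of allele 1 at site j
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ab - p_a * p_b
    # D' rescales D by its largest attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Perfectly correlated sites versus independent sites.
print(ld_stats(["00", "00", "11", "11"], 0, 1))
print(ld_stats(["00", "01", "10", "11"], 0, 1))
```

Block-finding methods typically start from a matrix of such pairwise values over all site pairs in a region; how that matrix is turned into blocks is where the methods differ.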
10.
Knowledge of haplotypes allows researchers to identify the genetic variation affecting phenotypes such as health, disease, and response to drugs. However, obtaining haplotype data by experimental methods is both time-consuming and expensive. Haplotype inference (HI) from genotypes is a challenging problem in the genetics domain. There are several models for inferring haplotypes from genotypes; one of them, haplotype inference by pure parsimony (HIPP), aims to minimize the number of distinct haplotypes used. HIPP has been proved to be NP-hard. In this paper, a novel binary particle swarm optimization (BPSO) is proposed to solve the HIPP problem. The algorithm was tested on a variety of simulated and real data sets and compared with some current methods. The results showed that the proposed method can obtain the optimal solutions in most cases, i.e., it is a potentially powerful method for HIPP.
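The pure parsimony objective can be illustrated with a small exhaustive search. This is not the BPSO algorithm from the paper. The sketch assumes a common genotype encoding that the abstract does not specify: `0` and `1` for homozygous sites and `2` for heterozygous sites, so a haplotype pair explains a genotype when the two haplotypes agree with it at homozygous sites and differ at heterozygous ones. Function names are hypothetical.

```python
from itertools import product


def resolutions(genotype):
    """All unordered haplotype pairs that explain a genotype string over 0/1/2."""
    het = [i for i, g in enumerate(genotype) if g == "2"]
    pairs = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = list(genotype), list(genotype)
        for i, b in zip(het, bits):
            h1[i] = b
            h2[i] = "1" if b == "0" else "0"
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)


def hipp_brute_force(genotypes):
    """Smallest haplotype set explaining all genotypes (exponential, tiny inputs only)."""
    best = None
    for choice in product(*(resolutions(g) for g in genotypes)):
        used = {h for pair in choice for h in pair}
        if best is None or len(used) < len(best):
            best = used
    return sorted(best)


# "22" can be resolved as 00/11 or 01/10; reusing 00 and 11 is more parsimonious.
print(hipp_brute_force(["22", "00", "11"]))
```

The number of resolutions doubles with each heterozygous site, which is why heuristic searches such as particle swarm optimization are used on realistic inputs.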