Similar literature
1.
Prolinx®, Inc. of Bothell, WA, has developed the RapXtract™ 384 Dye Terminator Removal Kit for full automation of DNA sequencing reaction purification. The RapXtract product line is based upon proprietary superparamagnetic particle technology that eliminates the need for centrifugation, vacuum filtration, or modified primers to achieve purification of sequencing reactions. The kit described here is pre-dispensed in a 384-well microtiter plate and run on the TECAN GENESIS Workstation 150 (Tecan U.S. Inc., Research Triangle Park, NC). This system enables rapid purification of up to 384 sequencing reactions in a single run. As the completion of the Human Genome Project nears, it is imperative for biotechnology and pharmaceutical companies to increase throughput of DNA sequencing in order to be competitive in the drug discovery and validation process. The “race to market” requires a shift from standard DNA sequencing processes, including DNA sequencing reaction purification, towards complete walk-away automation for all steps. Existing sequencing reaction purification methods (Table 1) require considerable resources, including plastic and other laboratory consumables; specialized equipment, such as high-speed centrifuges or vacuum filtration apparatus; and labor-intensive protocols requiring large amounts of technician time. As a result, walk-away automation of standard purification methods is difficult and expensive.

2.
The problem of subclone identification for DNA fragments of a known nucleotide sequence has been considered. We suggest a strategy for rapid identification of a large number of subclones based on: (i) partial sequencing of the subclone DNA (single nucleotide track); (ii) representation of the result in the form of a numeric code showing the distribution of the chosen nucleotide along the sequence; and (iii) identification of the subclone sequence using this code in a catalogue compiled and printed for a whole DNA sequence. The same approach is applicable when the subclones are expected to have homology with known sequences.
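
The three steps of abstract 2 can be illustrated with a small sketch. The abstract does not say how the numeric code is built, so the encoding used here (distances between consecutive occurrences of the chosen nucleotide) and the function names `track_positions`, `track_code`, and `locate` are assumptions made for illustration only, not the authors' scheme.

```python
def track_positions(seq, base):
    """Positions (0-based) at which the chosen nucleotide occurs."""
    return [i for i, c in enumerate(seq) if c == base]


def track_code(seq, base):
    """Hypothetical numeric code: gaps between consecutive occurrences of `base`."""
    pos = track_positions(seq, base)
    return [b - a for a, b in zip(pos, pos[1:])]


def locate(reference, fragment_code, base):
    """Return reference positions where the fragment's gap pattern occurs.

    This plays the role of the 'catalogue' lookup: the code of the whole
    reference sequence is scanned for the code of the subclone.
    """
    pos = track_positions(reference, base)
    ref_code = [b - a for a, b in zip(pos, pos[1:])]
    n = len(fragment_code)
    if n == 0:
        return []
    return [
        pos[i]
        for i in range(len(ref_code) - n + 1)
        if ref_code[i:i + n] == fragment_code
    ]


if __name__ == "__main__":
    reference = "GGATTGCATCCTA"
    code = track_code("TTGCAT", "T")       # single-nucleotide (T) track of a subclone
    print(code, locate(reference, code, "T"))
```

A longer track gives a longer code and therefore fewer accidental matches, which is why a partial, single-nucleotide read can be enough to place a subclone in a known sequence.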

3.
Demands for higher quantity and quality of sequence data during genome sequencing projects have led to a need for completely automated reagent systems designed to isolate, process, and analyze DNA samples. While much attention has been given to methodologies aimed at increasing the throughput of sample preparation and reaction setup, purification of the products of sequencing reactions has received less scrutiny despite the profound influence that purification has on sequence quality. Commonly used and commercially available sequencing reaction cleanup methods are not optimal for purifying sequencing reactions generated from larger templates, including bacterial artificial chromosomes (BACs) and those generated by rolling circle amplification. Theoretically, these methods would not remove the original template since they only exclude small molecules and retain large molecules in the sample. If the large template remains in the purified sample, it could understandably interfere with electrokinetic injection and capillary performance. We demonstrate that the use of MagneSil® paramagnetic particles (PMPs) to purify ABI PRISM® BigDye® sequencing reactions increases the quality and read length of sequences from large templates. The high-quality sequence data obtained by our procedure is independent of the size of template DNA used and can be completely automated on a variety of automated platforms.

4.
Using a single robotic platform, the GeneTAC™ G3, we have automated most of the processes involved in the cloning and characterisation of novel disease-causing genes, in four steps: first, identifying the BACs of interest and making shotgun libraries; second, automating the setup of sequencing reactions using methodology that eliminates the need for DNA preparation of 384 clones; third, generating sublibraries using selective re-arraying of library clones to enable the determination of the entire genomic sequence of the gene; and fourth, determining gene function by a combination of differential screening and mini Northerns using microarrays printed with the GeneTAC™ G3 system and hybridised with the GeneTAC™ HybStation (Genomics Solutions, Ann Arbor, USA).

5.
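Guide positioning sequencing (GPS) is a new sequencing technology for genome-wide DNA methylation detection; its sequencing data offer advantages such as low cost and no sequence bias. Currently, the most important step in methylation analysis is aligning the sequenced reads to a reference genome. However, the existing GPS method uses Smith-Waterman for local sequence alignment, which is too time-consuming and prone to misjudging alignment positions. An improved alignment algorithm for GPS data is therefore proposed. It exploits paired-end sequencing: reads from the methylated end are aligned first, and for reads that match multiple positions, the reads from the conventional end are used to determine the alignment position. Experimental results show that the proposed method matches the accuracy of the existing method while achieving a higher unique-alignment rate and a more than threefold improvement in running time.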

6.
Combinatorial algorithms for DNA sequence assembly (total citations: 7; self-citations: 0; citations by others: 7)
The trend toward very large DNA sequencing projects, such as those being undertaken as part of the Human Genome Program, necessitates the development of efficient and precise algorithms for assembling a long DNA sequence from the fragments obtained by shotgun sequencing or other methods. The sequence reconstruction problem that we take as our formulation of DNA sequence assembly is a variation of the shortest common superstring problem, complicated by the presence of sequencing errors and reverse complements of fragments. Since the simpler superstring problem is NP-hard, any efficient reconstruction procedure must resort to heuristics. In this paper, however, a four-phase approach based on rigorous design criteria is presented, and has been found to be very accurate in practice. Our method is robust in the sense that it can accommodate high sequencing error rates, and can list a series of alternate solutions in the event that several appear equally good. Moreover, it uses a limited form of multiple sequence alignment to detect, and often correct, errors in the data. Our combined algorithm has successfully reconstructed nonrepetitive sequences of length 50,000 sampled at error rates of as high as 10%. This research was supported by the National Library of Medicine under Grant R01-LM4960, by a postdoctoral fellowship from the Program in Mathematics and Molecular Biology of the University of California at Berkeley under National Science Foundation Grant DMS-8720208, and by a fellowship from the Centre de recherches mathématiques of the Université de Montréal.

7.
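Nie Pengyu (聂鹏宇), Pan Weihua (潘玮华), Xu Yun (徐云). 《计算机系统应用》 (Computer Systems & Applications), 2013, 22(11): 165-170, 142
With the rapid development of next-generation sequencing, metagenomics has become a new research focus. The metagenomic sequence clustering problem is to separate, without a reference, metagenomic sequences that contain multiple species. To this end, a metagenomic species clustering algorithm that combines similarity information with structural information is proposed, and affinity propagation clustering (仿射聚类) is introduced to cluster sequences by species. Experimental data show that the method achieves high clustering accuracy and runs quickly. We have also developed metagenomic sequence species-clustering software based on this method.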

8.
Pop M., Salzberg S.L., Shumway M. Computer, 2002, 35(7): 47-54
Ultimately, genome sequencing seeks to provide an organism's complete DNA sequence. Automation of DNA sequencing allowed scientists to decode entire genomes and gave birth to genomics, the analytic and comparative study of genomes. Although genomes can include billions of nucleotides, the chemical reactions researchers use to decode the DNA are accurate for only about 600 to 700 nucleotides at a time. The DNA reads that sequencing produces must then be assembled into a complete picture of the genome. Errors and certain DNA characteristics complicate assembly. Resolving these problems entails an additional and costly finishing phase that involves extensive human intervention. Assembly programs can dramatically reduce this cost by taking into account additional information obtained during finishing. The paper considers how algorithms that can assemble millions of DNA fragments into gene sequences underlie the current revolution in biotechnology, helping researchers build the growing database of complete genomes.

9.
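Zhan Ke (詹科), Zhang Yunquan (张云泉), Wang Ting (王婷), Zheng Jingjing (郑晶晶), Zhang Peng (张鹏). 《计算机科学》 (Computer Science), 2015, 42(1): 90-91, 100
High-throughput sequencers produce large volumes of DNA data, and FASTQ is a widely used format for storing them. Compressing FASTQ data effectively saves storage space. The DSRC algorithm offers a high compression ratio, so parallelizing it can make compression of FASTQ-format DNA data more efficient. A parallel DSRC algorithm was implemented with Pthreads. Tests show a speedup of 3.5 with 4 threads.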

10.
Clustering is one of the major operations in analysing genome sequence data. Sophisticated sequencing technologies generate huge volumes of DNA sequence data; consequently, the complexity of analysing sequences has also increased, and there is an enormous need for faster sequence analysis algorithms. Most existing tools focus on alignment-based approaches, which are slow for sequence comparison. Alignment-free approaches are more successful for fast clustering. State-of-the-art methods have been applied to cluster small genome sequences of various species; however, they are sensitive to large sequences. To overcome this limitation, we propose a novel alignment-free method called DNA sequence clustering with map-reduce (DCMR). Initially, the MapReduce paradigm is used to speed up the extraction of eight different types of repeats. Then, the frequency of each type of repeat in a sequence is taken as a feature for clustering. Finally, K-means (DCMR-Kmeans) and K-median (DCMR-Kmedian) algorithms are used to cluster large DNA sequences using the extracted features. The two variants of the proposed method are evaluated by clustering large genome sequences of 21 different species, and the results show that the sequences are very well clustered. Our method is tested on different benchmark data sets, including viral genome, influenza A virus, mtDNA, and COXI data sets. The proposed method is compared with MeshClust, UCLUST, STARS, and ClustalW. DCMR-Kmeans outperforms MeshClust, UCLUST, and DCMR-Kmedian with respect to purity and NMI on virus data sets. The computational time of DCMR-Kmeans is less than that of STARS and DCMR-Kmedian, and much less than that of UCLUST, on the COXI data set.

11.
We present an approach to the gene identification phase of positional cloning that combines sparse sampling of DNA sequences from large genomic regions with computational analysis. We call the method "software trapping." The goal is to find coding exons while avoiding massive DNA sequence determination and contig assembly. Instead, rapid sequence sampling is combined with exon screening software such as a newly developed package called XPOUND to identify coding sequences. We have tested the approach using a set of model genomic sequences with known intron/exon structures as well as with bona fide P1 genomic clones. The results suggest that the strategy is a useful complement to other methods for finding genes in poorly characterized regions of genomes.

12.
This paper presents an implementation of steganography using DNA molecules. We first encode a plaintext message into a DNA sequence using a randomly generated single-substitution key. An oligonucleotide containing the encoded message, designated the message strand, is synthesized and mixed with a large amount of background DNA. To retrieve the message, the intended recipient must know the sequences of two primers that anneal to target regions present on the message strand. Polymerase chain reaction (PCR) and sequencing are used to retrieve the encoded sequence, which is decoded into the original plaintext via the single-substitution key. This study shows that the steganographically hidden message can be retrieved only by using the two secret primers, meaning that the only applicable cryptanalytic approach is a brute-force search for the two primer sequences. Since each primer can have 4^20 different possible sequences, the amount of time required to crack DNA-based steganography is long enough to qualify the technique as essentially unbreakable.
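
The encoding step and the size of the primer search space in abstract 12 can be sketched as follows. The abstract does not give the actual substitution table, so the three-base codon per character, the 39-character alphabet, and the names `make_key`, `encode`, and `decode` are assumptions for illustration; the primer length of 20 is inferred from the count of possible sequences per primer.

```python
import itertools
import random
import string

BASES = "ACGT"
# Hypothetical plaintext alphabet (39 symbols); 64 codons are available.
ALPHABET = string.ascii_uppercase + string.digits + " .,"


def make_key(seed=None):
    """Randomly assign one distinct 3-base codon to each plaintext symbol."""
    codons = ["".join(p) for p in itertools.product(BASES, repeat=3)]
    rng = random.Random(seed)
    rng.shuffle(codons)
    return dict(zip(ALPHABET, codons))


def encode(message, key):
    """Plaintext -> DNA string, one codon per character."""
    return "".join(key[ch] for ch in message.upper())


def decode(dna, key):
    """DNA string -> plaintext, using the inverse of the same key."""
    inverse = {codon: ch for ch, codon in key.items()}
    return "".join(inverse[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Search space for one 20-base primer, and for the pair of primers.
PRIMER_LENGTH = 20
primer_space = 4 ** PRIMER_LENGTH
pair_space = primer_space ** 2

if __name__ == "__main__":
    key = make_key(seed=1)
    dna = encode("HELLO WORLD", key)
    print(dna, decode(dna, key), primer_space)
```

The substitution key only obscures the text; in the scheme described, the security argument rests on the two secret primers, which is why the sketch computes the primer search space separately.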

13.
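This paper arises from the engineering computation problem of sequencing DNA by hybridization. In the sequencing-by-hybridization (SHB) problem, one tries to learn an entire original string S by first determining the length-k substrings that occur in a very long DNA string S, and then reconstructs S by studying the overlap patterns of those substrings. The paper converts the SHB problem into a concrete graph-theoretic problem. Using the relationship between a graph and its line graph, it partially solves an equivalent form of the problem: when the in-degree and out-degree of every vertex of the directed line graph are at most 2, traversal theory (遍历理论) is used to build a mathematical model of the SHB problem, so that a Hamiltonian path or cycle of the directed line graph can be found in polynomial time. The paper closes by pointing out problems that need further study.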
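
The graph formulation in abstract 13 can be made concrete with a standard textbook construction, which is not necessarily the paper's own model. If each k-mer is treated as an edge from its (k-1)-prefix to its (k-1)-suffix, a Hamiltonian path through the k-mers (the line digraph) corresponds to an Eulerian path through the (k-1)-mers, and the latter can be found in linear time with Hierholzer's algorithm. The function name `reconstruct_from_spectrum` is hypothetical, and the sketch assumes an error-free spectrum.

```python
from collections import Counter, defaultdict


def reconstruct_from_spectrum(kmers):
    """Rebuild one string whose k-mer multiset equals `kmers` (Eulerian path)."""
    if not kmers:
        raise ValueError("empty spectrum")
    graph = defaultdict(list)          # (k-1)-mer -> list of successor (k-1)-mers
    out_deg, in_deg = Counter(), Counter()
    for kmer in kmers:
        u, v = kmer[:-1], kmer[1:]
        graph[u].append(v)
        out_deg[u] += 1
        in_deg[v] += 1

    # Start at the node with one more outgoing than incoming edge, if any;
    # otherwise the path is a cycle and any node with an edge will do.
    nodes = set(out_deg) | set(in_deg)
    starts = [n for n in nodes if out_deg[n] - in_deg[n] == 1]
    start = starts[0] if starts else kmers[0][:-1]

    # Hierholzer's algorithm with an explicit stack.
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if graph[u]:
            stack.append(graph[u].pop())
        else:
            path.append(stack.pop())
    path.reverse()

    if len(path) != len(kmers) + 1:
        raise ValueError("spectrum does not admit a single Eulerian path")
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    spectrum = ["GTA", "ATG", "CGT", "TAC", "GCG", "TGC"]
    print(reconstruct_from_spectrum(spectrum))
```

When some (k-1)-mer repeats, several Eulerian paths may exist and the sketch returns only one of them; that ambiguity is the reason additional constraints or information are studied for this problem.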

14.
This technical paper describes the utilization of a new automated liquid handler from Beckman Coulter, Inc., the Biomek® NX Laboratory Automation Workstation, for genomic and proteomic applications. For genomic applications, methodology for plasmid DNA purification using Promega Wizard® SV 96 reagents was developed for the Biomek NX. A single plate of bacterial pellets can be processed to purified plasmid DNA without user interaction after initial setup. DNA quantity and quality were assessed by spectrophotometric analysis, restriction digestion, PCR (The PCR process is covered by patents owned by Roche Molecular Systems, Inc., and F. Hoffmann-La Roche, Ltd.), and capillary sequencing. Additionally, the plasmid preparation method was used to purify plasmid DNA from bacterial clones isolated in a bacterial two-hybrid screening procedure. In this case, the system quickly and efficiently prepared clones for rapid identification of target sequences. For proteomic applications, His-tag proteins were purified from bacterial cultures in a 96-well plate format. Following purification, a Bradford assay was used to determine the quantitative yields of the His-tag protein products in each of the aliquots from the purified samples. The AD 340 Automated Labware Positioner (ALP), an integrated absorbance reader, was used for absorbance measurements in the Bradford assay. Given the placement of this ALP on the deck of the Biomek NX, the entire process of protein purification and quantitation was performed in a complete walk-away automated format. Results obtained when purifying proteins, from both uninduced and induced bacterial cultures, on the worksurface of the Biomek NX will be described.

15.
The classical DNA sequencing by hybridization (SBH) uses binary information about oligonucleotide presence in an analyzed DNA sequence: a given oligonucleotide either is or is not a part of the sequence. However, the development of DNA chip technology makes it possible to take into consideration some information about repetitions in the target sequence. Currently, it is not possible to determine exact data of this type, but even partial multiplicity information should be very useful. In this paper two simple but realistic multiplicity information models are taken into account. The first one assumes that it is known whether a given oligonucleotide occurs in the analyzed sequence once or more than once. According to the second model, it is possible to determine whether a given oligonucleotide appears in the target sequence once, twice, or at least three times. A tabu search algorithm has been implemented to verify these models. It solves the problem with any kind of hybridization errors. Computational experiment results confirm that the additional information leads to an improvement of the reconstruction process. They also show that the more precise model of information increases the quality of the obtained solutions. Test data sets and the implemented tabu search algorithm are available at: http://bio.cs.put.poznan.pl/files/52234a7c9dfb89b808000001/download.

16.
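Deriving vector representations of nodes from the connections in a network and applying them in a recommendation algorithm can effectively improve its modeling capacity. For homogeneous networks in recommender systems, a network representation learning recommendation algorithm combined with random walks is proposed. Building on DeepWalk, the algorithm sets the number of walk sequences per node according to node importance and uses a termination probability to control walk length, which improves the sampling results. During representation learning, node attribute information is fused into the SkipGram model, and the distance of each context node from the center node is taken into account, yielding more accurate recommendations. Experimental results show that the algorithm achieves higher recommendation accuracy than DeepWalk, Node2vec, and similar algorithms, and handles the cold-start problem well.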

17.
I have designed a Macintosh data management system for molecular biologists. This system, called DataMinder, can be used to store information about oligonucleotides, nucleic acid or protein sequences, recombinant DNA clones, cells, reagents and protocols. DataMinder is not limited to data storage. A number of utilities for data analysis are provided, including those for the evaluation of oligonucleotides for use as hybridization probes or primers for DNA synthesis, and a variety of sequence editing features. Context-sensitive help is available on-line. DataMinder is simple to use and to customize and allows for sharing of database information across a computer network.

18.
An integrated family of amino acid sequence analysis programs (total citations: 6; self-citations: 0; citations by others: 6)
In recent years abundant sequence data has become available due to the rapid progress in protein and DNA sequencing techniques. The exact three-dimensional structures, however, are available only for a fraction of proteins with known sequences. For many purposes the primary amino acid sequence of a protein can be directly used to predict important structural parameters. However, mathematical presentation of the calculated values often makes interpretation difficult, especially if many proteins must be analysed and compared. Here we introduce a broad-based, user-defined analysis of amino acid sequence information. The program package is based on published algorithms and is designed to access standard protein data bases, calculate hydropathy, surface probability and flexibility values and perform secondary structure predictions. The data output is in an 'easy-to-read' graphic format and several parameters can be superimposed within a single plot in order to simplify data interpretations. Additionally, this package includes a novel algorithm for the prediction of potential antigenic sites. Thus the software package presented here offers a powerful means of analysing an amino acid sequence for the purpose of structure/function studies as well as antigenic site analyses. These algorithms were written to function in context with the UWGCG (University of Wisconsin Genetics Computer Group) program collection, and are now distributed within that package.

19.
Finding the genes that exist within a DNA sequence and assigning them biological features and functions is one of the biggest challenges of Genomics. This task, called annotation, has to be as accurate and reliable as possible, because this information will be applied in other research. Ideally, each sequence should be annotated and validated by a human expert, who has the knowledge to infer the most appropriate annotation. Nevertheless, the huge amount of genomic data produced by the new sequencing technologies prevents this practice. Developing expert systems that are able to annotate sequences automatically and emulate the expert involvement in certain key points of the process would enhance the annotation quality. In this work, the CommonKADS methodology is innovatively applied for this purpose. It is used to structure and model the knowledge required to build an expert system able to deal with the functional part of sequence annotation, i.e. establishing the biological purpose of the sequence. This approach provides the first general framework for the aforementioned problem, which can be easily extended to related issues.

20.
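A SNP (single nucleotide polymorphism) is a variation at a single nucleotide base in a DNA sequence and is the most common type of heritable variation in organisms. The ED algorithm and the SNP-index algorithm are two commonly used algorithms for computing SNP sites. Whole-genome sequencing data for the Arabidopsis thaliana F2 generation were obtained by high-throughput sequencing; the data were filtered, screened, and aligned on a Linux platform, the algorithms were run, and the number of SNP sites and the proportions of SNP genotypes detected by each algorithm were compared. The experimental results show that the ED algorithm finds more SNP sites, more widely distributed and with a higher relative distribution density than the SNP-index algorithm, but that the number of SNP sites and the SNP genotype proportions obtained by the two algorithms are similar.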
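
Abstract 20 names the two statistics without defining them. The sketch below uses the definitions commonly given in bulked-segregant analysis, which is an assumption about what the paper computes: the SNP-index at a site is the fraction of reads carrying the non-reference allele, and ED is the Euclidean distance between the base-frequency vectors of two pools at the same site. The function names and the dictionary layout are hypothetical.

```python
import math


def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("site has no read coverage")
    return alt_reads / total_reads


def euclidean_distance(freq_a, freq_b):
    """ED between two pools' base frequencies at one site.

    Each argument maps 'A', 'C', 'G', 'T' to that base's frequency in the pool.
    """
    return math.sqrt(sum((freq_a[b] - freq_b[b]) ** 2 for b in "ACGT"))


if __name__ == "__main__":
    # Hypothetical read counts and frequencies, for illustration only.
    print(snp_index(15, 20))
    pool_1 = {"A": 0.9, "C": 0.1, "G": 0.0, "T": 0.0}
    pool_2 = {"A": 0.2, "C": 0.8, "G": 0.0, "T": 0.0}
    print(euclidean_distance(pool_1, pool_2))
```

Under these definitions the SNP-index depends on a reference allele, whereas ED compares the two pools directly, which is one plausible reason the two methods report different numbers of sites.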


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号