A new method, 'algorithmic significance', is proposed as a tool for discovery of patterns in DNA sequences. The main idea is that patterns can be discovered by finding ways to encode the observed data concisely. In this sense, the method can be viewed as a formal version of the Occam's Razor principle. In this paper the method is applied to discover significantly simple DNA sequences. We define DNA sequences to be simple if they contain repeated occurrences of certain 'words' and thus can be encoded in a small number of bits. This definition includes minisatellites and microsatellites. A standard dynamic programming algorithm for data compression is applied to compute the minimal encoding lengths of sequences in linear time. An electronic mail server for identification of simple sequences based on the proposed method has been installed at the Internet address pythia/anl.gov.
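
The encoding idea in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the cost model (2 bits per literal base, a fixed-size pointer for copying an earlier word) and the names `min_encoding_bits` and `significance_bound` are assumptions made for illustration, and this version runs in polynomial rather than linear time. The bound 2^-d on the chance of saving d bits under the null model is assumed here as the usual statement of the algorithmic significance result.

```python
import math


def min_encoding_bits(seq):
    """Shortest encoding of seq, in bits, under a toy cost model (assumed).

    Each base is either written literally (2 bits) or a run of bases is
    copied from an earlier part of the sequence for a fixed pointer cost.
    """
    n = len(seq)
    if n == 0:
        return 0
    # Hypothetical pointer cost: one field for position, one for length.
    pointer_bits = 2 * max(1, math.ceil(math.log2(n)))
    cost = [float("inf")] * (n + 1)
    cost[0] = 0
    for i in range(n):
        # Option 1: emit base i as a literal.
        cost[i + 1] = min(cost[i + 1], cost[i] + 2)
        # Option 2: copy any word starting at i that already occurs in seq[:i].
        length = 1
        while i + length <= n and seq[i:i + length] in seq[:i]:
            cost[i + length] = min(cost[i + length], cost[i] + pointer_bits)
            length += 1
    return cost[n]


def significance_bound(seq):
    """Upper bound 2^-d on the null probability of saving d bits (assumed form)."""
    saved = 2 * len(seq) - min_encoding_bits(seq)
    return min(1.0, 2.0 ** (-saved))
```

A sequence with no repeated words, such as `"ACGT"`, costs the full 2 bits per base, while a microsatellite-like repeat such as `"AC" * 16` encodes in fewer bits and therefore receives a small bound.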

We describe a new computer algorithm for finding low-energy conformations of proteins. It is a chain-growth method that uses a heuristic bias function to help assemble a hydrophobic core. We call it the Core-directed chain Growth method (CG). We test the CG method on several well-known literature examples of HP lattice model proteins [in which proteins are modeled as sequences of hydrophobic (H) and polar (P) monomers], ranging from 20-64 monomers in two dimensions, and up to 88-mers in three dimensions. Previous nonexhaustive methods--Monte Carlo, a Genetic Algorithm, Hydrophobic Zippers, and Contact Interactions--have been tried on these same model sequences. CG is substantially better at finding the global optima, and avoiding local optima, and it does so in comparable or shorter times. CG finds the global minimum energy of the longest HP lattice model chain for which the global optimum is known, a 3D 88-mer that has only been reachable before by the CHCC complete search method. CG has the potential advantage that it should have nonexponential scaling with chain length. We believe this is a promising method for conformational searching in protein folding algorithms.  相似文献   
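
For readers unfamiliar with the HP lattice model mentioned above, the following sketch shows how the energy of a single two-dimensional conformation is usually scored: -1 for every pair of H monomers that are lattice neighbours but not chain neighbours. The function name `hp_energy` and the coordinate-list input format are illustrative assumptions; the sketch only evaluates a given conformation and does not implement the CG search itself.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D HP lattice conformation.

    sequence: string of 'H' and 'P'; coords: list of (x, y) lattice points,
    one per monomer, in chain order. Both formats are assumptions of this sketch.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive monomers must be lattice neighbours")

    index_at = {point: i for i, point in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only in the +x and +y directions so each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy
```

Folding the 4-mer `"HPPH"` into a unit square brings its two H monomers into contact (energy -1), whereas the fully extended chain has energy 0.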

Pleural and pulmonary malignancies are usually associated with well-known carcinogen exposure. Recently, the presence of simian virus 40 (SV40)-like DNA sequences has been detected in brain and bone-related human cancers and in pleural mesothelioma. In order to determine whether SV40-like DNA sequences are also present in bronchopulmonary carcinoma and non-malignant lung samples, 125 frozen pleural and pulmonary samples (including 21 mesotheliomas, 63 bronchopulmonary carcinomas, 8 other tumours, and 33 non-malignant samples) and 38 additional samples distant from tumours were studied for the occurrence of SV40-like DNA sequences by polymerase chain reaction (PCR) amplification followed by hybridization with specific probes. Sequences related to SV40 large T antigen (Tag) were present in 28.6 per cent of bronchopulmonary carcinomas, 47.6 per cent of mesotheliomas, and 16.0 per cent of cases with non-neoplastic pleural and pulmonary disease. No statistically significant difference in the occurrence of these DNA sequences was found between malignant mesothelioma and bronchopulmonary carcinoma, but a significantly higher number of mesothelioma cases exhibited SV40-like DNA sequences in comparison with cases of non-malignant pleural or pulmonary disease (P < 0.04). Among cases positive for SV40-like DNA sequences, a history of asbestos exposure was found in 3 out of 12 bronchopulmonary carcinomas and 8 out of 10 mesotheliomas. Immunohistochemistry using monoclonal antibodies directed against Tag did not demonstrate nuclear staining. The DNA sequences were not related to BK virus sequences, but three samples were positive with probes hybridizing with JC virus DNA sequences. In conclusion, this study demonstrates the presence of SV40-like DNA sequences in pulmonary neoplasms and in non-malignant lung tissues. It appears that the presence of SV40-like DNA is not unique to cancer.  相似文献   

INTRODUCTION: The most common fat-suppressed sequence used to study skeletal conditions is the STIR sequence, which has shown high sensitivity in the detection of skeletal lesions and whose main drawback is its long acquisition time. Currently, Turbo-STIR (T-STIR) sequences can shorten the acquisition time. The purpose of this study was therefore to compare the diagnostic yield of the conventional STIR sequence with that of the new T-STIR sequence in the study of skeletal conditions. MATERIAL AND METHODS: Twenty patients with different types of skeletal lesions were examined. MR examinations were performed with a Philips Gyroscan S15/ACS II unit (1.5 T). All the patients underwent a STIR sequence (TR/TE = 1500/20, TI = 180 ms, matrix = 204 x 256, NEX = 2, slice thickness = 5 mm, acquisition time = 9 min 24 s) and a T-STIR sequence (TR/TE = 1500/20, TI = 180 ms, matrix = 204 x 256, NEX = 2, slice thickness = 5 mm, TFL = 3, acquisition time = 3 min 33 s). The images were evaluated by measuring both quantitative parameters--percent contrast (%C), contrast to noise ratio (C/N), signal to noise ratio (S/N)--and qualitative parameters--lesion conspicuity, margins and extension, motion artifacts, image quality. RESULTS: The only statistically significant difference between the two sequences was image quality, which was superior in the conventional STIR sequence (p < .05). No statistically significant difference was demonstrated with the quantitative evaluation. DISCUSSION: In this study, T-STIR sequences were performed with a low-high acquisition profile to obtain an actual echo time of 20 ms, which yields optimal S/N with good spatial resolution. Therefore, T-STIR sequences with a low-high acquisition profile provide better results than T-STIR sequences with a linear acquisition profile, which yields an actual echo time of 40 ms. CONCLUSION: This work shows that T-STIR sequences can replace conventional STIR sequences in the study of skeletal conditions, reducing the acquisition time by 60%. This result can be obtained only by an accurate optimization of acquisition parameters.

An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences--each of length larger than 512 base pairs (bp)--in the present release of the GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta=0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05), which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and we find that it is less than 10^-10. We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
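
The spectral exponent beta referred to above describes a power spectrum of the form S(f) ~ 1/f^beta, so it can be estimated as the negated slope of log S(f) against log f. The sketch below is only a minimal illustration: it uses a naive O(n^2) discrete Fourier transform instead of an FFT, fits over all frequencies rather than a chosen window such as 10 bp to 100 bp, and the function names `power_spectrum` and `spectral_exponent` are hypothetical.

```python
import cmath
import math


def power_spectrum(values):
    """Return (frequency, power) pairs for k = 1 .. n//2 using a naive DFT."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]  # remove the zero-frequency component
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        spectrum.append((k / n, abs(coeff) ** 2 / n))
    return spectrum


def spectral_exponent(spectrum):
    """Least-squares estimate of beta in S(f) ~ 1/f**beta on a log-log scale."""
    points = [(math.log(f), math.log(p)) for f, p in spectrum if p > 0]
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx  # beta is the negated slope
```

In practice the input `values` would be a numeric encoding of the nucleotide sequence (for example +1/-1 for pyrimidine/purine), and the fit would be restricted to the frequency range of interest.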

Given a strong match between regions of two sequences, how far can the match be meaningfully extended if gaps are allowed in the resulting alignment? The aim is to avoid searching beyond the point that a useful extension of the alignment is likely to be found. Without loss of generality, we can restrict attention to the suffixes of the sequences that follow the strong match, which leads to the following formal problem. Given two sequences and a fixed X > 0, align initial portions of the sequences subject to the constraint that no section of the alignment scores below -X. Our results indicate that computing an optimal alignment under this constraint is very expensive. However, less rigorous conditions on the alignment can be guaranteed by quite efficient algorithms. One of these variants has been implemented in a new release of the Blast suite of database search programs.  相似文献   
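
The stopping rule described in this abstract can be illustrated with a deliberately simplified, ungapped analogue: extend the two suffixes position by position and stop once the running score has fallen more than X below the best score seen so far. The gapped algorithms that the abstract actually studies are more involved; the function name `xdrop_extend` and the +1/-1 scoring scheme are assumptions of this sketch.

```python
def xdrop_extend(a, b, x, match=1, mismatch=-1):
    """Ungapped X-drop extension of two suffixes (simplified illustration).

    Returns (length, score) of the best-scoring extension found before the
    running score dropped more than x below the best score seen so far.
    """
    best_score = 0
    best_length = 0
    score = 0
    for k in range(min(len(a), len(b))):
        score += match if a[k] == b[k] else mismatch
        if score > best_score:
            best_score, best_length = score, k + 1
        elif best_score - score > x:
            break  # further extension is unlikely to pay off
    return best_length, best_score
```

For example, `xdrop_extend("ACGTACGTTTTT", "ACGTACGTAAAA", 2)` stops three positions into the mismatching tail and reports the eight matching bases as the best extension.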

Heterotopic ossification (HO) after femoral intramedullary rodding is a significant complication of the procedure. One hundred eighteen cases of femoral roddings performed on 113 patients were available for review. The data were computerized and evaluated using univariate analysis and multivariate regression analysis. A statistically significant increase of HO was found with male gender, increased delay to surgery, and in patients requiring prolonged intubation because of their multiple injuries. HO was classified using a modified version of the method of Brumback et al. (grades 0-IV). A strong correlation of HO with brain injury documented by computed tomography scan was also found to be statistically significant for the more severe grades of HO. This group of patients had not previously been identified as being at high risk for HO.  相似文献   

A group of algorithms has been developed to investigate the characteristics of beat-to-beat intervals preceding and following the onset and termination of repeated pattern ventricular arrhythmias (RPVA) such as bigeminy and trigeminy. Eighty-five patients, each with more than 3000 ventricular ectopic beats in a 24-hour Holter recording and with more than 10 episodes of RPVA, were evaluated. A statistically significant prolongation of sinus intervals preceding the onset of bigeminy and trigeminy and shortening of postectopic intervals after the onset were observed. In addition, shortening of postectopic intervals before the termination of bigeminy and trigeminy and lengthening of sinus intervals following their termination were also seen. A significant presence of these characteristics was not observed in arrhythmias with a greater number of sinus beats between ectopic beats. These dynamics provide information which may be utilized in the assessment of mechanisms involved in the onset and termination of RPVA.  相似文献   

MOTIVATION: We have previously reported an algorithm for discovering patterns conserved in sets of related unaligned protein sequences. The algorithm was implemented in a program called Pratt. Pratt allows the user to define a class of patterns (e.g. the degree of ambiguity allowed and the length and number of gaps), and is then guaranteed to find the conserved patterns in this class scoring highest according to a defined fitness measure. In many cases, this version of Pratt was very efficient, but in other cases it was too time consuming to be applied. Hence, a more efficient algorithm was needed. RESULTS: In this paper, we describe a new and improved searching strategy that has two main advantages over the old strategy. First, it allows for easier integration with programs for multiple sequence alignment and data base search. Secondly, it makes it possible to use branch-and-bound search, and heuristics, to speed up the search. The new search strategy has been implemented in a new version of the Pratt program.  相似文献   

Computer simulations of simple exact lattice models are an aid in the study of the protein folding process; they have sometimes yielded predictions that were later confirmed experimentally. The contact interactions (CI) method is proposed here as a new algorithm for the conformational search in the low-energy regions of protein chains modeled as copolymers of hydrophobic and polar monomers configured as self-avoiding walks on square or cubic lattices. It may be regarded as an extension of the standard Monte Carlo method improved by the concept of cooperativity deriving from nonlocal contact interactions. A major difference with respect to other algorithms is that criteria for the acceptance of new conformations generated during the simulations are not based on the energy of the entire molecule; instead, cooling factors associated with each residue define regions of the model protein with higher or lower mobility. Nine sequences of length ranging from 20 to 64 residues were used on the square lattice and 15 sequences of length ranging from 46 to 136 residues were used on the cubic lattice. The CI algorithm proved very efficient in both two and three dimensions, and allowed us to locate energy minima not found by other searching algorithms described in the literature. Use of this algorithm is not limited to the conformational search, because it allows the exploration of thermodynamic and kinetic behavior of model protein chains.

The SRICOS method was proposed in 1999 to predict the scour depth versus time curve at a cylindrical bridge pier for a constant velocity flow, a uniform soil, and a deep-water condition. In this article, the method is extended to include a random velocity-time history and a multilayer soil stratigraphy; it is called the Extended-SRICOS or E-SRICOS. The algorithms to accumulate the effects of different velocities and to sequence through a series of soil layers are described. The procedure followed by the computer program to step into time is outlined. A simplified version of E-SRICOS called S-SRICOS is also presented; calculations for the S-SRICOS method can easily be done by hand. Eight bridges in Texas are used as case histories to compare predictions by the two new methods (E-SRICOS and S-SRICOS) with measurements at the bridge sites.  相似文献   

PURPOSE: This study was completed to determine the current knowledge and documentation patterns of nursing staff in the prevention of pressure ulcers and to identify the prevalence of pressure ulcers. METHODS: This pre-post intervention study was carried out in three phases. In phase I, 67 nursing staff members completed a modified version of Bostrom's Patient Skin Integrity Survey. A Braden Scale score, the presence of actual skin breakdown, and the presence of nursing documentation were collected for each patient (n = 43). Phase II consisted of a 20-minute educational session for all staff. In phase III, 51 nursing staff completed a second questionnaire similar to that completed in phase I. Patient data (n = 49) were again collected using the same procedure as phase I. RESULTS: Twenty-seven staff members completed questionnaires in both phase I and phase III of the study. No statistically significant differences were found in the knowledge of the staff before or after the educational session. The number of patients with a documented plan of care showed a statistically significant difference from phase I to phase III. The number of patients with pressure ulcers or at risk for pressure ulcer development (determined by a Braden Scale score of 16 or less) did not differ statistically from phase I to phase III. CONCLUSION: Knowledge about pressure ulcers in this sample of staff nurses was for the most part current and consistent with the recommendations in the Agency for Health Care Policy and Research guideline. Documentation of pressure ulcer prevention and treatment improved after the educational session. Although a significant change was noted in documentation, it is unclear whether it reflected an actual change in practice.

A method is described for searching protein sequence databases using tandem mass spectra of tryptic peptides. The approach uses a de novo sequencing algorithm to derive a short list of possible sequence candidates which serve as query sequences in a subsequent homology-based database search routine. The sequencing algorithm employs a graph theory approach similar to previously described sequencing programs. In addition, amino acid composition, peptide sequence tags and incomplete or ambiguous Edman sequence data can be used to aid in the sequence determinations. Although sequencing of peptides from tandem mass spectra is possible, one of the frequently encountered difficulties is that several alternative sequences can be deduced from one spectrum. Most of the alternative sequences, however, are sufficiently similar for a homology-based sequence database search to be possible. Unfortunately, the available protein sequence database search algorithms (e.g. Blast or FASTA) require a single unambiguous sequence as input. Here we describe how the publicly available FASTA computer program was modified in order to search protein databases more effectively in spite of the ambiguities intrinsic in de novo peptide sequencing algorithms.  相似文献   

The relationship between the hidden periodicities in DNA sequences and the nucleosome units is investigated. It is shown that in the vicinity of lengths of about 200 bases there are statistically significant periodicities which remain approximately universal for exon-intron sequences both in the different genes and the different eukaryotic species. The additional analysis displays, nevertheless, that these approximately coincident universal periodicities can be generated by a variety of mechanisms. The relevance of the features observed to the structure of chromatin is discussed.  相似文献   

A new symmetric-iterative method for multiple alignment of protein sequences is presented. The method can be described as a combination of motif finding and dynamic programming procedures. It uses each sequence as a standard to which all sequences are aligned based on the significant segment pair alignment (SSPA) protocol. Sequences are further matched using a reduced scoring threshold to provide fillers and extensions between highly significant segment pair matches. The method produces alignment blocks that accommodate indels and are separated by variable-length unaligned segments. Construction of consensus sequences is iterative, assigning greater weights to more distantly related sequences. A consensus sequence and various measures of conservation at each aligned position can be used for comparisons between protein families, for data base searches, and for analysis of functional and evolutionary features. The method is illustrated on the extended family of prokaryotic and eukaryotic RecA-like sequences. The RecA-like sequences reveal extended alignments among eubacterial RecA and separately among eukaryotic/archaebacterial Rad51/RadA. Eleven conserved blocks are common to both groups, two of them encompassing the ATP-binding A and B-sites. Among the most conserved positions are glycine residues. For example, they occur twice as doublets putatively serving as hinge connections that provide opportunity for alternative structural conformations. Also several charged/polar residues are highly conserved, probably consequent upon the extensive intermonomer interactions in RecA/Rad51 filament formation and possibly relevant protein-protein and protein-nucleic acid interactions.  相似文献   

The National Center for Biotechnology Information (NCBI), part of the National Library of Medicine, was established in 1988 to perform basic research in the field of computational molecular biology as well as build and distribute molecular biology databases. The basic research has led to new algorithms and analysis tools for interpreting genomic data and has been instrumental in the discovery of human disease genes for neurofibromatosis and Kallmann syndrome. The principal database responsibility is the National Institutes of Health (NIH) genetic sequence database, GenBank. NCBI, in collaboration with international partners, builds, distributes, and provides online and CD-ROM access to over 112,000 DNA sequences. Another major program is the integration of multiple sequence databases and related bibliographic information and the development of network-based retrieval systems for Internet access.

Mapping nucleotide sequences onto a "DNA walk" produces a novel representation of DNA that can then be studied quantitatively using techniques derived from fractal landscape analysis. We used this method to analyze 11 complete genomic and cDNA myosin heavy chain (MHC) sequences belonging to 8 different species. Our analysis suggests an increase in fractal complexity for MHC genes over the course of evolution, with vertebrate > invertebrate > yeast. The increase in complexity is measured by the presence of long-range power-law correlations, which are quantified by the scaling exponent alpha. We develop a simple iterative model, based on known properties of polymeric sequences, that generates long-range nucleotide correlations from an initially noncorrelated coding region. Both this new model and the DNA walk analysis support the intron-late theory of gene evolution.
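
A "DNA walk" is simply a cumulative sum of per-base steps. The sketch below assumes the common purine/pyrimidine rule (step up for C or T, step down for A or G); the abstract does not state which mapping was used, so the rule, the function name `dna_walk`, and the choice to skip ambiguous symbols are all assumptions.

```python
def dna_walk(seq):
    """Cumulative displacement of a one-dimensional DNA walk.

    Assumed rule: pyrimidines (C, T) step +1, purines (A, G) step -1.
    Symbols outside ACGT (for example N) are skipped and contribute no step.
    """
    step = {"C": 1, "T": 1, "A": -1, "G": -1}
    position = 0
    walk = []
    for base in seq.upper():
        if base not in step:
            continue
        position += step[base]
        walk.append(position)
    return walk
```

The resulting list of displacements is the landscape to which fluctuation measures, and hence a scaling exponent such as alpha, would then be applied.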

In this paper, we study numerically the two- and three-dimensional nonlinear dynamic response of a chain hanging under its own weight. Previous authors have employed the box method, a finite-difference scheme popular in cable dynamics problems, for this purpose. The box method has significant stability problems, however, and thus is not well suited to this highly nonlinear problem. We illustrate these stability problems and propose a new time integration procedure based on the generalized-α method. The new method exhibits superior stability properties compared to the box method and other algorithms such as backward differences and trapezoidal rule. Of four time integration methods tested, the generalized-α algorithm was the only method that produced a stable solution for the three-dimensional whirling motions of a hanging chain driven by harmonic linear horizontal motion at the top.  相似文献   

We have developed a rapid and highly sensitive method for the detection of a mutant K-ras codon 12 allele in the presence of 10^5 copies of the wild-type allele. This sensitivity is achieved by selective amplification of mutant K-ras sequences, using a two-stage procedure with modified primers. In the first stage, primers consist of K-ras sequences in the 3' portion and polyomavirus sequence (to minimize homology with the human genome) in the 5' portion. The 3' portion also contains a mismatch sequence that generates an MvaI site in normal, but not mutant, K-ras codon 12 alleles. Thus, following the first round of 20 cycles, restriction enzyme cleavage is carried out to selectively digest normal K-ras codon 12 alleles. To enrich mutant alleles, a second amplification is performed using tail primers that recognize the polyoma, but not human, sequences. This design ensures that in the second amplification only mutant alleles that were pre-amplified in the first round serve as template for this reaction. Ethidium bromide-stained polyacrylamide gel electrophoresis (PAGE) of second-stage PCR product that has been digested with MvaI is used to monitor the presence of mutant alleles, detected at a sensitivity of 1 in 10^5. This technique offers highly sensitive detection of mutant K-ras alleles using a new concept of tail-primer design and is likely to assist in identifying patients at risk of developing pancreatic, colon, or lung cancer, which harbor a high incidence of mutant ras alleles.

SiMultaneous Acquisition of Spatial Harmonics (SMASH) is a new fast-imaging technique that increases MR image acquisition speed by an integer factor over existing fast-imaging methods, without significant sacrifices in spatial resolution or signal-to-noise ratio. Image acquisition time is reduced by exploiting spatial information inherent in the geometry of a surface coil array to substitute for some of the phase encoding usually produced by magnetic field gradients. This allows for partially parallel image acquisitions using many of the existing fast-imaging sequences. Unlike the data combination algorithms of prior proposals for parallel imaging, SMASH reconstruction involves a small set of MR signal combinations prior to Fourier transformation, which can be advantageous for artifact handling and practical implementation. A twofold savings in image acquisition time is demonstrated here using commercial phased array coils on two different MR-imaging systems. Larger time savings factors can be expected for appropriate coil designs.  相似文献   
