Similar literature
20 similar articles found (search time: 15 ms)
1.
We propose a generalized Lévy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences, such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Lévy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Lévy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
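
The power-law segment lengths and the relation $\alpha = 2 - \mu/2$ in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the continuous power law with lower cutoff `l_min`, the per-segment `bias` probability, and the seed are all assumptions made for illustration.

```python
import random


def predicted_alpha(mu):
    """Correlation exponent from the abstract's relation, stated for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("alpha = 2 - mu/2 is stated only for 2 < mu < 3")
    return 2 - mu / 2


def sample_segment_length(mu, rng, l_min=1.0):
    """Draw l from a continuous power law P(l) ~ l**(-mu), l >= l_min,
    by inverse-transform sampling (l_min is an assumed cutoff)."""
    u = rng.random()  # uniform in [0, 1), so 1 - u is in (0, 1]
    return l_min * (1.0 - u) ** (-1.0 / (mu - 1.0))


def levy_walk(n_steps, mu, bias=0.6, seed=0):
    """Concatenate biased random-walk segments whose lengths follow the power law.

    Within a segment, each +/-1 step follows the segment's direction with
    probability `bias` (an assumed value, not taken from the abstract).
    """
    rng = random.Random(seed)
    position, path = 0, []
    while len(path) < n_steps:
        length = max(1, int(sample_segment_length(mu, rng)))
        direction = rng.choice((-1, 1))
        for _ in range(min(length, n_steps - len(path))):
            step = direction if rng.random() < bias else -direction
            position += step
            path.append(position)
    return path


path = levy_walk(1000, mu=2.5)
alpha = predicted_alpha(2.5)  # 2 - 2.5/2 = 0.75
```

Setting `bias=1.0` would make every segment a straight run; values closer to 0.5 weaken the strand bias inside each segment.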

2.
An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences, each longer than 512 base pairs (bp), in the present release of GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent $\beta = 0.00 \pm 0.04$, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent $\beta$ is positive ($0.16 \pm 0.05$), which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and find that it is less than $10^{-10}$. We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases confidence in the reliability of our conclusion.

3.
Mapping nucleotide sequences onto a "DNA walk" produces a novel representation of DNA that can then be studied quantitatively using techniques derived from fractal landscape analysis. We used this method to analyze 11 complete genomic and cDNA myosin heavy chain (MHC) sequences belonging to 8 different species. Our analysis suggests an increase in fractal complexity for MHC genes with evolution, in the order vertebrate > invertebrate > yeast. The increase in complexity is measured by the presence of long-range power-law correlations, which are quantified by the scaling exponent $\alpha$. We develop a simple iterative model, based on known properties of polymeric sequences, that generates long-range nucleotide correlations from an initially noncorrelated coding region. Both this new model and the DNA walk analysis support the intron-late theory of gene evolution.
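
The "DNA walk" mapping described above can be sketched in a few lines. The abstract does not state the step rule, so the purine/pyrimidine rule used here (pyrimidine steps up, purine steps down) is an assumption, as are the names `STEP` and `dna_walk`.

```python
from itertools import accumulate

# Assumed step rule (not given in the abstract):
# pyrimidines (C, T) step +1, purines (A, G) step -1.
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(sequence):
    """Return the cumulative displacement after each nucleotide."""
    return list(accumulate(STEP[base] for base in sequence.upper()))


walk = dna_walk("ATGCCT")  # [-1, 0, -1, 0, 1, 2]
```

The resulting displacement series is the kind of "landscape" whose fluctuations are then analyzed for power-law scaling.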

4.
We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied, the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) an n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior, whereas the coding regions show logarithmic behavior over a wide interval, and (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced, and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.
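
The two tests named in the abstract above, an n-tuple Zipf ranking and an n-gram entropy, can be illustrated as follows. This is a hedged sketch: the use of overlapping windows, base-2 logarithms, and the function names are assumptions, and the toy input strings are not data from the study.

```python
import math
from collections import Counter


def ngram_counts(sequence, n):
    """Count overlapping n-grams (overlap is an assumed convention)."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def ngram_entropy(sequence, n):
    """Shannon entropy, in bits, of the observed n-gram frequencies."""
    counts = ngram_counts(sequence, n)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def zipf_ranking(sequence, n):
    """n-grams sorted from most to least frequent, as (ngram, count) pairs."""
    return ngram_counts(sequence, n).most_common()


uniform = ngram_entropy("ACGT" * 10, 1)   # four equally frequent symbols: 2 bits
repeat = ngram_entropy("AAAAAAAA", 2)     # a single 2-gram: 0 bits
ranking = zipf_ranking("AAAACCG", 1)      # [('A', 4), ('C', 2), ('G', 1)]
```

A Zipf plot would show count against rank on log-log axes; a lower entropy at fixed n corresponds to the larger redundancy the abstract attributes to noncoding regions.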

5.
RNA free energy landscapes are analysed by means of "time series" that are obtained from random walks restricted to excursion sets. The power spectra, the scaling of the jump size distribution, and the scaling of the curve length measured with different yardstick lengths are used to describe the structure of these "time series". Although they are stationary by construction, we find that their local behavior is consistent with both AR(1) and self-affine processes. Random walks confined to excursion sets (i.e., with the restriction that the fitness value exceeds a certain threshold at each step) exhibit essentially the same statistics as free random walks. We find that an AR(1) time series is in general approximately self-affine on timescales up to approximately the correlation length. We present an empirical relation between the correlation parameter $\rho$ of the AR(1) model and the exponents characterizing self-affinity.

6.
Heart rate variability signals obtained from 24 h recordings are analyzed for normal and pathological subjects. This time series contains information about the autonomic nervous system action regulating the beat-to-beat heart rate. Nonlinear contributions to the long-period variability have been assessed by calculating the entire spectrum of Lyapunov exponents after reconstructing the system trajectory from the original variability signal. Positive Lyapunov exponent values, obtained from an unknown process, can establish whether the structure generating it shows nonlinear chaotic characteristics. This is the case for the cardiovascular signals. Moreover, the different values obtained for the Lyapunov exponents allow the pathophysiological cases considered to be classified.

7.
Computer simulations of normal grain growth are presented. These simulations are stochastic in nature, taking account of the migration probabilities of the grain boundary atoms. The mean field approximation is used to simplify the dynamics of the network. Results for the surface-tension-driven growth agree well with analytical theory, reinforcing the dynamical interpretation of the system that is used here. The simulation technique is easily adapted to model the effect of other driving mechanisms. We investigate the role of thermal fluctuations on the system and find that they suppress the growth in a deterministic way. We discuss the experimental observations and highlight the need for a good microscopic model of the boundary migration mechanism to predict the true growth exponent expected in real materials.

8.
The kinetics of DNA hairpin-loop fluctuations have been investigated by using a combination of fluorescence energy transfer and fluorescence correlation spectroscopy. We measure the chemical rates and the activation energies associated with the opening and the closing of the hairpin for different sizes and sequences of the loop and for various salt concentrations. The rate of unzipping of the hairpin stem is essentially independent of the characteristics of the loop, whereas the rate of closing varies greatly with the loop length and sequence. The closing rate scales with the loop length, with an exponent $2.6 \pm 0.3$. The closing rate is increased at higher salt concentrations. For hairpin closing, a loop of adenosine repeats leads to smaller rates and higher activation energies than a loop with thymine repeats.

9.
The goal of this study is to quantify and determine the way in which the emotional response to music is reflected in the electrical activities of the brain. When the power spectrum of sequences of musical notes is inversely proportional to the frequency on a log-log plot, we call it 1/f music. According to previous research, most listeners agree that 1/f music is much more pleasing than white ($1/f^0$) or brown ($1/f^2$) music. Based on these studies, we used nonlinear methods to investigate the chaotic dynamics of electroencephalograms (EEGs) elicited by computer-generated 1/f music, white music, and brown music. In this analysis, we used the correlation dimension and the largest Lyapunov exponent as measures of complexity and chaos. We developed a new method that is strikingly faster and more accurate than other algorithms for calculating the nonlinear invariant measures from limited noisy data. At the right temporal lobe, 1/f music elicited lower values of both the correlation dimension and the largest Lyapunov exponent than white or brown music. We observed that brains which feel more pleased show decreased chaotic electrophysiological behavior. By observing that the nonlinear invariant measures for the 1/f distribution of the rhythm with the melody kept constant are lower than those for the 1/f distribution of melody with the rhythm kept constant, we could conclude that the rhythm variations contribute much more to a pleasing response to music than the melody variations do. These results support the assumption that chaos plays an important role in brain function, especially emotion.

10.
Bovine pancreatic ribonuclease is a DNA "melting" protein, since it binds with greater overall affinity to the single-stranded than to the double-stranded form of natural and synthetic deoxyribose-containing polynucleotides. As such, the DNA-RNase system provides a simple model for the more complex and biologically relevant melting protein-nucleic acid systems. Aspects of the DNA-RNase interactions which are related to the quantitative assessment of this system as a melting protein model are investigated here. A boundary sedimentation velocity technique is used to measure thermodynamic parameters of the interaction; association constants ($K_h$ and $K_c$) and site sizes ($n_h$ and $n_c$) are determined for the interaction of ribonuclease with native (double helical) and denatured (random coil) DNA. It is shown that $\log K_h$ and $\log K_c$ are linear functions of $\log [\mathrm{Na}^+]$, binding decreasing with increasing Na+ concentration, with $K_h$ about 2 orders of magnitude smaller than $K_c$ at the ionic strengths studied; $n_h$ and $n_c$ are approximately 8 and approximately 11 nucleotide residues, respectively, indicating that potential binding sites overlap. Binding to both forms of DNA is non-cooperative. It is shown by CD and ultraviolet spectroscopy that the binding of RNase to single- and double-stranded DNA perturbs the conformations of these polynucleotides very little relative to the unliganded structures. Hydrodynamic methods are used to show that RNase binds to native DNA without altering the overall solution structure of the latter; however, conditions which permit binding to, and stabilization of, transiently exposed single-stranded sequences result in a collapse of the stiff native DNA structure. We demonstrate by melting transition studies that ribonuclease does bring about an equilibrium destabilization of native DNA and poly [d(A-T)] and, by applying a ligand-perturbed helix ⇌ coil theory developed by McGhee (McGhee, J.D. (1976) Biopolymers 15, 1345-1375), it is shown that the extent of the observed destabilization is in semiquantitative accord with expectations based on the measured affinity constants and site sizes for RNase binding to both DNA conformations. Spectral methods are used to show that the relative stability of native DNA sequences of varying base composition is the same in the presence and absence of ribonuclease, strongly arguing that this "melting" ligand "traps" single-stranded sequences transiently exposed by thermal fluctuations. RNase also undergoes an order ⇌ disorder conformational transition as a function of temperature (the denatured form of RNase stabilizes native DNA, while native RNase destabilizes the native double helix), and the coupled equilibria involved in these interacting conformational changes are interpreted and discussed as possible models of genome regulatory interactions.

11.
The behaviour of an isolated polymer floating in a solvent forms the basis of our understanding of polymer dynamics. Classical theories describe the motion of a polymer with linear equations of motion, which yield a set of 'normal modes', analogous to the fundamental frequency and the harmonics of a vibrating violin string. But hydrodynamic interactions make polymer dynamics inherently nonlinear, and the linearizing approximations required for the normal-mode picture have therefore been questioned. Here we test the normal-mode theory by measuring the fluctuations of single molecules of DNA held in a partially extended state with optical tweezers. We find that the motion of the DNA can be described by linearly independent normal modes, and we have experimentally determined the eigenstates of the system. Furthermore, we show that the spectrum of relaxation times obeys a power law.

12.
13.
Heterologous DNA sequences from rearrangements with the genomes of host cells, genomic fragments from hybrid cells, or impure tissue sources can threaten the purity of libraries that are derived from RNA or DNA. Hybridization methods can only detect contaminants from known or suspected heterologous sources, and whole library screening is technically very difficult. Detection of contaminating heterologous clones by sequence alignment is only possible when related sequences are present in a known database. We have developed a statistical test to identify heterologous sequences that is based on the differences in hexamer composition of DNA from different organisms. This test does not require that sequences similar to potential heterologous contaminants be present in the database, and can in principle detect contamination by previously unknown organisms. We have applied this test to the major public expressed sequence tag (EST) data sets to evaluate its utility as a quality control measure and a peer evaluation tool. There is detectable heterogeneity in most human and C. elegans EST data sets, but it is not apparently associated with cross-species contamination. However, there is direct evidence for both yeast and bacterial sequence contamination in some public database sequences annotated as human. Results obtained with the hexamer test have been confirmed with similarity searches using sequences from the relevant data sets.

14.
We study the fluctuations of native proteins by exact enumeration using the HP lattice model. The model fluctuations increase with temperature. We observe a low-temperature point, below which large fluctuations are frozen out. This prediction is consistent with the observation by Tilton et al. [R. F. Tilton, Jr., J. C. Dewan, and G. A. Petsko, Biochemistry 31, 2469 (1992)], that the thermal motions of ribonuclease A increase sharply above about 200 K. We also explore protein "flexibility" as defined by Debye-Waller-like factors and solvent accessibilities of core residues to hydrogen exchange. We find that proteins having greater stability tend to have fewer large fluctuations, and hence lower flexibilities. If flexibility is necessary for enzyme catalysis, this could explain why proteins from thermophilic organisms, which are exceptionally stable, may be catalytically inactive at normal temperatures.

15.
For a minimalist model of protein folding, which we introduced recently, we investigate various methods to obtain folding sequences. A detailed study of random sequences shows that, for this model, such sequences usually do not fold to their ground states during simulations. Straightforward techniques for the construction of folding sequences, based solely on the target structure, fail. We describe in detail an optimization algorithm, based on genetic algorithms, for the "simulated breeding" of folding sequences in this model. We find that, for any target structure studied, there is not only a single folding sequence but a patch of sequences in sequence space that fold to this structure. In addition, we show that, much as in real proteins, nonhomologous sequences may fold to the same target structure.

16.
17.
A genetic hypothesis for a disease presupposes the existence of variation in the DNA sequences of affected individuals. A series of techniques known together as "mutational analysis" can be applied towards identifying new sequence variations in selected genes. These techniques can screen a large series of individuals for mutations efficiently, so it is not necessary to determine the nucleotide sequence in every DNA sample. DNA samples suspected of harboring sequence variants are then sequenced. Denaturing gradient gel electrophoresis techniques, single-stranded conformation polymorphism paradigms, and chemical cleavage of mismatches are three procedures widely used for the molecular screening of mutations today. We discuss each of these techniques for mutation screening.

18.
RAPD-PCR is a new technique that, starting from genomic DNA and using a single primer of "random" base composition, amplifies a variable number of sequences that can yield important information when analyzed for linkage studies, gene mapping, or phylogenetic purposes. To assess the possible application of this simple method of DNA fingerprinting to individual identification and cell lineage characterization, we analyzed human and non-human primate DNA. Six different single primers of variable length were used and resulted in individual or specific electrophoretic patterns. As already reported, we found better resolution using "short" primers. The individual electrophoretic patterns obtained by RAPD-PCR can be a simple and reliable approach to DNA analysis.

19.
Intraspecific allometric modeling ($Y = a \cdot \mathrm{mass}^{b}$, where $Y$ is the physiological dependent variable and $a$ is the proportionality coefficient) of peak oxygen uptake (VO2peak) has frequently revealed a mass exponent ($b$) greater than that predicted from dimensionality theory, approximating Kleiber's 3/4 exponent for basal metabolic rate. Nevill (J. Appl. Physiol. 77: 2870-2873, 1994) proposed an explanation and a method that restores the inflated exponent to the anticipated 2/3. In human subjects, the method involves the addition of "stature" as a continuous predictor variable in a multiple log-linear regression model: $\ln Y = a + c \cdot \ln(\mathrm{stature}) + b \cdot \ln(\mathrm{mass}) + \ln \varepsilon$, where $c$ is the general body size exponent and $\varepsilon$ is the error term. It is likely that serious collinearity confounds may adversely affect the reliability and validity of the model. The aim of this study was to critically examine Nevill's method in modeling VO2peak in prepubertal, teenage, and adult men. A mean exponent of 0.81 (95% confidence interval, 0.65-0.97) was found when scaling by mass alone. Nevill's method reduced the mean mass exponent to 0.67 (95% confidence interval, 0.44-0.9). However, variance inflation factors and tolerance for the log-transformed stature and mass variables exceeded published criteria for severe collinearity. Principal components analysis also diagnosed severe collinearity in two principal components, with condition indexes > 30 and variance decomposition proportions exceeding 50% for two regression coefficients. The derived exponents may thus be numerically inaccurate and unstable. In conclusion, the restoration of the mean mass exponent to the anticipated 2/3 may be a fortuitous statistical artifact.
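
The mass-only allometric fit and the collinearity check discussed above can be sketched as follows. The sketch fits $\ln Y = \ln a + b \ln(\mathrm{mass})$ by ordinary least squares and computes the variance inflation factor for a two-predictor model, which equals $1/(1 - r^2)$ with $r$ the correlation between the predictors. The function names and all numbers are illustrative assumptions; the data are synthetic and not from the study.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def fit_allometric(masses, responses):
    """Fit ln Y = ln a + b * ln mass by ordinary least squares; return (a, b)."""
    xs = [math.log(m) for m in masses]
    ys = [math.log(y) for y in responses]
    x_mean, y_mean = _mean(xs), _mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    return math.exp(y_mean - b * x_mean), b


def vif_two_predictors(x1, x2):
    """Variance inflation factor shared by both predictors in a
    two-predictor regression: 1 / (1 - r**2), r = Pearson correlation."""
    m1, m2 = _mean(x1), _mean(x2)
    s11 = sum((u - m1) ** 2 for u in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    r_squared = s12 * s12 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)


# Synthetic, noise-free data (not from the study): Y = 2 * mass**0.75.
masses = [20.0, 40.0, 60.0, 80.0]
responses = [2.0 * m ** 0.75 for m in masses]
a, b = fit_allometric(masses, responses)
```

In practice `vif_two_predictors` would be applied to the log-transformed stature and mass values; a large result signals the kind of collinearity the abstract warns about.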

20.
Exponents of the psychophysical function for subjective duration are compiled from 111 studies undertaken with the methods of magnitude estimation, magnitude production, and ratio setting. The determination of exponents from ratio-setting data is based on a new model for time perception that also allows the computation of exponents from equal-setting (duration reproduction) data (i.e., from experiments that did not involve the subject's numerical behavior). The following problems are dealt with in terms of their influence on the exponent of subjective duration: practice, sensory modality (used in presenting the duration, including empty intervals), drugs, group differences (age, mental retardation, psychosis, neurosis), and experimental effects (methods, very short durations, intramodal range effect). A general conclusion is that time perception is not veridical and that the exponent on average approximates 0.9. (4 p ref) (PsycINFO Database Record (c) 2010 APA, all rights reserved)
