Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
An open question in computational molecular biology is whether long-range correlations are present in both coding and noncoding DNA or only in the latter. To answer this question, we consider all 33301 coding and all 29453 noncoding eukaryotic sequences, each longer than 512 base pairs (bp), in the present release of GenBank to determine whether there is any statistically significant distinction in their long-range correlation properties. Standard fast Fourier transform (FFT) analysis indicates that coding sequences have practically no correlations in the range from 10 bp to 100 bp (spectral exponent beta = 0.00 +/- 0.04, where the uncertainty is two standard deviations). In contrast, for noncoding sequences, the average value of the spectral exponent beta is positive (0.16 +/- 0.05), which unambiguously shows the presence of long-range correlations. We also separately analyze the 874 coding and the 1157 noncoding sequences that have more than 4096 bp and find a larger region of power-law behavior. We calculate the probability that these two data sets (coding and noncoding) were drawn from the same distribution and find that it is less than $10^{-10}$. We obtain independent confirmation of these findings using the method of detrended fluctuation analysis (DFA), which is designed to treat sequences with statistical heterogeneity, such as DNA's known mosaic structure ("patchiness") arising from the nonstationarity of nucleotide concentration. The near-perfect agreement between the two independent analysis methods, FFT and DFA, increases the confidence in the reliability of our conclusion.
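
The DFA step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the purine/pyrimidine mapping, the box sizes, and every function name below are assumptions made for illustration. The sequence is turned into a cumulative walk, a linear trend is removed in each box of size n, and the exponent is the slope of log F(n) against log n (about 0.5 for an uncorrelated sequence).

```python
import math
import random


def dna_walk(seq):
    """Cumulative walk: +1 for a pyrimidine (C/T), -1 for anything else.

    The purine/pyrimidine rule is an assumed mapping, not taken from the abstract.
    """
    walk, total = [], 0
    for base in seq.upper():
        total += 1 if base in "CT" else -1
        walk.append(total)
    return walk


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_fluctuation(walk, n):
    """RMS deviation of the walk from its per-box linear trend (box size n)."""
    xs = list(range(n))
    squared, count = 0.0, 0
    for start in range(0, len(walk) - n + 1, n):
        box = walk[start:start + n]
        slope, intercept = linear_fit(xs, box)
        squared += sum((v - (slope * x + intercept)) ** 2 for x, v in zip(xs, box))
        count += n
    return math.sqrt(squared / count)


def dfa_exponent(seq, box_sizes=(8, 16, 32, 64, 128)):
    """Slope of log F(n) versus log n; assumes F(n) > 0 for every box size."""
    walk = dna_walk(seq)
    log_n = [math.log(n) for n in box_sizes]
    log_f = [math.log(dfa_fluctuation(walk, n)) for n in box_sizes]
    slope, _ = linear_fit(log_n, log_f)
    return slope


def random_sequence(length, seed=0):
    """Uncorrelated control sequence with equal base probabilities."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))
```

Calling `dfa_exponent(random_sequence(20000))` gives a control value to compare against a real sequence; an exponent clearly above the control would point toward long-range correlation under these assumptions.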

2.
We propose a generalized Lévy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Lévy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Lévy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
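
A rough sketch of how such a walk could be generated follows. Only the power-law form of the segment lengths and the relation between the two exponents come from the abstract; the inverse-transform sampler, the minimum length `l_min`, the `bias` strength, and all function names are assumptions for illustration.

```python
import random


def correlation_exponent(mu):
    """Relation quoted in the abstract: alpha = 2 - mu/2 for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("relation is stated only for 2 < mu < 3")
    return 2 - mu / 2


def sample_segment_length(mu, rng, l_min=1.0):
    """Draw l >= l_min with density proportional to l**(-mu) by inverse transform."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
    return l_min * (1.0 - u) ** (-1.0 / (mu - 1.0))


def generalized_levy_walk(n_steps, mu, bias=0.1, seed=0):
    """Concatenate biased random-walk segments whose lengths follow a power law.

    Each segment picks a direction of bias at random; `bias` (assumed value)
    shifts the probability of a +1 step away from 0.5 inside that segment.
    """
    rng = random.Random(seed)
    steps = []
    while len(steps) < n_steps:
        length = max(1, int(sample_segment_length(mu, rng)))
        p_up = 0.5 + rng.choice((-bias, bias))
        for _ in range(min(length, n_steps - len(steps))):
            steps.append(1 if rng.random() < p_up else -1)
    return steps
```

For example, `correlation_exponent(2.5)` returns 0.75, and `generalized_levy_walk(1000, 2.5)` returns a list of 1000 steps of +1 or -1 that can be summed cumulatively to obtain the landscape.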

3.
While investigating the copy number of minichromosome Dp(1;f)1187 sequences in the polyploid chromosomes of ovarian nurse and follicle cells of Drosophila melanogaster we discovered that restriction fragments spanning the euchromatic-heterochromatic junction of the chromosome and extending into peri-centromeric sequences had the unusual property of being selectively resistant to transfer out of agarose gels during Southern blotting, leading to systematic reductions in Dp1187-specific hybridization signals. This property originated from the peri-centromeric sequences contained on the junction fragments and was persistently associated with Dp1187 DNA, despite attempts to ameliorate the effect by altering experimental protocols. Transfer inhibition was unlikely to be caused by an inherent physical property of repetitive DNA sequences since, in contrast to genomic DNA, cloned restriction fragments spanning the euchromatic-heterochromatic junction and containing repetitive sequences transferred normally. Finally, the degree of inhibition could be suppressed by the addition of a Y chromosome to the genotype. On the basis of these observations and the fact that peri-centromeric regions of most eukaryotic chromosomes are associated with cytologically and genetically defined heterochromatin, we propose that peri-centromeric sequences of Dp1187 that are incorporated into heterochromatin in vivo retain some component of heterochromatic structure during DNA isolation, perhaps a tightly bound protein or DNA modification, which subsequently causes the unorthodox properties observed in vitro.  相似文献   

4.
We analyze the fluctuations in the correlation exponents obtained for noncoding DNA sequences. We find prominent sample-to-sample variations as well as variations within a single sample in the scaling exponent. To determine if these fluctuations may result from finite system size, we generate correlated random sequences of comparable length and study the fluctuations in this control system. We find that the DNA exponent fluctuations are consistent with those obtained from the control sequences having long-range power-law correlations. Finally, we compare our exponents for the DNA sequences with the exponents obtained from power-spectrum analysis and correlation-function techniques, and demonstrate that the original "DNA-walk" method is intrinsically more accurate due to reduced noise.  相似文献   

5.
Optimal gene therapy for many disorders will require efficient transfer to cells in vivo, high-level and long-term expression, and tissue-specific regulation, all in the absence of significant toxicity or inflammatory responses. While recombinant adenoviral vectors are efficient for gene transfer to hepatocytes, their usefulness is limited by short duration of expression related, at least in part, to immune responses to viral proteins and by a low capacity for foreign DNA. A number of systems have been developed for producing adenoviral vectors devoid of all viral coding sequences. Using AdSTK109, a vector lacking all viral coding sequences and carrying the complete human alpha1-antitrypsin (hAAT) genomic DNA locus, we have demonstrated sustained expression for longer than 10 months in mice. Utilizing high doses of this vector for hepatic gene transfer in mice, we find that supraphysiological levels of hAAT can be achieved without hepatotoxicity.  相似文献   

6.
The transformation-associated recombination (TAR) procedure allows rapid, site-directed cloning of specific human chromosomal regions as yeast artificial chromosomes (YACs). The procedure requires knowledge of only a single, relatively small genomic sequence that resides adjacent to the chromosomal region of interest. We applied this approach to the cloning of the neocentromere DNA of a marker chromosome that we have previously shown to have originated through the activation of a latent centromere at human chromosome 10q25. Using a unique 1.4-kb DNA fragment as a "hook" in TAR experiments, we achieved single-step isolation of the critical neocentromere DNA region as two stable, 110- and 80-kb circular YACs. For obtaining large quantities of highly purified DNA, these YACs were retrofitted with the yeast-bacteria-mammalian-cells shuttle vector BRV1, electroporated into Escherichia coli DH10B, and isolated as bacterial artificial chromosomes (BACs). Extensive characterization of these YACs and BACs by PCR and restriction analyses revealed that they are identical to the corresponding regions of the normal chromosome 10 and provided further support for the formation of the neocentromere within the marker chromosome through epigenetic activation.  相似文献   

7.
The viral polymerase and several cis-acting sequences are essential for hepadnaviral DNA replication, but additional host factors are likely to be involved in this process. We previously identified two sequences, UBS and DBS (upstream and downstream binding sites), present in multiple copies in and adjacent to the pregenomic RNA (pgRNA) terminal redundancy, that were specifically recognized by a 65-kDa host factor, p65. The possible roles of these two sequences in hepatitis B virus (HBV) replication were investigated in the context of the intact viral genome. UBS is contained within the terminal redundancy of pgRNA, and the 5' copy of this sequence is essential for viral replication. Mutations within the central core of UBS ablate p65 binding and selectively block synthesis of plus-strand DNA, without affecting RNA packaging or minus-strand synthesis. The DBS sequence, which is located downstream of the pgRNA polyadenylation site, overlaps the core (C) protein coding region. All mutations introduced into this site severely affected viral replication. However, these effects were shown to result from dominant negative effects of mutant core polypeptides rather than from cis-acting effects on RNA recognition. Thus, the 5' UBS but not DBS sites play important cis-acting roles in HBV DNA replication; however, the involvement of p65 in these roles remains a matter for investigation.  相似文献   

8.
Using a recently developed polymerase chain reaction (PCR)-mediated approach for physical mapping of single-copy DNA sequences on microisolated chromosomes of barley, sequence-tagged sites of DNA probes that reveal restriction fragment length polymorphisms (RFLP) localized on the linkage maps of rice chromosomes 5 and 10 were allocated to cytologically defined regions of barley chromosome 5 (1H). The rice map of linkage group 5, of about 135 cM in size, falls into two separate parts, which are related to the distal portions of both the short and long arms of the barley chromosome. The markers on the rice map of chromosome 5 were found to be located within regions of the barley chromosome which show high recombination rates. The map of rice chromosome 10, of about 75 cM in size, on the other hand, is related to an interstitial segment of the long arm of chromosome 5 (1H) which is highly suppressed in recombination activity. For positional cloning of genes of this homoeologous region from the barley genome, the small rice genome will probably prove to be a useful tool. No markers located on rice chromosomes were detected within the pericentric Giemsa-positive heterochromatin of the barley chromosome, indicating that these barley-specific sequences form a block which separates the linkage segments conserved in rice. By our estimate approximately half of the barley-specific sequences of chromosome 5 (1H) show a dispersed distribution, while the other half separates the conserved linkage segments.  相似文献   

9.
The Afa-family repetitive sequences were isolated from barley (Hordeum vulgare, 2n = 14) and cloned as pHvA14. This sequence distinguished each barley chromosome by in situ hybridization. Double-color fluorescence in situ hybridization using pHvA14 and 5S rDNA or the HvRT-family sequence (a subtelomeric sequence of barley) assigned individual barley chromosomes showing a specific pHvA14 pattern to chromosomes 1H to 7H. As in the case of the D genome chromosomes of Aegilops squarrosa and common wheat (Triticum aestivum) hybridized with their Afa-family sequences, the pHvA14 signals in barley chromosomes tended to appear in the distal regions, which do not carry many chromosome band markers. In the telomeric regions, these signals were always located more proximally than those of the HvRT-family. Based on the distribution patterns of Afa-family sequences in the chromosomes of barley and the D genome chromosomes of wheat, we discuss a possible mechanism of amplification of the repetitive sequences during the evolution of Triticeae. In addition, we show here that the HvRT-family could also be used to distinguish individual barley chromosomes from the patterns of in situ hybridization.

10.
Plasmodium falciparum malaria parasites were transformed with plasmids containing P. falciparum or Toxoplasma gondii dihydrofolate reductase-thymidylate synthase (dhfr-ts) coding sequences that confer resistance to pyrimethamine. Under pyrimethamine pressure, transformed parasites were obtained that maintained the transfected plasmids as unrearranged episomes for several weeks. These parasite populations were replaced after 2 to 3 months by parasites that had incorporated the transfected DNA into nuclear chromosomes. Depending upon the particular construct used for transformation, homologous integration was detected in the P. falciparum dhfr-ts locus (chromosome 4) or in hrp3 and hrp2 sequences that were used in the plasmid constructs as gene control regions (chromosomes 13 and 8, respectively). Transformation by homologous integration sets the stage for targeted gene alterations and knock-outs that will advance understanding of P. falciparum.  相似文献   

11.
We have used the asymmetry between the coding and noncoding strands in different codon positions of coding sequences of DNA as a parameter to evaluate the coding probability for open reading frames (ORFs). The method enables an approximation of the total number of coding ORFs in the set of analyzed sequences as well as an estimation of the coding probability for the ORFs. The asymmetry observed in the nucleotide composition of codons in coding sequences has been used successfully for analysis of the genomes completed at the time of this analysis.  相似文献   
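
The abstract does not give the exact asymmetry measure, so the following is only a hypothetical illustration of the idea: tabulate base frequencies at each of the three codon positions for an ORF and for its reverse complement, then sum the absolute differences. The score definition and all names are assumptions, and the input is assumed to contain at least one complete codon.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (unambiguous bases only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codon_position_composition(orf):
    """Base frequencies at codon positions 1, 2 and 3 (a list of three dicts).

    Trailing bases that do not fill a codon are ignored.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    composition = []
    for position in range(3):
        counts = Counter(orf[position:usable:3])
        total = sum(counts.values())
        composition.append({base: counts[base] / total for base in "ACGT"})
    return composition


def strand_asymmetry(orf):
    """Hypothetical score: summed |f_coding - f_noncoding| over positions and bases."""
    coding = codon_position_composition(orf)
    noncoding = codon_position_composition(reverse_complement(orf))
    return sum(
        abs(c[base] - n[base])
        for c, n in zip(coding, noncoding)
        for base in "ACGT"
    )
```

A score near zero would mean the two strands look alike position by position; turning such a score into a coding probability would need a calibration step that this sketch does not attempt.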

12.
HeT-A, a major component of Drosophila telomeres, is the first retrotransposon proposed to have a vital cellular function. Unlike most retrotransposons, more than half of its genome is noncoding. The 3' end contains > 2.5 kb of noncoding sequence. Copies of HeT-A differ by insertions or deletions and multiple nucleotide changes, which initially led us to conclude that HeT-A noncoding sequences are very fluid. However, we can now report, on the basis of new sequences and further analyses, that most of these differences are due to the existence of a small number of conserved sequence subfamilies, not to extensive sequence change during each transposition event. The high level of sequence conservation within subfamilies suggests that they arise from a small number of replicatively active elements. All HeT-A subfamilies show preservation of two intriguing features. First, segments of extremely A-rich sequence form a distinctive pattern within the 3' noncoding region. Second, there is a strong strand bias of nucleotide composition: The DNA strand running 5' to 3' toward the middle of the chromosome is unusually rich in adenine and unusually poor in guanine. Although not faced with the constraints of coding sequences, the HeT-A 3' noncoding sequence appears to be under other evolutionary constraints, possibly reflecting its roles in the telomeres.  相似文献   

13.
In the chironomid Acricotopus lucidus, parts of the genome, the germ line-limited chromosomes, are eliminated from the future soma cells during early cleavage divisions. A highly repetitive, germ line-specific DNA sequence family was isolated, cloned and sequenced. The monomers of the tandemly repeated sequences range in size from 175 to 184 bp. Analysis of sequence variation allowed the further classification of the germ line-restricted repetitive DNA into two related subfamilies, A and B. Fluorescence in situ hybridization to gonial metaphases demonstrated that the sequence family is highly specific for the paracentromeric heterochromatin of the germ line-limited chromosomes. Restriction analysis of genomic soma DNA of A. lucidus revealed another tandem repetitive DNA sequence family with monomers of about 175 bp in length. These DNA elements are found only in the centromeric regions of all soma chromosomes and one exceptional germ line-limited chromosome by in situ hybridization to polytene soma chromosomes and gonial metaphase chromosomes. The sequences described here may be involved in recognition, distinction and behavior of soma and germ line-limited chromosomes during the complex chromosome cycle in A. lucidus and may be useful for the genetic and cytological analysis of the processes of elimination of the germ line-limited chromosomes in the soma and germ line.  相似文献   

14.
In swine, distinct centromeric satellite DNA families have been described that correspond to either all the metacentric chromosomes except the Y (Mc1) or all the acrocentric chromosomes (Ac2). Using primed in situ (PRINS) labeling, we show here that primers derived from various sequences specifically label the centromeres of different subgroups of chromosomes. Among five primers derived from centromeric sequences of acrocentric chromosomes reported to be very homogeneous, four recognize all the acrocentric chromosomes, whereas one labels prominently chromosome 17. For the metacentric chromosomes, six primers have been derived from several divergent sequences. Among these primers, two recognize all the metacentric chromosomes except 5, 10, and 12. Three other primers label small subsets of metacentric chromosomes, including the X and one or two additional chromosomes. The last primer is specific to chromosome 1. These preliminary results suggest that it should be possible to define specific primers for almost every swine chromosome. Already, some of the primers reported here permit a distinction between swine chromosomes difficult to differentiate without banding, such as the X chromosome and chromosome 9. Therefore, the PRINS technique using centromeric motifs constitutes an additional tool for cytogenetic studies in swine.  相似文献   

15.
A protocol for primed in situ DNA labeling (PRINS) was optimized for pea (Pisum sativum L.) and field bean (Vicia faba L.) chromosomes attached to coverslips. Cloned DNA or synthetic oligonucleotides were used as probes for repetitive DNA sequences (rDNA, Fok-element) and different reaction conditions were tested to achieve the highest specific signal-to-background ratio. A procedure based on direct labeling by fluorescein-dUTP was compared with an indirect one using digoxigenin detected by fluorescently labeled antibody. Under optimal conditions, strong and specific signals were obtained exclusively on chromosome regions known to contain respective DNA sequences. Compared to the direct labeling, significantly stronger signals were obtained when the indirect procedure was used. Both types of labeling were successfully applied to chromosomes in suspension and were shown to produce signals comparable to that obtained with chromosomes attached to coverslips. It is expected that primed in situ DNA labeling en suspension (PRINSES) will provide a basis for flow-cytometric discrimination and sorting of otherwise indistinguishable chromosomes according to their specific fluorescent labeling.  相似文献   

16.
Comparative genomic hybridization (CGH) was used to detect copy number changes of DNA sequences in the Ewing family of tumours (ET). We analysed 20 samples from 17 patients. Fifteen tumours (75%) showed copy number changes. Gains of DNA sequences were much more frequent than losses, the majority of the gains affecting whole chromosomes or whole chromosome arms. Recurrent findings included copy number increases for chromosomes 8 (seven out of 20 samples; 35%), 1q (five samples; 25%) and 12 (five samples; 25%). The minimal common regions of these gains were the whole chromosomes 8 and 12, and 1q21-22. High-level amplifications affected 8q13-24, 1q and 1q21-22, each once. Southern blot analysis of the specimen with high-level amplification at 1q21-22 showed an amplification of FLG and SPRR3, both mapped to this region. All cases with a gain of chromosome 12 simultaneously showed a gain of chromosome 8. Comparison of CGH findings with cytogenetic analysis of the same tumours and previous cytogenetic reports of ET showed, in general, concordant results. In conclusion, our findings confirm that secondary changes, which may have prognostic significance in ET, are trisomy 8, trisomy 12 and a gain of DNA sequences in 1q.  相似文献   

17.
Comparative chromosome G-/R-banding, comparative gene mapping and chromosome painting techniques have demonstrated that only few chromosomal rearrangements occurred during great ape and human evolution. Interspecies comparative genome hybridization (CGH), used here in this study, between human, gorilla and pygmy chimpanzee revealed species-specific regions in all three species. In contrast to the human, a far more complex distribution of species-specific blocks was detected with CGH in gorilla and pygmy chimpanzee. Most of these blocks coincide with already described heterochromatic regions on gorilla and chimpanzee chromosomes. Representational difference analysis (RDA) was used to subtract the complex genome of gorilla against human in order to enrich gorilla-specific DNA sequences. Gorilla-specific clones isolated with this technique revealed a 32-bp repeat unit. These clones were mapped by fluorescence in situ hybridization (FISH) to the telomeric regions of gorilla chromosomes that had been shown by interspecies CGH to contain species-specific sequences.  相似文献   

18.
Using an original computer program, we analysed the complete nucleotide sequences of chromosomes I, II, III, VI and IX in yeast cells. As a general rule, we found large stereospecific anomalies near genes with a presumed high expression level (a full catalogue of such anomalies for the 5 genes with the highest CAI in each chromosome is presented). They are usually also present at mobile genetic elements. Many large stereospecific anomalies are situated next to sites of specific anomalies of general nucleotide composition, namely regions devoid of specific dinucleotides. We have noticed many "trains" (lines) of different stereospecific anomalies, possibly showing areas of cooperative binding of different regulatory and structural proteins to DNA. In several, but not all, of the analysed chromosomes we found a new class of especially large stereospecific anomalies related to repetitive DNA of small length (less than or around 100 nucleotides).

19.
20.
In this report we address the problem of accurate statistical modeling of DNA sequences, either coding or noncoding, for a bacterial species whose genome (or a large portion of it) has been sequenced but not yet characterized experimentally. Availability of these models is critical for successful solution of the genome annotation task by statistical methods of gene finding. We present the method, GeneMark-Genesis, which learns the parameters of Markov models of protein-coding and noncoding regions from anonymous bacterial genomic sequence. These models are subsequently used in the GeneMark and GeneMark.hmm gene-finding programs. Although there is basically one model of a noncoding region for a given genome, several models of protein-coding regions are automatically obtained by GeneMark-Genesis. The diversity of protein-coding models reflects the diversity of oligonucleotide compositions, particularly the diversity of codon usage strategies observed in genes from one and the same genome. In the simplest and most important case, there are just two gene models, typical and atypical. We show that the atypical model allows one to predict genes that escape identification by the typical model. Many genes predicted by the atypical model appear to be horizontally transferred genes. The early versions of GeneMark-Genesis were used for annotating the genomes of Methanococcus jannaschii and Helicobacter pylori. We report the results of accuracy testing of the full-scale version of GeneMark-Genesis on 10 completely sequenced bacterial genomes. Interestingly, the GeneMark.hmm program that employed the typical and atypical models defined by GeneMark-Genesis was able to predict 683 new atypical genes, with 176 of them confirmed by similarity search.
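
As a simplified illustration of scoring a sequence with Markov models of coding and noncoding regions, the sketch below trains two fixed-order chains and compares log-likelihoods. It is not GeneMark-Genesis or the GeneMark API: it uses a plain homogeneous chain, and the order, the pseudocount, and all function names are assumptions.

```python
import math
from collections import defaultdict


def train_markov(sequences, order=2, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from training sequences.

    Returns a function prob(context, base). Pseudocounts keep unseen
    transitions from receiving zero probability.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + 4 * pseudocount
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def log_likelihood(seq, prob, order=2):
    """Log-probability of seq under the model, conditioned on its first `order` bases."""
    seq = seq.upper()
    return sum(
        math.log(prob(seq[i - order:i], seq[i])) for i in range(order, len(seq))
    )


def coding_score(seq, coding_prob, noncoding_prob, order=2):
    """Log-likelihood ratio; a positive value favors the coding model."""
    return log_likelihood(seq, coding_prob, order) - log_likelihood(
        seq, noncoding_prob, order
    )
```

With toy training data, for instance `train_markov(["ATGGCCATGGCC" * 10])` as the coding model and `train_markov(["TTTTAAAATTTTAAAA" * 10])` as the noncoding model, `coding_score` is positive for `"ATGGCCATGGCC"` and negative for `"TTTTAAAA"`. Having several coding models, as the abstract describes for typical and atypical genes, would amount to training one such model per gene class and taking the best score.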
