Similar literature
 Found 20 similar articles; search time 31 ms
1.
MOTIVATION: JOY is a program to annotate protein sequence alignments with three-dimensional (3D) structural features. It was developed to display 3D structural information in a sequence alignment and to help understand the conservation of amino acids in their specific local environments. RESULTS: The JOY representation now constitutes an essential part of the two databases of protein structure alignments: HOMSTRAD (http://www-cryst.bioc.cam.ac.uk/homstrad) and CAMPASS (http://www-cryst.bioc.cam.ac.uk/campass). It has also been successfully used for identifying distant evolutionary relationships. AVAILABILITY: The program can be obtained via anonymous ftp from torsa.bioc.cam.ac.uk from the directory /pub/joy/. The address for the JOY server is http://www-cryst.bioc.cam.ac.uk/cgi-bin/joy.cgi. CONTACT: kenji@cryst.bioc.cam.ac.uk

2.
JPred: a consensus secondary structure prediction server   Total citations: 1, self-citations: 0, citations by others: 1
An interactive protein secondary structure prediction Internet server is presented. The server allows a single sequence or multiple alignment to be submitted, and returns predictions from six secondary structure prediction algorithms that exploit evolutionary information from multiple sequences. A consensus prediction is also returned which improves the average Q3 accuracy of prediction by 1% to 72.9%. The server simplifies the use of current prediction algorithms and allows conservation patterns important to structure and function to be identified. AVAILABILITY: http://barton.ebi.ac.uk/servers/jpred.html CONTACT: geoff@ebi.ac.uk
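
As a rough illustration of the consensus idea in the abstract above, the sketch below combines per-residue three-state predictions by simple majority vote and scores the result with Q3 (the percentage of residues whose predicted state matches a reference). The majority-vote rule, the tie-break to coil, the function names and the toy strings are all assumptions made for illustration; the abstract does not say how JPred actually builds its consensus.

```python
from collections import Counter


def consensus(predictions, tie_state="C"):
    """Column-wise majority vote over equal-length three-state strings (H/E/C).

    Ties fall back to `tie_state` (an illustrative choice, not JPred's rule).
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        top_state, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append(tie_state if tied else top_state)
    return "".join(result)


def q3(predicted, observed):
    """Percentage of positions where predicted and observed states agree."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy data: three hypothetical method outputs and a hypothetical reference.
methods = ["HHHECC", "HHEECC", "HHHHCC"]
reference = "HHHECC"
combined = consensus(methods)
```

Calling `q3(combined, reference)` then gives the three-state accuracy of the combined prediction on the toy reference.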

3.
MView is a tool for converting the results of a sequence database search into the form of a coloured multiple alignment of hits stacked against the query. Alternatively, an existing multiple alignment can be processed. In either case, the output is simply HTML, so the result is platform independent and does not require a separate application or applet to be loaded. AVAILABILITY: Free from http://www.sander.ebi.ac.uk/mview/ subject to copyright restrictions. CONTACT: brown@ebi.ac.uk
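
To make the "output is simply HTML" point concrete, here is a minimal sketch that renders a non-empty list of aligned rows as a coloured HTML `<pre>` block. It is not MView's code: the colour table, the function name and the layout are hypothetical, chosen only to show that a stacked, coloured alignment needs nothing beyond plain markup.

```python
from html import escape

# Hypothetical colour scheme keyed by residue; not MView's actual palette.
COLOURS = {
    "A": "#33cc00", "G": "#33cc00",
    "D": "#0033ff", "E": "#0033ff",
    "K": "#cc0000", "R": "#cc0000",
}


def alignment_to_html(rows):
    """Render (label, aligned_sequence) pairs as one HTML <pre> block."""
    width = max(len(label) for label, _ in rows)
    lines = []
    for label, seq in rows:
        cells = []
        for residue in seq:
            colour = COLOURS.get(residue.upper())
            text = escape(residue)
            if colour:
                cells.append(f'<span style="color:{colour}">{text}</span>')
            else:
                cells.append(text)  # gaps and uncoloured residues stay plain
        lines.append(escape(label.ljust(width)) + " " + "".join(cells))
    return "<pre>\n" + "\n".join(lines) + "\n</pre>"


page = alignment_to_html([("query", "AK-"), ("hit1", "AR-")])
```

The resulting string can be written to a file and opened in any browser.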

4.
I describe the current version of the sequence analysis package developed at the MRC Laboratory of Molecular Biology, which has come to be known as the "Staden Package." The package covers most of the standard sequence analysis tasks such as restriction site searching, translation, pattern searching, comparison, gene finding, and secondary structure prediction, and provides powerful tools for DNA sequence determination. Currently the programs are only available for computers running the UNIX operating system. Detailed information about the package is available from our WWW site: http://www.mrc-lmb.cam.ac.uk/pubseq/.

5.
We describe a database of protein structure alignments for homologous families. The database HOMSTRAD presently contains 130 protein families and 590 aligned structures, which have been selected on the basis of quality of the X-ray analysis and accuracy of the structure. For each family, the database provides a structure-based alignment derived using COMPARER and annotated with JOY in a special format that represents the local structural environment of each amino acid residue. HOMSTRAD also provides a set of superposed atomic coordinates obtained using MNYFIT, which can be viewed with a graphical user interface or used for comparative modeling studies. The database is freely available on the World Wide Web at: http://www-cryst.bioc.cam.ac.uk/-homstrad/, with search facilities and links to other databases.

6.
CINEMA is a new editor for manipulating and generating multiple sequence alignments. The program provides both an interface to existing databases of alignments on the Internet and a tool for constructing and modifying alignments locally. It is written in Java, so executable code will run on most major desktop platforms without modification. The implementation is highly flexible, so the applet can be easily customised with additional functions; and the object classes are reusable, promoting rapid development of program extensions. Formerly, such extended functionality might have been provided via browser plug-ins, which have to be downloaded and installed on every client before loading data. Now, for the first time, an applet is available that allows interactive client-side processing of an alignment, which can then be stored or processed automatically on the server. The program is embedded in a comprehensive help file and is accessible both as a stand-alone tool on UCL's Bioinformatics Server (http://www.biochem.ucl.ac.uk/bsm/dbbrowser/CINEMA2.02/), and as an integral part of the PRINTS protein fingerprint database. Exploitation of such novel technologies revolutionises the way users may interact with public databases in the future: bioinformatics centres need not simply provide data, but are now able to offer the means by which information is visualised and manipulated, without the requirement for users to install software.

7.
A genetic algorithm for multiple molecular sequence alignment   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Multiple molecular sequence alignment is among the most important and most challenging tasks in computational biology. The currently used alignment techniques are characterized by great computational complexity, which prevents their wider use. This research is aimed at developing a new technique for efficient multiple sequence alignment. APPROACH: The new method is based on genetic algorithms. Genetic algorithms are stochastic approaches for efficient and robust searching. By converting biomolecular sequence alignment into a problem of searching for optimal or near-optimal points in an 'alignment space', a genetic algorithm can be used to find good alignments very efficiently. RESULTS: Experiments on real data sets have shown that the average computing time of this technique may be two or three orders lower than that of a technique based on pairwise dynamic programming, while the alignment qualities are very similar. AVAILABILITY: A C program on UNIX has been written to implement the technique. It is available on request from the authors.

8.
MOTIVATION: A modified Sherman statistic can be used to test whether the differences between two aligned sequences are distributed at random along the sequences, or whether they are clustered, which suggests anomalies of evolution such as partial gene recombination or functional constraints. The presence of evenly spaced constant sites (such as constancy at the second codon position in genes coding for proteins) lowers the statistic and makes the significance less than it should be. RESULTS: The magnitude of the constant-site effect is shown by simulation to depend mainly on the proportion of differences between two sequences and on the number of constant sites that are added after each variable site. This latter number can be estimated from the variance of sites in a sequence matrix at the first, second and third codon positions, to obtain a ratio that corrects the statistic. When expressed as standard errors, the uncorrected results are too low (typically half to one unit when almost all the variation is at the third codon position). Correction raises the standard errors to levels close to expectation. If the data show no marked ternary periodicity, the correction is very small. The method is illustrated with biological data that show close to random behaviour, and with data that exhibit strong clustering. AVAILABILITY: The software is available from the author and has also been placed on the EMBL file server (Software@embl-ebi.ac.uk). CONTACT: phas1@le.ac.uk
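
For orientation, the sketch below computes the classical (unmodified) Sherman statistic, 0.5 * sum |D_i - 1/(n+1)| over the n+1 normalised spacings between n difference sites. The abstract's modification and its constant-site correction are not specified there and are not reproduced here; the function name, the 1-based position convention and the mapping of positions onto the unit interval are assumptions of this sketch.

```python
def sherman_statistic(diff_positions, length):
    """Unmodified Sherman statistic for difference sites along an alignment.

    diff_positions: 1-based positions (within 1..length) where the two
    aligned sequences differ.
    Returns 0.5 * sum(|spacing_i - 1/(n+1)|) over the n+1 normalised spacings.
    """
    points = sorted(diff_positions)
    n = len(points)
    if n == 0:
        raise ValueError("need at least one difference")
    # Map positions into the open unit interval, with sentinels at 0 and 1.
    scaled = [0.0] + [p / (length + 1) for p in points] + [1.0]
    spacings = [b - a for a, b in zip(scaled, scaled[1:])]
    expected = 1.0 / (n + 1)
    return 0.5 * sum(abs(s - expected) for s in spacings)


evenly_spaced = sherman_statistic([1, 2, 3], length=3)    # spacings all equal
clustered = sherman_statistic([1, 2, 3], length=99)       # sites bunched at one end
```

Perfectly even spacing gives a value of zero and tight clustering gives a larger value, which is consistent with the abstract's remark that evenly spaced constant sites lower the statistic.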

9.
MOTIVATION: Evolutionary models of amino acid sequences can be adapted to incorporate structure information; protein structure biologists can use phylogenetic relationships among species to improve prediction accuracy. RESULTS: A computer program called PASSML ('Phylogeny and Secondary Structure using Maximum Likelihood') has been developed to implement an evolutionary model that combines protein secondary structure and amino acid replacement. The model is related to that of Dayhoff and co-workers, but we distinguish eight categories of structural environment: alpha helix, beta sheet, turn and coil, each further classified according to solvent accessibility, i.e. buried or exposed. The model of sequence evolution for each of the eight categories is a Markov process with discrete states in continuous time, and the organization of structure along protein sequences is described by a hidden Markov model. This paper describes the PASSML software and illustrates how it allows both the reconstruction of phylogenies and prediction of secondary structure from aligned amino acid sequences. AVAILABILITY: PASSML 'ANSI C' source code and the example data sets described here are available at http://ng-dec1.gen.cam.ac.uk/hmm/Passml.html and 'downstream' Web pages. CONTACT: P.Lio@gen.cam.ac.uk

10.
11.
PRINTS is a compendium of protein motif fingerprints derived from the OWL composite sequence database. Fingerprints are groups of motifs within sequence alignments whose conserved nature allows them to be used as signatures of family membership. Fingerprints inherently offer improved diagnostic reliability over single motif methods by virtue of the mutual context provided by motif neighbors. To date, 650 fingerprints have been constructed and stored in PRINTS, the size of which has doubled in the last 2 years. The current version, 14.0, encodes 3500 motifs, covering a range of globular and membrane proteins, modular polypeptides, and so on. The database is now accessible via the UCL Bioinformatics Server on http://www.biochem.ucl.ac.uk/bsm/dbbrowser/. We describe here progress with the database, its compilation and interrogation software, and its Web interface.

12.
The sequences of related proteins can diverge beyond the point where their relationship can be recognised by pairwise sequence comparisons. In attempts to overcome this limitation, methods have been developed that use as a query, not a single sequence, but sets of related sequences or a representation of the characteristics shared by related sequences. Here we describe an assessment of three of these methods: the SAM-T98 implementation of a hidden Markov model procedure; PSI-BLAST; and the intermediate sequence search (ISS) procedure. We determined the extent to which these procedures can detect evolutionary relationships between the members of the sequence database PDBD40-J. This database, derived from the structural classification of proteins (SCOP), contains the sequences of proteins of known structure whose sequence identities with each other are 40% or less. The evolutionary relationships that exist between those that have low sequence identities were found by the examination of their structural details and, in many cases, their functional features. For nine false positive predictions out of a possible 432,680, i.e. at a false positive rate of about 1/50,000, SAM-T98 found 35% of the true homologous relationships in PDBD40-J, whilst PSI-BLAST found 30% and ISS found 25%. Overall, this is about twice the number of PDBD40-J relations that can be detected by the pairwise comparison procedures FASTA (17%) and GAP-BLAST (15%). For distantly related sequences in PDBD40-J, those pairs whose sequence identity is less than 30%, SAM-T98 and PSI-BLAST detect three times the number of relationships found by the pairwise methods.
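
The quoted false positive rate and the "about twice" comparison can be checked with a few lines of arithmetic. The numbers come from the abstract; the variable names are illustrative, and averaging the percentages within each group is only one assumed reading of "overall".

```python
false_positives = 9
possible_false_positives = 432_680

# Nine errors in 432,680 possible ones is roughly one in 48,000,
# which the abstract rounds to "about 1/50,000".
one_in = possible_false_positives / false_positives

# Coverage (percent of true relationships found) at that error rate.
profile_methods = {"SAM-T98": 35, "PSI-BLAST": 30, "ISS": 25}
pairwise_methods = {"FASTA": 17, "GAP-BLAST": 15}

mean_profile = sum(profile_methods.values()) / len(profile_methods)
mean_pairwise = sum(pairwise_methods.values()) / len(pairwise_methods)
ratio = mean_profile / mean_pairwise  # a little under 2
```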

13.
RESULTS: This paper describes a new program which reveals, analyses and graphically represents patterns of variability along nucleotide sequences. AVAILABILITY: The program, 'SWAN', is available from the WWW at http://evolve.zoo.ox.ac.uk/ or from the authors upon request. CONTACT: Vitali.Proutski@zoology.oxford.ac.uk

14.
The Protein Information Resource (PIR) has been maintaining a database of curated protein sequence alignments since 1991. The collection includes superfamily, family and homology domain alignments. CLUSTAL V/W is used to generate multiple sequence alignments and ALNED, an interactive alignment editor, is used to check and correct them. The database has helped in classifying sequences, in defining new homology domains, and in spreading and standardizing protein names, features and keywords among members of a family or superfamily. The ATLAS information retrieval system can be used to browse and query the PIR-ALN alignments. The quarterly and weekly updates can be accessed via the WWW at http://www-nbrf.georgetown.edu/pir/

15.
Previously proposed methods for protein secondary structure prediction from multiple sequence alignments do not efficiently extract the evolutionary information that these alignments contain. The predictions of these methods are less accurate than they could be, because of their failure to consider explicitly the phylogenetic tree that relates aligned protein sequences. As an alternative, we present a hidden Markov model approach to secondary structure prediction that more fully uses the evolutionary information contained in protein sequence alignments. A representative example is presented, and three experiments are performed that illustrate how the appropriate representation of evolutionary relatedness can improve inferences. We explain why similar improvement can be expected in other secondary structure prediction methods and indeed any comparative sequence analysis method.

16.
The multiple sequence alignment problem is applicable and important in various fields in molecular biology such as the prediction of three-dimensional structures of proteins and the inference of phylogenetic trees. However, the optimal alignment based on the scoring criterion is not always biologically the most significant alignment. We here propose two flexible and efficient approaches to solve this problem. One approach is to provide many suboptimal alignments as alternatives for the optimal one. It has been considered almost impossible to investigate such suboptimal alignments of more than two sequences because of the enormous size of the problem. We propose techniques for enumeration of suboptimal alignments using the Eppstein algorithm. We also discuss what kind of suboptimal alignment is unnecessary to enumerate and propose an efficient enumeration algorithm to enumerate only necessary alignments. The other approach is parametric analysis. The obtained optimal solution with fixed parameters such as gap penalties is not always the biologically best alignment. Thus, it is required to vary parameters and check how the optimal alignments change. The way to vary parameters has been studied well on the problem of two sequences, but not on the multiple alignment problem because of the difficulty of computing the optimal solution. We propose techniques for this parametric multiple alignment problem and examine the features of alignments obtained by various parametric analyses. For both approaches, this paper performs experiments on various groups of actual protein sequences and examines the efficiency of these algorithms and properties of sequence groups.
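
The parametric idea described above (vary a parameter such as the gap penalty and watch how the optimum changes) can be illustrated on the two-sequence case, which the abstract says is already well studied. The sketch below is a plain Needleman-Wunsch global alignment with a linear gap penalty, rerun for several penalties. It is not the paper's multiple-alignment technique; the scoring values, the function names and the penalty grid are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner, preferring diagonal moves.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return (score[len(a)][len(b)],
            "".join(reversed(out_a)),
            "".join(reversed(out_b)))


def sweep_gap_penalty(a, b, gaps=(-1, -2, -4)):
    """Recompute the optimal alignment for each gap penalty in `gaps`."""
    return {g: global_align(a, b, gap=g) for g in gaps}


results = sweep_gap_penalty("ACGT", "AGT")
```

Comparing the entries of `results` shows how the optimal score, and in general the alignment itself, responds to the penalty. A full parametric analysis would partition the parameter space into regions that share one optimum rather than sampling a few values.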

17.
PRINTS is a database of protein family 'fingerprints' offering a diagnostic resource for newly-determined sequences. By contrast with PROSITE, which uses single consensus expressions to characterise particular families, PRINTS exploits groups of motifs to build characteristic signatures. These signatures offer improved diagnostic reliability by virtue of the mutual context provided by motif neighbours. To date, 800 fingerprints have been constructed and stored in PRINTS. The current version, 17.0, encodes approximately 4500 motifs, covering a range of globular and membrane proteins, modular polypeptides, and so on. The database is accessible via the UCL Bioinformatics World Wide Web (WWW) Server at http://www.biochem.ucl.ac.uk/bsm/dbbrowser/. We have recently enhanced the usefulness of PRINTS by making available new, intuitive search software. This allows both individual query sequence and bulk data submission, permitting easy analysis of single sequences or complete genomes. Preliminary results indicate that use of the PRINTS system is able to assign additional functions not found by other methods, and hence offers a useful adjunct to current genome analysis protocols.
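
The contrast between a single consensus expression and a group of motifs can be sketched as follows: a sequence counts as a hit only if every motif is found, in order and without overlap. This is a deliberate simplification. Representing motifs as regular expressions, the greedy left-to-right scan (exact only for fixed-length patterns), the function name and the toy sequence are all assumptions; the abstract does not describe how PRINTS scores motifs.

```python
import re


def match_fingerprint(sequence, motifs):
    """Return the start index of each motif if all occur in order, else None.

    `motifs` is an ordered list of regular expressions (illustrative stand-ins
    for fingerprint motifs). Each search starts where the previous hit ended.
    """
    positions = []
    start = 0
    for pattern in motifs:
        hit = re.compile(pattern).search(sequence, start)
        if hit is None:
            return None  # one missing motif breaks the whole fingerprint
        positions.append(hit.start())
        start = hit.end()
    return positions


toy_fingerprint = ["G[AS]V", "TPP", "C.?H"]
hit = match_fingerprint("MKGAVLLTPPWCH", toy_fingerprint)
miss = match_fingerprint("TPPGAV", ["G[AS]V", "TPP"])
```

Requiring the whole ordered set is what supplies the "mutual context": a chance match to one short motif is not enough to report family membership.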

18.
Currently the protein mutant database (PMD) contains over 81 000 mutants, including artificial as well as natural mutants of various proteins extracted from about 10 000 articles. We recently developed a powerful viewing and retrieving system (http://pmd.ddbj.nig.ac.jp), which is integrated with the sequence and tertiary structure databases. The system has the following features: (i) mutated sequences are displayed after being automatically generated from the information described in the entry together with the sequence data of wild-type proteins integrated. This is a convenient feature because it allows one to see the position of altered amino acids (shown in a different color) in the entire sequence of a wild-type protein; (ii) for those proteins whose 3D structures have been experimentally determined, a 3D structure is displayed to show mutation sites in a different color; (iii) a sequence homology search against PMD can be carried out with any query sequence; (iv) a summary of mutations of homologous sequences can be displayed, which shows all the mutations at a certain site of a protein, recorded throughout the PMD.
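
Feature (i) above, generating a mutated sequence from an entry's description plus the wild-type sequence, might look roughly like the sketch below. The `A2G`-style substitution notation, the bracket highlighting (standing in for the colour used on the site) and the function names are assumptions; the abstract does not give PMD's entry format.

```python
import re

# Hypothetical notation: wild-type residue, 1-based position, new residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutation(wild_type, mutation):
    """Apply a point substitution such as 'T3S' to a wild-type sequence.

    Returns (mutant_sequence, zero_based_index_of_change).
    """
    parsed = MUTATION.match(mutation)
    if parsed is None:
        raise ValueError(f"unrecognised mutation: {mutation!r}")
    old, position, new = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    if not 1 <= position <= len(wild_type):
        raise ValueError("position outside the sequence")
    if wild_type[position - 1] != old:
        raise ValueError("wild-type residue does not match the sequence")
    index = position - 1
    return wild_type[:index] + new + wild_type[index + 1:], index


def highlight(sequence, index):
    """Mark the altered residue with brackets in place of a colour."""
    return sequence[:index] + "[" + sequence[index] + "]" + sequence[index + 1:]


mutant, changed_at = apply_mutation("MKTAY", "T3S")
display = highlight(mutant, changed_at)
```

Checking the stated wild-type residue against the sequence before substituting catches off-by-one numbering errors, which are easy to make when positions are transcribed from articles.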

19.
20.
The recently described equivalence between the alignment of two proteins and a conformation of a lattice chain on a two-dimensional square lattice is extended to multiple alignments. The search for the optimal multiple alignment between several proteins, which is equivalent to finding the energy minimum in the conformational space of a multi-dimensional lattice chain, is studied by the Monte Carlo approach. This method, while not deterministic, and for two-dimensional problems slower than dynamic programming, can accept arbitrary scoring functions, including non-local ones, and its speed decreases slowly with increasing number of dimensions. For the local scoring functions, the MC algorithm can also reproduce known exact solutions for the direct multiple alignments. As illustrated by examples, both for structure- and sequence-based alignments, direct multi-dimensional alignments are able to capture weak similarities between divergent families much better than ones built from pairwise alignments by a hierarchical approach.

