Similar articles
 Found 20 similar articles (search time: 93 ms)
1.
A new algorithm is reported which builds an alignment between two protein structures. The algorithm involves a combinatorial extension (CE) of an alignment path defined by aligned fragment pairs (AFPs) rather than the more conventional techniques using dynamic programming and Monte Carlo optimization. AFPs, as the name suggests, are pairs of fragments, one from each protein, which confer structure similarity. AFPs are based on local geometry, rather than global features such as orientation of secondary structures and overall topology. Combinations of AFPs that represent possible continuous alignment paths are selectively extended or discarded thereby leading to a single optimal alignment. The algorithm is fast and accurate in finding an optimal structure alignment and hence suitable for database scanning and detailed analysis of large protein families. The method has been tested and compared with results from Dali and VAST using a representative sample of similar structures. Several new structural similarities not detected by these other methods are reported. Specific one-on-one alignments and searches against all structures as found in the Protein Data Bank (PDB) can be performed via the Web at http://cl.sdsc.edu/ce.html.

2.
We present a fully automatic structural classification of supersecondary structure units, consisting of two hydrogen-bonded beta strands, preceded or followed by an alpha helix. The classification is performed on the spatial arrangement of the secondary structure elements, irrespective of the length and conformation of the intervening loops. The similarity of the arrangements is estimated by a structure alignment procedure that uses as similarity measure the root mean square deviation of superimposed backbone atoms. Applied to a set of 141 well-resolved nonhomologous protein structures, the classification yields 11 families of recurrent arrangements. In addition, fragments that are structurally intermediate between the families are found; they reveal the continuity of the classification. The analysis of the families shows that the alpha helix and beta hairpin axes can adopt virtually all relative orientations, with, however, some preferable orientations; moreover, according to the orientation, preferences in the left/right handedness of the alpha-beta connection are observed. These preferences can be explained by favorable side by side packing of the alpha helix and the beta hairpin, local interactions in the region of the alpha-beta connection or stabilizing environments in the parent protein. Furthermore, fold recognition procedures and structure prediction algorithms coupled to database-derived potentials suggest that the preferable nature of these arrangements does not imply their intrinsic stability. They usually accommodate a large number of sequences, of which only a subset is predicted to stabilize the motif. The motifs predicted as stable could correspond to nuclei formed at the very beginning of the folding process.
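The similarity measure named in item 2, the root mean square deviation of superimposed backbone atoms, can be sketched as follows. This is a minimal illustration, assuming the two coordinate sets have already been superimposed and list matching atoms in the same order; the function name `rmsd` and the input layout are assumptions, not taken from the abstract, and the superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points.

    Assumes the points are already superimposed and paired by index
    (hypothetical input layout, chosen for illustration).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

A smaller value means the two arrangements are more alike; identical coordinate sets give 0.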

3.
CATH--a hierarchic classification of protein domain structures   Cited by: 1 (self-citations: 0, citations by others: 1)
BACKGROUND: Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures and in optimal cases function can also be assigned. The ever increasing number of known protein structures is too large to classify all proteins manually; therefore, automatic methods are needed for fast evaluation of protein structures. RESULTS: We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and put into the same homologous superfamily. CONCLUSIONS: Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure-function/evolution relationships to both known and newly determined protein structures.
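The four-level hierarchy in item 3 (C, A, T, H) maps naturally onto a small record type. The sketch below assumes a dotted four-number code such as `3.40.50.720`, a notation the abstract itself does not spell out; the names `CathCode`, `parse_cath_code` and `shared_levels` are hypothetical helpers for illustration only.

```python
from typing import NamedTuple


class CathCode(NamedTuple):
    """One number per level, in the order the abstract lists them."""
    cath_class: int              # C: secondary structure composition
    architecture: int            # A: overall shape
    topology: int                # T: sequential connectivity
    homologous_superfamily: int  # H: evolutionarily related group


def parse_cath_code(code: str) -> CathCode:
    """Parse an assumed dotted 'C.A.T.H' string into a CathCode."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def shared_levels(a: CathCode, b: CathCode) -> int:
    """Count how many leading levels (C, then A, then T, then H) agree."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count
```

Under this sketch, two domains with `shared_levels` of 3 share class, architecture and topology but sit in different homologous superfamilies.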

4.
Knowledge-based homology modelling together with site-directed mutagenesis, epitope and conformational mapping is an approach to predict the structures of proteins and for the rational design of new drugs. In this study we present how this procedure has been applied to model the structure of herpes simplex virus type 1 thymidine kinase (HSV1 TK, HSV1 ATP-thymidine-5'-phosphotransferase, EC 2.7.1.21). We have used, and evaluated, several secondary structure prediction methods, such as the classical one based on the Chou and Fasman algorithm, neural networks using the Kabsch and Sander classification, and the PRISM method. We have validated the algorithms by applying them to the porcine adenylate kinase (ADK), whose three-dimensional structure is known and that has been used for the alignment of the TKs as well. The resulting first model of HSV1-TK consisted of the first beta-strand connected to the phosphate binding loop and its subsequent alpha-helix, the fourth beta-strand connected to the conserved FDRH sequence and two alpha-helices with basic amino acids. The 3D structure was built using the X-ray structure of ADK as template and following the general procedure for homology modelling. We extended the model by means of COMPOSER, an automatic process for protein modelling. Site-directed mutagenesis was used to experimentally verify the predicted active-site model of HSV1-TK. The data measured in our lab and by others support the suggestion that the FDRH motif is part of the active site and plays an important role in the phosphorylation of substrates. The structure of HSV1 TK, recently solved in collaboration with Prof. G. Schulz at 2.7 Å resolution, includes 284 of 343 residues of the N-terminal truncated TK. The secondary structures could be clearly assigned and fitted to the density. The comparison between the crystallographically determined structure and the model shows that nearly 70% of the HSV1 TK structure has been correctly modelled by the described integrated approach to knowledge-based ligand-protein complex structure prediction. This indicates that computer-assisted methods, combined with "manual" correction both for alignment and 3D construction, are useful and can be successful.

5.
6.
The four recognized levels of organization of protein structure (primary through quaternary) are extended to add the designation quinary structure for the interactions within helical arrays, such as found for sickle cell hemoglobin fibers or tubulin units in microtubules. For sickle cell hemoglobin the main quinary structure is a 14-filament fiber, with a number of other minor forms also encountered. Degenerate forms of the 14-filament fibers can be characterized that lack specific pairs of filaments; evidence is presented which suggests an overall organization of the 14 filaments in pairs, with particular pairs aligned in an antiparallel orientation. For tubulin, a range of quinary structures can be detected depending on the number of protofilaments and whether adjacent protofilaments composed of alternating alpha- and beta-subunits are aligned with contacts between like or unlike subunits and with parallel or antiparallel polarity. Thus, in contrast to quaternary structure, which generally involves a fixed number of subunits, the quinary structures of proteins can exhibit marked plasticity and inequivalence in the juxtaposition of constituent molecules.

7.
We report the latest release (version 1.4) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 13 359 protein domain structures into evolutionary families and structural groupings. We currently identify 827 homologous families in which the proteins have both structural similarity and sequence and/or functional similarity. These can be further clustered into 593 fold groups and 32 distinct architectures. Using our structural classification and associated data on protein functions, stored in the database (EC identifiers, SWISS-PROT keywords and information from the Enzyme database and literature) we have been able to analyse the correlation between the 3D structure and function. More than 96% of folds in the PDB are associated with a single homologous family. However, within the superfolds, three or more different functions are observed. Considering enzyme functions, more than 95% of clearly homologous families exhibit either single or closely related functions, as demonstrated by the EC identifiers of their relatives. Our analysis supports the view that determining structures, for example as part of a 'structural genomics' initiative, will make a major contribution to interpreting genome data.

8.
A family of structurally related intrinsic membrane proteins (facilitative glucose transporters) catalyzes the movement of glucose across the plasma membrane of animal cells. Evidence indicates that these proteins show a common structural motif where approximately 50% of the mass is embedded in the lipid bilayer (transmembrane domain) in 12 alpha-helices (transmembrane helices; TMHs) and accommodates a water-filled channel for substrate passage (glucose channel) whose tertiary structure is currently unknown. Using recent advances in protein structure prediction algorithms, we propose here two three-dimensional structural models for the transmembrane glucose channel of the GLUT1 glucose transporter. Our models emphasize the physical dimension and water accessibility of the channel, loop lengths between TMHs, the macrodipole orientation in the four-helix bundle motif, and helix packing energy. Our models predict that five TMHs, either TMHs 3, 4, 7, 8, 11 (Model 1) or TMHs 2, 5, 11, 8, 7 (Model 2), line the channel, and the remaining TMHs surround these channel-lining TMHs. We discuss how our models are compatible with the experimental data obtained with this protein, and how they can be used in designing new biochemical and molecular biological experiments to elucidate the structural basis of this important protein function.

9.
MOTIVATION: The automatic alignment of rRNA sequences can reproduce manual expert alignments with high, but not perfect, fidelity. We examine the use of empirical methods for the identification of regions of an alignment of a new sequence with an existing large alignment which can confidently be predicted to be correctly aligned. RESULTS: We show how to use a simple jack-knife procedure to derive an estimate of the reliability that is to be expected at each position of a large alignment of eukaryotic rRNA sequences. These reliabilities are then improved using measures that are specific to the input sequence. Regions where the sequence-specific reliability method performs particularly well are identified and seen to correspond with elements in the structure of the rRNA molecules that vary between species in the alignment. We also compare these reliability measures to an algorithmic alignment stability measure. AVAILABILITY: The software is available free of charge by sending an e-mail message to emmet@chah.ucc.ie. CONTACT: emmet@chah.ucc.ie

10.
We have developed a light scattering technique that can be used to analyze the orientation and diameter of collagen fibers in histologic sections of connective tissue. Scattering patterns obtained by transmitting laser light through sections of tissue contain information on the orientation, degree of alignment, and size of the constituent collagen fibers. Analysis of the azimuthal intensity distribution of scattered light yields numerical values of the degree of alignment by use of an orientation index, S, which is chosen to vary between 0 for randomly oriented fibers and 1 for a perfectly aligned arrangement. The average diameter of the collagen fibers is calculated from the scattering angle at which the intensity reaches its first minimum. These measurements are independent of the nature of the histologic stain. The procedure is illustrated by measurements obtained with sections of the guinea pig dermis and of control scar. We conclude from our experiments that light scattering can complement the analysis of tissue architecture typically performed with the light microscope.
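Item 10 does not give the formula for its orientation index S, only that it runs from 0 (random) to 1 (perfectly aligned). The sketch below uses a common two-dimensional order parameter with the same limits, the magnitude of the intensity-weighted mean of (cos 2θ, sin 2θ); this is an assumption chosen for illustration and may differ from the definition used in the paper. The function name `orientation_index` and its arguments are hypothetical.

```python
import math


def orientation_index(angles, intensities=None):
    """Order parameter in [0, 1] for axial data (theta and theta + pi equivalent).

    angles: azimuthal angles in radians.
    intensities: optional non-negative weights, one per angle.
    NOTE: an assumed stand-in for the paper's S, not its published formula.
    """
    if intensities is None:
        intensities = [1.0] * len(angles)
    if not angles or len(angles) != len(intensities):
        raise ValueError("need matching, non-empty angle and intensity lists")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    # Doubling the angle makes opposite directions along a fiber count as equal.
    c = sum(w * math.cos(2 * a) for a, w in zip(angles, intensities)) / total
    s = sum(w * math.sin(2 * a) for a, w in zip(angles, intensities)) / total
    return math.hypot(c, s)
```

With this definition, a sample whose intensity is concentrated at one azimuth scores close to 1, and one spread evenly over all azimuths scores close to 0.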

11.
A structure-based scoring matrix MDPRE was derived from amino acid spatial preferences in protein structures. Sequence alignment and evolutionary studies using the MDPRE matrix gave results similar to those from ordinary sequence and structure alignments. It is interesting that a matrix derived solely from structure data could give comparable alignment results, strongly indicating the intimate connection between protein sequences and structures. The branch order and length from this approach were close to those obtained by a structure comparison method. Thus, by applying this structure-based matrix, the trees obtained should reflect evolutionary characteristics of protein structure. This approach has advantages over a direct structure comparison in that (1) only a sequence and the MDPRE matrix are needed, making it simple and widely applicable (especially in the absence of 3-dimensional protein structure data); (2) an established algorithm for sequence alignment and tree building could be employed, providing opportunities for direct comparison between matrices from different methodologies. One of the most striking features of this method is its capability to detect protein structure homologies when the sequence identities are low. This was well reflected in the given examples of the alignment of dinucleotide-binding domains.
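Item 11 notes that a structure-derived matrix can be dropped into an established sequence alignment algorithm. The sketch below shows that plug-in point with a Needleman-Wunsch global alignment score and a linear gap penalty. The abstract gives neither the MDPRE values nor the algorithm used, so the scoring function `toy_score`, the gap penalty and the function names are all placeholders.

```python
def global_alignment_score(seq_a, seq_b, score, gap=-2):
    """Needleman-Wunsch optimal global alignment score (linear gap penalty).

    score: callable (residue_a, residue_b) -> number; a real substitution
    matrix lookup would go here. gap: penalty added per gap position.
    """
    # prev[j] holds the best score for seq_a[:i-1] against seq_b[:j].
    prev = [j * gap for j in range(len(seq_b) + 1)]
    for i, a in enumerate(seq_a, start=1):
        curr = [i * gap]
        for j, b in enumerate(seq_b, start=1):
            curr.append(max(
                prev[j - 1] + score(a, b),  # align a with b
                prev[j] + gap,              # gap in seq_b
                curr[j - 1] + gap,          # gap in seq_a
            ))
        prev = curr
    return prev[-1]


def toy_score(a, b):
    """Hypothetical identity scoring, standing in for a matrix such as MDPRE."""
    return 1 if a == b else -1
```

Swapping `toy_score` for a lookup into a different matrix changes the scores without touching the alignment routine, which is the kind of direct matrix-to-matrix comparison the abstract describes.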

12.
Some computer applications for tissue characterization in medicine and biology, such as analysis of the myocardium or cancer recognition, operate with tissue samples taken from very small areas of interest. In order to perform texture characterization in such an application, only a few texture operators can be employed: the operators should be insensitive to noise and image distortion and yet be reliable in order to estimate texture quality from the small number of image points available. In order to describe the quality of infarcted myocardial tissue, we propose a new wavelet-based approach for analysis and classification of texture samples with small dimensions. The main idea of this method is to decompose the given image with a filter bank derived from an orthonormal wavelet basis and to form an image approximation with higher resolution. Texture energy measures calculated at each output of the filter bank as well as energies of synthesized images are used as texture features in a classification procedure. We propose an unsupervised classification technique based on a modified statistical t-test. The method is tested with clinical data, and the classification results obtained are very promising. The performance of the new method is compared with the performance of several other transform-based methods. The new algorithm has advantages in classification of small and noisy input samples, and it represents a step toward structural analysis of weak textures.

13.
14.
If structural knowledge of a receptor under consideration is lacking, drug design approaches focus on similarity or dissimilarity analysis of putative ligands. In this context the mutual ligand superposition is of utmost importance. Methods that are rapid enough to facilitate interactive usage, that allow sets of conformers to be processed and that enable database screening are of special interest here. The ability to superpose molecular fragments instead of entire molecules has proven to be helpful too. The RIGFIT approach meets these requirements and has several additional advantages. In three distinct test applications, we evaluated how closely we can approximate the observed relative orientation for a set of known crystal structures, we employed RIGFIT as a fragment placement procedure, and we performed a fragment-based database screening. The run time of RIGFIT can be traded off against its accuracy. To be competitive in accuracy with another state-of-the-art alignment tool, with which we compare our method explicitly, computing times of about 6 s per superposition on a common workstation are required. If longer run times can be afforded, the accuracy increases significantly. RIGFIT is part of the flexible superposition software FLEXS which can be accessed on the WWW [http://cartan.gmd.de/FlexS].

15.
Knowledge-based protein secondary structure assignment   Cited by: 1 (self-citations: 0, citations by others: 1)
We have developed an automatic algorithm STRIDE for protein secondary structure assignment from atomic coordinates based on the combined use of hydrogen bond energy and statistically derived backbone torsional angle information. Parameters of the pattern recognition procedure were optimized using designations provided by the crystallographers as a standard-of-truth. Comparison to the currently most widely used technique DSSP by Kabsch and Sander (Biopolymers 22:2577-2637, 1983) shows that STRIDE and DSSP assign secondary structural states in 58 and 31% of 226 protein chains in our data sample, respectively, in greater agreement with the specific residue-by-residue definitions provided by the discoverers of the structures while in 11% of the chains, the assignments are the same. STRIDE delineates every 11th helix and every 32nd strand more in accord with published assignments.

16.
The multiple sequence alignment problem is applicable and important in various fields in molecular biology such as the prediction of three-dimensional structures of proteins and the inference of phylogenetic trees. However, the optimal alignment based on the scoring criterion is not always biologically the most significant alignment. We here propose two flexible and efficient approaches to solve this problem. One approach is to provide many suboptimal alignments as alternatives for the optimal one. It has been considered almost impossible to investigate such suboptimal alignments of more than two sequences because of the enormous size of the problem. We propose techniques for enumeration of suboptimal alignments using the Eppstein algorithm. We also discuss what kind of suboptimal alignment is unnecessary to enumerate and propose an efficient enumeration algorithm to enumerate only necessary alignments. The other approach is parametric analysis. The obtained optimal solution with fixed parameters such as gap penalties is not always the biologically best alignment. Thus, it is required to vary parameters and check how the optimal alignments change. The way to vary parameters has been studied well on the problem of two sequences, but not on the multiple alignment problem because of the difficulty of computing the optimal solution. We propose techniques for this parametric multiple alignment problem and examine the features of alignments obtained by various parametric analyses. For both approaches, this paper performs experiments on various groups of actual protein sequences and examines the efficiency of these algorithms and properties of sequence groups.

17.
We describe a novel application of a fragment-based ligand docking technique; similar methods are commonly applied to the de novo design of ligands for target protein binding sites. We have used several new flexible docking and superposition tools, as well as a more conventional rigid-body (fragment) docking method, to examine NAD binding to the catalytic subunits of diphtheria (DT) and pertussis (PT) toxins, and to propose a model of the NAD-PT complex. Docking simulations with the rigid NAD fragments adenine and nicotinamide revealed that the low-energy dockings clustered in three distinct sites on the two proteins. Two of the sites were common to both fragments and were related to the structure of NAD bound to DT in an obvious way; however, the adenine subsite of PT was shifted relative to that of DT. We chose adenine/nicotinamide pairs of PT dockings from these clusters and flexibly superimposed NAD onto these pairs. A Monte Carlo-based flexible docking procedure and energy minimization were used to refine the modeled NAD-PT complexes. The modeled complex accounts for the sequence and structural similarities between PT and DT and is consistent with many results that suggest the catalytic importance of certain residues. A possible functional role for the structural difference between the two complexes is discussed.

18.
A simple method is presented for projecting the conformation of extended secondary structure elements of peptides and proteins that extend over four C alpha atoms onto a simple two-dimensional surface. A new set of two degrees of freedom is defined, a pseudodihedral involving four sequential C alpha atoms, as well as the triple scalar product for the vectors describing the orientation of the three intervening peptide groups. The method provides a reduction in dimensionality, from the usual combination of multiple phi,psi pairs to a single pair, yielding valuable information concerning the structure and dynamics of these important elements. The new two-dimensional surface is explored by reference to 63 selected protein crystal structures together with a comparison of model built peptides representing the common secondary structural elements. Dynamical aspects on this new surface are examined using a molecular dynamics trajectory of Basic Pancreatic Trypsin Inhibitor.
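The two coordinates named in item 18 are both standard vector quantities, sketched below. The dihedral uses the usual atan2 formulation on four points, and the triple scalar product is a·(b×c). The abstract does not say which vector is taken to describe each peptide group's orientation, so the second function simply accepts three vectors; the function names and input layout are assumptions.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def pseudodihedral_deg(p1, p2, p3, p4):
    """Signed dihedral angle in degrees defined by four sequential points."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def triple_scalar_product(a, b, c):
    """a . (b x c): signed volume spanned by three orientation vectors."""
    return _dot(a, _cross(b, c))
```

For real use, `p1` to `p4` would be consecutive C alpha coordinates read from a structure file, which is outside the scope of this sketch.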

19.
A quantitative procedure is described for the comparison of secondary structure of homologous proteins. Standard predictive methods are used to generate probability profiles from pairs of homologous amino acid sequences; correlation coefficients (R) are then computed between each pair of amino acids for alpha-helix (R alpha), extended structure (R beta), turn (R(t)), and coil (R(c)). R values are >0.2 for correctly aligned homologous sequences. Unrelated or incorrectly aligned sequences give R values near zero. Lack of correlation for a segment of otherwise well-correlated sequences is used to identify structural divergence, which is then evaluated graphically by using difference profiles. A combination of these techniques correctly predicts secondary structural differences between melittin or beta-endorphin and their respective synthetic analogs. The method is potentially useful to describe evolutionary changes in protein secondary structure as well as in the design of peptide analogs.
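The correlation step in item 19 can be sketched as a Pearson coefficient between two per-residue probability profiles of equal length, for example the predicted alpha-helix probability at each aligned position of two sequences. The abstract does not state which correlation formula is used, so Pearson's R is an assumption here; the 0.2 cutoff is the value the abstract quotes, while the names `pearson_r` and `R_THRESHOLD` are hypothetical.

```python
import math

# Cutoff quoted in the abstract for correctly aligned homologous sequences.
R_THRESHOLD = 0.2


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two profiles of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)
```

Computing `pearson_r` over a sliding window rather than the whole profile would be one way to look for the locally uncorrelated segments the abstract uses to flag structural divergence, though the windowing scheme is not described in the source.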

20.
The novel three-dimensional structure (3D)-amino acid sequence (1D) compatibility program, the 3D-1D method, has been developed and extended to protein-nucleic acid systems. The environment characteristics are determined using the coordinates of backbone atoms and the C beta atom. This simplified estimation allows the method to be applied to monitoring how the environment class changes with different sequence alignments. The method has improved the detection of different proteins with similar folds. If a protein has extensive contact with a nucleic acid molecule, the compatibility profile drops in the contact region when the ligand molecule is absent. This phenomenon is useful for monitoring the binding sites of bulky ligands such as nucleic acids and for modeling protein-nucleic acid complexes.

Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  ICP registration no. 京ICP备09084417号