Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Designing amino acid sequences to fold with good hydrophobic cores   Total citations: 3, self-citations: 0, citations by others: 3
We present two methods for designing amino acid sequences of proteins that will fold to have good hydrophobic cores. Given the coordinates of the desired target protein or polymer structure, the methods generate sequences of hydrophobic (H) and polar (P) monomers that are intended to fold to these structures. One method designs hydrophobic inside, polar outside; the other minimizes an energy function in a sequence evolution process. The sequences generated by these methods agree at the level of 60–80% of the sequence positions in 20 proteins in the Protein Data Bank. A major challenge in protein design is to create sequences that can fold uniquely, i.e. to a single conformation rather than to many. While an earlier lattice-based sequence evolution method was shown not to design unique folders, our method generates unique folders in lattice model tests. These methods may also be useful in designing other types of foldable polymer not based on amino acids.
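The "hydrophobic inside, polar outside" idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: measuring burial by distance to the centroid, the `hydrophobic_fraction` parameter, and the function name `design_hp_sequence` are all assumptions made for illustration.

```python
import math


def design_hp_sequence(coords, hydrophobic_fraction=0.5):
    """Assign H to the most buried positions and P to the rest.

    Assumption: "buried" is approximated by closeness to the centroid of the
    target coordinates; the abstract does not specify the burial measure.
    """
    n = len(coords)
    centroid = tuple(sum(c[axis] for c in coords) / n for axis in range(3))
    distance = [math.dist(c, centroid) for c in coords]
    n_hydrophobic = round(n * hydrophobic_fraction)
    # Positions sorted from most buried (closest to centroid) to most exposed.
    buried = sorted(range(n), key=lambda i: distance[i])[:n_hydrophobic]
    sequence = ["P"] * n
    for i in buried:
        sequence[i] = "H"
    return "".join(sequence)


# Toy target: two central positions and two distant ones.
toy_coords = [(0, 0, 0), (1, 0, 0), (5, 0, 0), (-5, 0, 0)]
print(design_hp_sequence(toy_coords))
```

A real design run would use a solvent-accessibility or neighbour-count measure on the target structure; the centroid distance is only a stand-in that keeps the example self-contained.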

2.
In search of the ideal protein sequence   Total citations: 1, self-citations: 0, citations by others: 1
The inverse of a folding problem is to find the ideal sequence that folds into a particular protein structure. This problem has been addressed using the topology fingerprint-based threading algorithm, capable of calculating a score (energy) of an arbitrary sequence-structure pair. At first, the search is conducted by unconstrained minimization of the energy in sequence space. It is shown that using energy as the only design criterion leads to spurious solutions with incorrect amino acid composition. The problem lies in the general features of the protein energy surface as a function of both structure and sequence. The proposed solution is to design the sequence by maximizing the difference between its energy in the desired structure and in other known protein structures. Depending on the size of the database of structures ‘to avoid’, sequences bearing significant similarity to the native sequence of the target protein are obtained using this procedure.

3.
Recent models suggest that the mechanism of protein folding is determined by the balance between the stability of secondary structural elements and the hydrophobicity of the sequence. Here we determine the role of these factors in the folding kinetics of Im9* by altering the secondary structure propensity or hydrophobicity of helices I, II or IV by the substitution of residues at solvent exposed sites. The folding kinetics of each variant were measured at pH 7.0 and 10 degrees C, under which conditions wild-type Im9* folds with two-state kinetics. We show that increasing the helicity of these sequences in regions known to be structured in the folding intermediate of Im7*, switches the folding of Im9* from a two- to three-state mechanism. By contrast, increasing the hydrophobicity of helices I or IV has no effect on the kinetic folding mechanism. Interestingly, however, increasing the hydrophobicity of solvent-exposed residues in helix II stabilizes the folding intermediate and the rate-limiting transition state, consistent with the view that this helix makes significant non-native interactions during folding. The results highlight the generic importance of intermediates in folding and show that such species can be populated by increasing helical propensity or by stabilizing inter-helix contacts through non-native interactions.

4.
Immunoglobulin (Ig)-like proteins have been shown to fold following formation of a nucleus comprising interactions between residues that are distant in the primary sequence. What role do the loops connecting these nucleus residues play? Here, the importance of loops connecting beta-strands in different sheets of the Ig fold is investigated, by insertion of five glycine residues into the B-C loop of an Ig domain from human titin, TI I27. The folding pathway of this elongated 'pseudo wild-type' TI I27 is probed using protein engineering and Phi-value analysis. The Phi-values calculated for mutants within the pseudo wild-type protein indicate that the folding nucleus in wild-type TI I27 is conserved, supporting the hypothesis that the inter-sheet loop is not critical to the formation of a long-range folding nucleus.

5.
An investigation of protein subunit and domain interfaces   Total citations: 2, self-citations: 0, citations by others: 2
Protein structures were collected from the Brookhaven Database of tertiary architectures that displayed oligomeric association (24 molecules) or whose polypeptide folding revealed domains (34 proteins). The subunit and domain interfaces for these proteins were respectively examined from the following aspects: percentage water-accessible surface area buried by the respective associations, surface compositions and physical characteristics of the residues involved in the subunit and domain contacts, secondary structural state of the interface amino acids, preferred polar and non-polar interactions, spatial distribution of polar and non-polar residues on the interface surface, same residue interactions in the oligomeric contacts, and overall cross-section and shape of the contact surfaces. A general, consistent picture emerged for both the domain and subunit interfaces.

6.
A total of 23 fungal cellulose-binding domain (CBD) sequences were aligned. Structural models of the cellulose-binding domain of an exoglucanase (CBHII) and of three endoglucanases (EGI, EGII and EGV) from Trichoderma reesei cellulases were homology modelled based on the NMR structure of the fungal cellobiohydrolase CBHI, from the same organism. The completed models and the known structure of the CBHI cellulose-binding domain were refined by molecular dynamics simulations in water. All four models were found to be very similar to the structure of the CBHI cellulose-binding domain and sequence comparison indicated that in general the three-dimensional structures of fungal cellulose-binding domains are very similar. In all the CBDs studied, two disulphide bridges apparently stabilize the polypeptide fold. From the models, an additional disulphide bridge was predicted in EGI and CBHII, and in eight further CBDs from other organisms. Three highly conserved aromatic residues on the hydrophilic side of the wedge make this surface flat. This surface is expected to make contact with the substrate. Three invariant amino acids, Gln7, Asn29 and Gln34, on this flat face are in suitable positions for hydrogen bonding with the cellulose surface. Analysis of the differences in the protein surface properties indicated that the endoglucanases tend to be more hydrophilic than the exoglucanases. The largest structural variation was found around positions 12-16. The fungal CBD sequences are discussed in relation to variations in function and pH dependence. Comparison of the modelled structures with experimental binding data for the CBHI and EGI allowed the formulation of a qualitative relationship to cellulose affinity. This relationship was used to predict the cellulose affinities for 21 CBDs.

7.
We have investigated the process of protein folding by Monte-Carlo simulation of folding occurring in a simple 3D lattice model of a protein globule. We have found the range of ‘optimal’ temperatures where the native fold is achieved by the Monte-Carlo process much faster than that by exhaustive sorting of all the chain folds. The ‘optimal’ temperatures are essentially the same for different random and ‘edited’ sequences (for the latter, the native fold energy is separated by a considerable gap from the energies of other low-energy folds; for random sequences, this gap is negligible). At the ‘optimal’ temperatures, the ‘edited’ chains attain their native fold faster than the random ones. However, the essence is that the native folds of ‘edited’ chains are thermodynamically stable at temperatures optimal for fast folding, while the native folds of random chains are unstable at the temperatures optimal for fast folding; also, at low temperatures where the native folds of random chains are stable, folding kinetics is very slow. Consequently, stable native folds are formed slowly by random sequences and rapidly by the ‘edited’ ones.
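The temperature dependence described in the abstract above is easiest to see in the acceptance step of a Monte-Carlo move. The sketch below assumes the standard Metropolis criterion, which the abstract does not name explicitly; the function name, the energy units (temperature expressed in units of energy, i.e. k_B = 1), and the toy numbers are illustrative assumptions.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Decide whether to accept a move that changes the energy by delta_e.

    Assumption: Metropolis rule with k_B = 1. Downhill moves are always
    accepted; uphill moves are accepted with probability exp(-delta_e / T).
    """
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


# Fraction of uphill moves (delta_e = 1) accepted at two temperatures.
rng = random.Random(0)
for temperature in (0.2, 1.0):
    accepted = sum(metropolis_accept(1.0, temperature, rng) for _ in range(10000))
    print(temperature, accepted / 10000)
```

At low temperature almost no uphill moves are accepted, so a chain can stay trapped in a low-energy misfold for a long time, which matches the slow kinetics the abstract reports for random sequences at temperatures where their native folds are stable.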

8.
A data bank merging related protein structures and sequences   Total citations: 1, self-citations: 0, citations by others: 1
A data collection which merges protein structural and sequence information is described. Structural superpositions amongst proteins with similar main-chain fold were performed or collected from the literature. Sequences taken from the protein primary structure databases were associated with the multiple structural alignments providing they were at least 50% homologous in residue identity to one of the structural sequences and at least 50% of the structural sequence residues were alignable. Such restrictions allow reasonable confidence that the primary sequences share the conformation of the tertiary structural templates, except in the less conserved loop regions. Multiple structural superpositions were collected for 38 familial groups containing a total of 209 tertiary structures; 45 structures had no superposable mates and were used individually. Other information is also provided as main-chain and side-chain conformational angles, secondary structural assignments and the like. Wedding the primary and tertiary structural data resulted in an 8-fold increase of data bank sequence entries over those associated with the known three-dimensional architectures alone.

9.
In the TNC family of Ca-binding proteins (calmodulin, parvalbumin, intestinal calcium binding protein and troponin C) ~70 well-conserved amino acid sequences and six crystal structures are known. We find a clear correlation between residue contacts in the structures and residue conservation in the sequences: residues with strong sidechain–sidechain contacts in the three-dimensional structure tend to be the more conserved in the sequence. This is one way to quantify the intuitive notion of the importance of sidechain interactions for maintaining protein three-dimensional structure in evolution and may usefully be taken into account in planning point mutations in protein engineering.

10.
The deletion of nine residues from the C-terminus of the bacterial chloramphenicol acetyltransferase (CAT) results in deposition of the mutant protein in cytoplasmic inclusion bodies and loss of chloramphenicol resistance in Escherichia coli. This folding defect is relieved by C-terminal fusion of the polypeptide with as few as two residues. Based on these observations, efficient positive selection for the cloning of DNA fragments has been demonstrated. The cloning vector encodes a C-terminally truncated CAT protein. Restriction sites in front of the stop codon allow the insertion of target DNA, resulting in the production of properly folded CAT fusion proteins and regained chloramphenicol resistance. The positive selection of recombinants is accomplished by growth of transformants on chloramphenicol-containing agar plates. The method appears particularly convenient for the cloning of DNA fragments amplified by the PCR because minimal information to restore CAT folding can be included in the primers. The cloning of random sequences shows that the folding defect can be relieved by fusion to a wide variety of peptides, providing great flexibility to the positive selection system. This vector may also contribute to the determination of the role of the C-terminus in CAT folding.

11.
Evolutionarily conserved hydrophobic residues at the core of protein structures are generally assumed to play a structural role in protein folding and stability. Recent studies have implicated that their importance to protein structures is uneven, with a few of them being crucial and the rest of them being secondary. In this work, we explored the possibility of employing this feature of native structures for discriminating non-native structures from native ones. First, we developed a network tool to quantitatively measure the structural contributions of individual amino acid residues. We systematically applied this method to diverse fold-type sets of native proteins. It was confirmed that this method could grasp the essential structural features of native proteins. Next, we applied it to a number of decoy sets of proteins. The results indicate that such an approach indeed identified non-native structures in most test cases. This finding should be of help for the investigation of the fundamental problem of protein structure prediction.

12.
Analysis of protein conformational characteristics related to thermostability   Total citations: 11, self-citations: 0, citations by others: 11
The thermal stability of proteins was studied, 195 single amino acid residue replacements reported elsewhere being analysed for several protein conformational characteristics: type of residue replacement; conservative versus nonconservative substitution; replacement being in a homologous stretch of amino acid residues; change in hydrogen bond, van der Waals and secondary structure propensities; solvent-accessible versus inaccessible replacement; type of secondary structure involved in the substitution; the physico-chemical characteristics to which the thermostability enhancement can be attributed; and the relationship of the replacement site to the folding intermediates of the protein, when known. From the above analyses, some general rules arise which suggest where amino acid substitutions can be made to enhance protein thermostability: substitutions are conservative according to the Dayhoff matrix; mainly occur on conserved stretches of residues; preferentially occur on solvent-accessible residues; maintain or enhance the secondary structure propensity upon substitution; contribute to neutralize the dipole moment of the caps of helices and strands; and tend to increase the number of potential hydrogen bonding or van der Waals contacts or improve hydrophobic packing.

13.
Restriction enzymes (REases) are commercial reagents commonly used in DNA manipulations and mapping. They are regarded as very attractive models for studying protein-DNA interactions and valuable targets for protein engineering. Their amino acid sequences usually show no similarities to other proteins, with rare exceptions of other REases that recognize identical or very similar sequences. Hence, they are extremely hard targets for structure prediction and modeling. NlaIV is a Type II REase, which recognizes the interrupted palindromic sequence GGNNCC (where N indicates any base) and cleaves it in the middle, leaving blunt ends. NlaIV shows no sequence similarity to other proteins and virtually nothing is known about its sequence-structure-function relationships. Using protein fold recognition, we identified a remote relationship between NlaIV and EcoRV, an extensively studied REase, which recognizes the GATATC sequence and whose crystal structure has been determined. Using the 'FRankenstein's monster' approach we constructed a comparative model of NlaIV based on the EcoRV template and used it to predict the catalytic and DNA-binding residues. The model was validated by site-directed mutagenesis and analysis of the activity of the mutants in vivo and in vitro as well as structural characterization of the wild-type enzyme and two mutants by circular dichroism spectroscopy. The structural model of the NlaIV-DNA complex suggests regions of the protein sequence that may interact with the 'non-specific' bases of the target and thus it provides insight into the evolution of sequence specificity in restriction enzymes and may help engineer REases with novel specificities. Before this analysis was carried out, neither the three-dimensional fold of NlaIV, its evolutionary relationships or its catalytic or DNA-binding residues were known. Hence our analysis may be regarded as a paradigm for studies aiming at reducing 'white spaces' on the evolutionary landscape of sequence-function relationships by combining bioinformatics with simple experimental assays.

14.
This paper describes peptide analogs and the design strategy that were used to facilitate the final construction of a de novo-designed protein (ALIN) whose stable tertiary fold has been determined recently by NMR spectroscopy. Previous studies have suggested that the main problem in the de novo design of proteins is the attainment of a protein with a defined fold. To effectively overcome this mainchain multiconformation problem, three related steps, with experimental evaluation of the design hypotheses for each step, were pursued in the design process. Firstly, 15-residue sequences with experimentally verified high helicities were selected for the helical regions. Secondly, hydrophobic and electrostatic interhelical interactions as well as an interhelical disulfide bridge were designed to favor an antiparallel configuration of the helix axis. Finally, a loop with sufficient flexibility was inserted to stabilize the helices in the desired orientation. To assess the design strategy, peptides corresponding to each design step were synthesized and their structures verified experimentally by far-UV CD. As anticipated, ALIN was the most helical, and the SS-bridged dimeric peptides were more helical than their monomeric counterparts. The van't Hoff enthalpy change for ALIN computed from the CD denaturation curve and assuming a two-state model was 50 kJ/mol, a value close to that observed for helical coiled-coils. Overall, this report shows that small, simple proteins can be built using the current knowledge of protein structures.

15.
Evaluation and improvements in the automatic alignment of protein sequences   Total citations: 1, self-citations: 0, citations by others: 1
The accuracy of protein sequence alignment obtained by applying a commonly used global sequence comparison algorithm is assessed. Alignments based on the superposition of the three-dimensional structures are used as a standard for testing the automatic, sequence-based methods. Alignments obtained from the global comparison of five pairs of homologous protein sequences studied gave 54% agreement overall for residues in secondary structures. The inclusion of information about the secondary structure of one of the proteins in order to limit the number of gaps inserted in regions of secondary structure, improved this figure to 68%. A similarity score of greater than six standard deviation units suggests that an alignment which is greater than 75% correct within secondary structural regions can be obtained automatically for the pair of sequences.

16.
Variable gap penalty for protein sequence-structure alignment   Total citations: 1, self-citations: 0, citations by others: 1
The penalty for inserting gaps into an alignment between two protein sequences is a major determinant of the alignment accuracy. Here, we present an algorithm for finding a globally optimal alignment by dynamic programming that can use a variable gap penalty (VGP) function of any form. We also describe a specific function that depends on the structural context of an insertion or deletion. It penalizes gaps that are introduced within regions of regular secondary structure, buried regions, straight segments and also between two spatially distant residues. The parameters of the penalty function were optimized on a set of 240 sequence pairs of known structure, spanning the sequence identity range of 20-40%. We then tested the algorithm on another set of 238 sequence pairs of known structures. The use of the VGP function increases the number of correctly aligned residues from 81.0 to 84.5% in comparison with the optimized affine gap penalty function; this difference is statistically significant according to Student's t-test. We estimate that the new algorithm allows us to produce comparative models with an additional approximately 7 million accurately modeled residues in the approximately 1.1 million proteins that are detectably related to a known structure.
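To make the idea of a gap penalty "of any form" concrete, here is a minimal sketch of global alignment by dynamic programming in which the gap cost is an arbitrary function of the gapped segment. It is not the published VGP algorithm or its penalty function: the function names, the scoring scheme, and the per-residue costs are hypothetical, and the recurrence simply tries every possible gap length ending at each cell.

```python
def global_alignment_score(a, b, match_score, gap_a, gap_b):
    """Best global alignment score with arbitrary gap penalty functions.

    gap_a(start, end): penalty for leaving a[start:end] unaligned.
    gap_b(start, end): penalty for leaving b[start:end] unaligned.
    Both callables are assumptions of this sketch; any form is allowed.
    """
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    score = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = neg_inf
            if i > 0 and j > 0:  # align a[i-1] with b[j-1]
                best = score[i - 1][j - 1] + match_score(a[i - 1], b[j - 1])
            for k in range(i):  # a[k:i] is left unaligned
                best = max(best, score[k][j] - gap_a(k, i))
            for k in range(j):  # b[k:j] is left unaligned
                best = max(best, score[i][k] - gap_b(k, j))
            score[i][j] = best
    return score[n][m]


def identity(x, y):
    return 1.0 if x == y else -1.0


def flat_gap(start, end):
    return float(end - start)  # 1 per skipped residue


# Hypothetical structural context: residue 1 of the first sequence is costly to skip.
per_residue_cost = [1.0, 5.0, 1.0, 1.0]


def context_gap(start, end):
    return sum(per_residue_cost[start:end])


print(global_alignment_score("ACGT", "AGT", identity, flat_gap, flat_gap))
print(global_alignment_score("ACGT", "AGT", identity, context_gap, flat_gap))
```

Trying every gap length makes the recurrence cubic in sequence length, which is the price of allowing an unrestricted penalty function; affine penalties admit the usual quadratic formulation.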

17.
A comparison has been made between the homology and hydrophobicity profiles of six interleukin amino acid sequences and that of the human interleukin 1β (IL-1β) for which a crystal structure exists. The resulting sequence alignment was used to build model structures for the sequences for three IL-1, two IL-1β and an interleukin receptor antagonist. Analysis of these structures demonstrates that the interleukin molecule has a strong electric dipole which is generated by the topological position of the amino acids in the sequence. Electrostatic surface calculations implicate a particular residue (Lys145) as being fundamental to interleukin activity and this supports site-directed mutation evidence that this residue is required for activity.

18.
The variable domain resurfacing and CDR-grafting approaches to antibody humanization were compared directly on the two murine monoclonal antibodies N901 (anti-CD56) and anti-B4 (anti-CD19). Resurfacing replaces the set of surface residues of a rodent variable region with a human set of surface residues. The method of CDR-grafting conceptually consists of transferring the CDRs from a rodent antibody onto the Fv framework of a human antibody. Computer-aided molecular modeling was used to design the initial CDR-grafted and resurfaced versions of these two antibodies. The initial versions of resurfaced N901 and resurfaced anti-B4 maintained the full binding affinity of the original murine parent antibodies and further refinements to these versions described herein generated five new resurfaced antibodies that contain fewer murine residues at surface positions, four of which also have the full parental binding affinity. A mutational study of three surface positions within 5 Å of the CDRs of resurfaced anti-B4 revealed a remarkable ability of the resurfaced antibodies to maintain binding affinity despite dramatic changes of charges near their antigen recognition surfaces, suggesting that the resurfacing approach can be used with a high degree of confidence to design humanized antibodies that maintain the full parental binding affinity. By comparison CDR-grafted anti-B4 antibodies with parental affinity were produced only after seventeen versions were attempted using two different strategies for selecting the human acceptor frameworks. For both the CDR-grafted anti-B4 and N901 antibodies, full restoration of antigen binding affinity was achieved when the most identical human acceptor frameworks were selected. The CDR-grafted anti-B4 antibodies that maintained high affinity binding for CD19 had more murine residues at surface positions than any of the three versions of the resurfaced anti-B4 antibody. This observation suggests that the resurfacing approach can be used to produce humanized antibodies with reduced antigenic potential relative to their corresponding CDR-grafted versions.

19.
The directed evolution of proteins has benefited greatly from site-specific methods of diversification such as saturation mutagenesis. These techniques target diversity to a number of chosen positions that are usually non-contiguous in the protein's primary structure. However, the number of targeted positions can be large, thus leading to impractically large library size, wherein almost all library variants are inactive and the likelihood of selecting desirable properties is extremely small. We describe a versatile combinatorial method for the partial diversification of large sets of residues. Our library oligonucleotides comprise randomized codons that are flanked by wild-type sequences. Adding these oligonucleotides to an assembly PCR of wild-type gene fragments incorporates the randomized cassettes, at their target sites, into the reassembled gene. Varying the oligonucleotides concentration resulted in library variants that carry a different average number of mutated positions that comprise a random subset of the entire set of diversified codons. This method, dubbed Incorporating Synthetic Oligos via Gene Reassembly (ISOR), was used to create libraries of a cytosine-C5 methyltransferase wherein 45 individual positions were randomized. One library, containing an average of 5.6 mutated residues per gene, was selected, and mutants with wild-type-like activities isolated. We also created libraries of serum paraoxonase PON1 harboring insertions and deletions (indels) in various areas surrounding the active site. Screening these libraries yielded a range of mutants with altered substrate specificities and indicated that certain regions of this enzyme have a surprisingly high tolerance to indels.

20.
An automated method, based on the principle of simulated annealing, is presented for determining the three-dimensional structures of proteins on the basis of short (<5 Å) interproton distance data derived from nuclear Overhauser enhancement (NOE) measurements. The method makes use of Newton's equations of motion to increase temporarily the temperature of the system in order to search for the global minimum region of a target function comprising purely geometric restraints. These consist of interproton distances supplemented by bond lengths, bond angles, planes and soft van der Waals repulsion terms. The latter replace the dihedral, van der Waals, electrostatic and hydrogen-bonding potentials of the empirical energy function used in molecular dynamics simulations. The method presented involves the implementation of a number of innovations over our previous restrained molecular dynamics approach [Clore, G.M., Brünger, A.T., Karplus, M. and Gronenborn, A.M. (1986) J. Mol. Biol., 191, 523–551]. These include the development of a new effective potential for the interproton distance restraints whose functional form is dependent on the magnitude of the difference between calculated and target values, and the design and implementation of a robust and fully automatic protocol. The method is tested on three systems: the model system crambin (46 residues) using X-ray structure derived interproton distance restraints, and potato carboxypeptidase inhibitor (CPI; 39 residues) and barley serine proteinase inhibitor 2 (BSPI-2; 64 residues) using experimentally derived interproton distance restraints. Calculations were carried out starting from the extended strands which had atomic r.m.s. differences of 57, 38 and 33 Å with respect to the crystal structures of BSPI-2, crambin and CPI respectively. Unbiased sampling of the conformational space consistent with the restraints was achieved by varying the random number seed used to assign the initial velocities. This ensures that the different trajectories diverge during the early stages of the simulations and only converge later as more and more interproton distance restraints are satisfied. The average backbone atomic r.m.s. difference between the converged structures is 2.2 ± 0.3 Å for crambin (nine structures), 2.4 ± 0.3 Å for CPI (eight structures) and 2.5 ± 0.2 Å for BSPI-2 (five structures). The backbone atomic r.m.s. difference between the mean structures derived by averaging the coordinates of the converged structures and the corresponding X-ray structures is 1.2 Å for crambin, 1.6 Å for CPI and 1.7 Å for BSPI-2.
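The atomic r.m.s. differences quoted in the abstract above are root-mean-square distances between matched atoms in two structures. A minimal sketch of that calculation follows; it assumes the two coordinate sets are already optimally superimposed and listed in the same atom order (the superposition step itself is not shown), and the function name is illustrative.

```python
import math


def rms_difference(coords_a, coords_b):
    """Root-mean-square distance between matched atoms of two structures.

    Assumption: both structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Two toy "backbones" offset by 1 Å along z.
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rms_difference(structure_1, structure_2))
```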
