Similar articles
 Found 20 similar articles; search took 31 ms
1.
This study addresses the protein structure prediction problem using the two-dimensional hydrophobic–polar model on a triangular lattice. In particular, the non-compact conformation was modelled by folding the amino acid sequence onto a relatively larger triangular lattice, which is more biologically realistic and significant than the compact conformation. The protein structure prediction problem was then abstracted as matching amino acids to lattice points. Mathematically, the problem was formulated as an integer program, transforming the biological problem into an optimisation problem. To solve it, the classical particle swarm optimisation algorithm was extended with a single point adjustment strategy. Compared with the square lattice, conformations on the triangular lattice are more flexible in several benchmark examples. The authors further compared their algorithm with a hybrid of hill climbing and genetic algorithm. The results showed that their method was more effective in finding solutions with lower energy and less running time.
Inspec keywords: proteins, molecular biophysics, molecular configurations, particle swarm optimisation, bioinformatics
Other keywords: extended particle swarm optimisation method, triangular lattice, protein structure prediction problem, two-dimensional hydrophobic–polar model, noncompact conformation, amino acid sequence, single point adjustment strategy, protein folding
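
The objective that such a lattice search minimises can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes the conventional HP energy of -1 per contact between two hydrophobic (H) residues that are lattice neighbours but not adjacent in the chain, and it assumes axial (q, r) coordinates for the triangular lattice. The names `TRI_NEIGHBORS`, `is_valid_walk` and `hp_energy` are hypothetical.

```python
# Assumption: axial (q, r) coordinates, in which each point of the
# triangular lattice has exactly these six neighbour offsets.
TRI_NEIGHBORS = {(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)}


def is_valid_walk(coords):
    """Return True if coords form a self-avoiding walk on the lattice."""
    if len(set(coords)) != len(coords):
        return False  # two residues occupy the same lattice point
    return all(
        (b[0] - a[0], b[1] - a[1]) in TRI_NEIGHBORS
        for a, b in zip(coords, coords[1:])
    )


def hp_energy(sequence, coords):
    """Energy of a conformation under the assumed -1 per H-H contact rule.

    sequence: string of 'H' and 'P'; coords: list of (q, r) tuples,
    one per residue, in chain order.
    """
    if len(sequence) != len(coords) or not is_valid_walk(coords):
        raise ValueError("coords must be a self-avoiding walk matching the sequence")
    energy = 0
    for i in range(len(sequence)):
        # Start at i + 2 so that chain-adjacent residues are not counted.
        for j in range(i + 2, len(sequence)):
            if sequence[i] == "H" and sequence[j] == "H":
                delta = (coords[j][0] - coords[i][0], coords[j][1] - coords[i][1])
                if delta in TRI_NEIGHBORS:
                    energy -= 1
    return energy
```

On a triangular lattice three consecutive residues can already form a contact (residues 1 and 3 of a folded triangle), which is impossible on a square lattice; this is one way to see the extra flexibility the abstract mentions.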

2.
We propose a nucleation hypothesis for protein folding. Based on this hypothesis, we have designed a new nearest-neighbor method for the prediction of protein secondary structures, in which the reliability of each prediction is estimated based on sequence conservation and clustering in the databases of known structures. We have found that predictions with higher reliability scores were indeed correlated with a higher predictive accuracy. We also found that by selecting the top 20% of residues based on reliability scores as nucleation residues, a clear pattern emerged where hydrophobic amino acids were largely buried and hydrophilic amino acids were more exposed. This was consistent with the widely accepted HP-model for protein folding. These results were true for several databases, such as PDBSELECT (<25% sequence homology), SCOP-ASTRAL (<25% sequence homology), and SCOP-ASTRAL unique fold classes, with 1300, 3956, and 762 proteins, respectively. Therefore, it is conceivable that the nucleation residues function not only as initiation sites for folding, but also as the core residues playing a primary role in determining the protein (thermodynamic) stability. The occurrence of these two functions on one set of amino acid residues in a protein is perhaps the result of biological evolution. Finally, we have found power law behaviors in our results, whose scaling properties were modeled using polymer physics and critical phenomena. It is concluded that proteins behave like a real chain. A new physical picture for protein folding derived from our nucleation hypothesis has been described, in which there are two continuous phase transitions corresponding to the two stages of protein folding: one is nucleation and the other collapse. By determining the critical exponents of these two phase transitions, it has been found that the nucleation process has a spatial dimension of d=3 while the collapse process has d=2.

3.
Recombinant protein–inorganic nanocomposites comprised of exfoliated Na+ montmorillonite (MMT) in a recombinant protein matrix based on silk-like and elastin-like amino acid motifs (silk elastin-like protein (SELP)) were formed via a solution blending process. Charged residues along the protein backbone are shown to dominate long-range interactions, whereas the SELP repeat sequence leads to local protein/MMT compatibility. Up to a 50% increase in room temperature modulus and a comparable decrease in high temperature coefficient of thermal expansion occur for cast films containing 2–10 wt.% MMT.

4.
The classification of protein structures is essential for their function determination in bioinformatics. At present, a reasonably high rate of prediction accuracy has been achieved in classifying proteins into four classes in the SCOP database according to their primary amino acid sequences. However, further classification into fine-grained folding categories remains quite a challenge, especially when the number of possible folding patterns, such as those defined in the SCOP database, is large. In our previous work, we proposed a two-level classification strategy called hierarchical learning architecture (HLA) using neural networks and two indirect coding features to differentiate proteins according to their classes and folding patterns, which achieved an accuracy rate of 65.5%. In this paper, we use a combinatorial fusion technique to facilitate feature selection and combination for improving predictive accuracy in protein structure classification. When applying various criteria in combinatorial fusion to the protein fold prediction approach using neural networks with HLA and the radial basis function network (RBFN), the resulting classification has an overall prediction accuracy rate of 87% for four classes and 69.6% for 27 folding categories. These rates are significantly higher than the accuracy rate of 56.5% previously obtained by Ding and Dubchak. Our results demonstrate that data fusion is a viable method for feature selection and combination in the prediction and classification of protein structure.

5.
The procedures used to model a protein structure are well established when the novel protein has high sequence similarity to a protein of known structure. Many proteins of interest have low (i.e. <50%) sequence similarity to any known structure. In these cases new approaches to prediction of structure are required. The use of sequence profiles which relate sequence to known structure has been proposed as one method to assign local regions of structure. As a first stage, templates or “icons” of the many relevant substructural motifs found in proteins must be defined. The sequences which gave rise to these structures are then aligned and a weighted profile obtained. Average structures of the 8 and 12 residue helix-turn and turn-helix motifs have been prepared. These coordinate templates were then used to scan through the Brookhaven protein structural database for similar, superimposable fragments. A composite template of 100 similar fragments for each element was found to be internally consistent to an rmsd of 0.92 Å for HT8, 1.54 Å for HT12, 0.41 Å for TH8 and 1.40 Å for TH12. All of the sequences from these structures were then used to create an overall sequence profile. The four sequence profiles were scanned against the amino acid sequences of the proteins in the Brookhaven database: tertiary structure was correctly identified only about 10% of the time. This value is too low for predictive purposes. However, it could be increased by checking for multiple occurrences of the template in one protein.

6.
Covalent disulfide bond linkage in a protein represents an important challenge for mass spectrometry (MS)-based top-down protein structure analysis as it reduces the backbone cleavage efficiency for MS/MS dissociation. This study presents a strategy for solving this critical issue via integrating electrochemistry (EC) online with a top-down MS approach. In this approach, proteins undergo electrolytic reduction in an electrochemical cell to break disulfide bonds and then undergo online ionization into gaseous ions for analysis by electron-capture dissociation (ECD) and collision-induced dissociation (CID). The electrochemical reduction of proteins allows one to remove disulfide bond constraints and also leads to increased charge numbers of the resulting protein ions. As a result, sequence coverage was significantly enhanced, as exemplified by β-lactoglobulin A (24 vs 75 backbone cleavages before and after electrolytic reduction, respectively) and lysozyme (5 vs 66 backbone cleavages before and after electrolytic reduction, respectively). This methodology is fast and does not need chemical reductants, which would have an important impact in high-throughput proteomics research.

7.
Electron capture dissociation (ECD) has previously been shown by other research groups to result in greater peptide sequence coverage than other ion dissociation techniques and to localize labile posttranslational modifications. Here, ECD has been achieved for 10-13-mer peptides microelectrosprayed from 10 nM (10 fmol/μL) solutions and for tryptic peptides from a 50 nM unfractionated digest of a 28-kDa protein. Tandem Fourier transform ion cyclotron resonance (FTICR) mass spectra contain fragment ions corresponding to cleavages at all possible peptide backbone amine bonds, except on the N-terminal side of proline, for substance P and neurotensin. For luteinizing hormone-releasing hormone, all but two expected backbone amine bond cleavages are observed. The tandem FTICR mass spectra of the tryptic peptides contain fragment ions corresponding to cleavages at 6 of 12 (1545.7-Da peptide) and 8 of 21 (2944.5-Da peptide) expected backbone amine bonds. The present sensitivity is 200-2000 times higher than previously reported. These results show promise for ECD as a tool to produce sequence tags for identification of peptides in complex mixtures available only in limited amounts, as in proteomics.

8.
A method for mass spectrometric peptide mapping was developed, based on hydrolysis of a solid protein by acid vapor followed by mass spectrometric analysis of the cleavage products. The method is applicable to lyophilized samples as well as proteins present in gels after separation by SDS-PAGE. The cleavage specificity was established using a number of standard proteins. Three different types of cleavages were observed: specific internal backbone cleavages at Asp, Ser, Thr, and Gly and N- and C-terminal sequence ladders. On the basis of the observed cleavage characteristics, a strategy for protein identification based on the peptide mass maps was developed. The identification strategy utilizes the specific internal backbone cleavages as well as the partial sequence information obtained from the sequence ladders.

9.
The protein structure prediction (PSP) problem is concerned with the prediction of the folded, native, tertiary structure of a protein given its sequence of amino acids. It is a challenging and computationally open problem, as shown by the numerous methodological attempts and the research effort applied to it in the last few years. The potential energy functions used in the literature to evaluate the conformation of a protein are based on the calculations of two different interaction energies: local (bond atoms) and non-local (non-bond atoms). In this paper, we show experimentally that those types of interactions are in conflict, and do so by using the potential energy function Chemistry at HARvard Macromolecular Mechanics (CHARMM). A multi-objective formulation of the PSP problem is introduced and its applicability studied. We use a multi-objective evolutionary algorithm as a search procedure for exploring the conformational space of the PSP problem.

10.
The structure classification of proteins plays a very important role in bioinformatics, since the relationships and characteristics among those known proteins can be exploited to predict the structure of new proteins. The success of a classification system depends heavily on two things: the tools being used and the features considered. For bioinformatics applications, the role of appropriate features has not been given adequate attention. In this investigation we use three novel ideas for multiclass protein fold classification. First, we use the gating neural network, where each input node is associated with a gate. This network can select important features in an online manner as learning proceeds. At the beginning of the training, all gates are almost closed, i.e., no feature is allowed to enter the network. Through the training, gates corresponding to good features are completely opened while gates corresponding to bad features are closed more tightly, and some gates may be partially open. The second novel idea is to use a hierarchical learning architecture (HLA). The classifier in the first level of HLA classifies the protein features into four major classes: all alpha, all beta, alpha + beta, and alpha/beta. In the next level we have another set of classifiers, which further classify the protein features into 27 folds. The third novel idea is to induce the indirect coding features from the amino-acid composition sequence of proteins based on the N-gram concept. This provides us with more representative and discriminative new local features of protein sequences for multiclass protein fold classification. The proposed HLA with new indirect coding features increases the protein fold classification accuracy by about 12%. Moreover, the gating neural network is found to reduce the number of features drastically. Using only half of the original features, as selected by the gating neural network, achieves test accuracy comparable to that obtained with all the original features. The gating mechanism also helps us to get a better insight into the folding process of proteins. For example, by tracking the evolution of different gates we can find which characteristics (features) of the data are more important for the folding process. And, of course, it also reduces the computation time.
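
The N-gram idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest reading, relative frequencies of overlapping length-n substrings of the amino acid sequence; the abstract does not give the exact encoding, so this is not the authors' feature definition. The function name `ngram_frequencies` is hypothetical.

```python
from collections import Counter


def ngram_frequencies(sequence, n=2):
    """Relative frequency of each overlapping n-gram in a residue string.

    Assumption: features are plain n-gram frequencies; the paper's
    actual indirect coding features may differ.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    total = len(sequence) - n + 1  # number of overlapping windows
    if total <= 0:
        return {}  # sequence shorter than n: no n-grams
    counts = Counter(sequence[i:i + n] for i in range(total))
    return {gram: count / total for gram, count in counts.items()}
```

For example, `ngram_frequencies("AAB", 2)` yields one window "AA" and one window "AB", each with frequency 0.5. With 20 amino acids there are up to 400 possible 2-grams, which is one reason a feature-selection step such as the gating network is useful.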

11.
Thermodynamically, polymer crystallization is a first-order transition that involves overcoming an energy barrier. Building a molecular kinetic model that links this macroscopic concept with experimental observations has been and still remains a difficult issue. It requires a physical picture that can show how a three-dimensionally random linear macromolecule is converted to a chain-folded crystalline state despite the loss of entropy in the process. There are a number of dynamic molecular pathways during polymer crystallization, and previous analytical models have used a 'mean-field' approach. In polymer crystallization, every macromolecule has to go through several selection processes on different length- and time-scales. In this article, we try to identify these selection processes and lay down some basic principles of polymer crystallization. Experimental observations on stem configurations, helical conformations, crystal structures, fold lengths, global macromolecular conformations and lamellar single-crystal morphologies have been used as probes to identify these selection processes.

12.
In previous studies, electron capture dissociation (ECD) has been successful only with ionized smaller proteins, cleaving between 33 of the 153 amino acid pairs of a 17 kDa protein. This has been increased to 99 cleavages by colliding the ions with a background gas while subjecting them to electron capture. Presumably this ion activation breaks intramolecular noncovalent bonds of the ion's secondary and tertiary structure that otherwise prevent separation of the products from the nonergodic ECD cleavage of a backbone covalent bond. In comparison to collisionally activated dissociation, this "activated ion" (AI) ECD provides more extensive, and complementary, sequence information. AI ECD effected cleavage of 116, 60, and 47, respectively, backbone bonds in 29, 30, and 42 kDa proteins to provide extensive contiguous sequence information on both termini; AI conditions are being sought to denature the center portion of these large ions. This accurate "sequence tag" information could potentially identify individual proteins in mixtures at far lower sample levels than methods requiring prior proteolysis.

13.
The accurate and stable prediction of protein domain boundaries is an important avenue for the prediction of protein structure, function, evolution, and design. Recent research on protein domain boundary prediction has been mainly based on widely known machine learning techniques. In this paper, we propose a new machine-learning-based domain predictor, DomNet, that shows more accurate and stable predictive performance than the existing state-of-the-art models. DomNet is trained using a novel compact domain profile, secondary structure, solvent accessibility information, and interdomain linker index to detect possible domain boundaries for a target sequence. The performance of the proposed model was compared to nine different machine learning models on the Benchmark_2 dataset in terms of accuracy, sensitivity, specificity, and correlation coefficient. DomNet achieved the best performance, with 71% accuracy for domain boundary identification in multidomain proteins. With the CASP7 benchmark dataset, it again demonstrated superior performance to contemporary domain boundary predictors such as DOMpro, DomPred, DomSSEA, DomCut, and DomainDiscovery.

14.
A novel vibrational spectroscopic approach has been developed to better understand the structure activity relationship (SAR) component of the drug discovery effort. First, vibrational spectroscopy has been developed as a tool for the identification of molecular subcomponents within a compound series, which play an active role in binding kinetics. Second, vibrational spectroscopy has exhibited utility in uncovering electronic trends within both pendant functional groups and within the molecular backbone scaffold, which foster the binding process. In this study, three series of compounds (isoflavone, coumarin, and benzoxazole) within a human estrogen receptor (ER-beta) study were used to explore the feasibility of the technique. In each series, infrared and Raman band shifts, which correlated with ER-beta binding activity, were identified. Data indicated binding activity to be strongly influenced by electron density in the pi-bonding system of the backbone scaffold for each series. The ability to relate the physical/electronic state of a molecular subcomponent to activity has far-reaching implications in the identification and optimization of binding activity. The preliminary success of this project opens the door to a much broader investigation of the technique using a host of spectroscopic methodologies.

15.
Mass spectrometry has recently become one of the major analytical tools to study biomolecular structure and function. Ionization techniques, such as electrospray ionization (ESI), desorb biomolecules from solution to the gas phase keeping practically intact their natural structure. ESI applied to a protein solution produces a mixture of multiply charged ions, the ion charge distribution of which depends on the oligomeric form (mass) and on the protein surface exposed (amount of accommodated charges) of the related protein conformation. ESI-MS provides an efficient way to monitor protein processes; however, the ionic contributions of the different protein conformations involved usually overlap, and the use of chemometric tools is necessary to unravel the information related to the pure conformations that the biomolecule adopts along the process. Multivariate curve resolution-alternating least squares applied to MS-monitored protein processes provides the concentration profiles associated with the different protein conformations occurring during the process and the related pure mass spectra. The concentration profiles, in this context, the ionic contributions, describe the process mechanism and the structural information derived from the pure mass spectra characterizes the involved conformations. Mass spectra can be expressed schematically through percentages of base peak intensity. This chemical transformation compresses significantly the raw spectra and allows for an easier application of natural MS-related constraints, such as the presence of only one maximum, i.e., the base peak of a particular conformation, into the resolution of the pure signals. The combination of mass spectrometry and multivariate curve resolution methods is used to elucidate the mechanism of the pH-induced conformation changes of the bovine beta-lactoglobulin. As a final step, MS data are fused with circular dichroism data and are simultaneously analyzed to ensure and confirm that all the previously detected MS conformations really exist in solution and are an artifact of neither the ionization process nor their chemometric resolution.
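
Expressing a spectrum as percentages of base peak intensity, as mentioned in this abstract, amounts to scaling every intensity by the largest one. The sketch below shows only that scaling step, under the assumption that a spectrum is given as a plain list of non-negative intensities; the function name `to_base_peak_percent` is hypothetical, and the curve-resolution step itself is not reproduced.

```python
def to_base_peak_percent(intensities):
    """Scale intensities so that the largest (base) peak equals 100.

    Assumption: intensities is a non-empty list of non-negative numbers
    with at least one positive value.
    """
    if not intensities:
        raise ValueError("spectrum is empty")
    base = max(intensities)  # the base peak is the most intense signal
    if base <= 0:
        raise ValueError("spectrum has no positive intensity")
    return [100.0 * value / base for value in intensities]
```

After this scaling each spectrum has exactly one value of 100 when the base peak is unique, which is the property the single-maximum constraint described above relies on.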

16.
Thermally induced protein unfolding/folding processes have been studied on alpha-lactalbumin and alpha-apolactalbumin. Experiments monitored by fluorescence and circular dichroism spectroscopic techniques on alpha-apolactalbumin showed the formation of an intermediate species, whereas in the case of alpha-lactalbumin, this intermediate species was not detected. The presence and resolution of this intermediate species, its spectrum, and the evolution of all conformations during protein unfolding/folding processes were estimated using the multivariate curve resolution-alternating least-squares method. Elucidation of the nature and contribution of the different secondary structure motifs in each of the resolved protein conformations, including the intermediate, was also carried out. Multivariate resolution has been shown to be an excellent tool for the complete characterization of all protein conformations involved in folding processes, including intermediate species that cannot be isolated by physical or chemical means. Indeed, it is in the determination and modeling of these intermediates that this chemometric approach outperforms, in power and reliability, previous methodologies based on simpler measurements and data treatments, and fills the void linked to the elucidation and interpretation of complex mechanisms in protein folding processes.

17.
Recently, an approach for the "top down" sequence analysis of whole protein ions has been developed, employing electrospray ionization, collision-induced dissociation, and ion/ion proton-transfer reactions in a quadrupole ion trap mass spectrometer. This approach has now been extended to an analysis of the [M + 12H]¹²⁺ to [M + 5H]⁵⁺ ions of ribonuclease A and its N-linked glycosylated analogue, ribonuclease B, to determine the influence of the posttranslational modification on protein fragmentation. In agreement with previous studies on the fragmentation of a range of protein ions, facile gas-phase fragmentation was observed to occur along the protein backbone at the C-terminal of aspartic acid residues, and at the N-terminal of proline, depending on the precursor ion charge state. Interestingly, no evidence was found for gas-phase deglycosylation of the N-linked sugar in ribonuclease B, presumably due to effective competition from the facile amide bond cleavage channels that "protect" the N-linked glycosidic bond from cleavage. Thus, localization of the posttranslational modification site may be determined by analysis of the "protein fragment ion mass fingerprint".

18.
For the prediction of life leading to fatigue crack initiation, a method for performing a cycle-by-cycle local stress analysis at the stress concentration area of a structural component was developed. Elastoplastic stress-strain values along the hysteresis loop are traced for each load reversal in making the life prediction calculations. In this manner, the load sequence effect and the residual stress due to local yielding are inherently included. Neuber's rule and a linear rule were used with this method and compared. The results of life prediction were compared with test results. The use of the linear rule provided more accurate predictions than using other alternatives, including Miner's rule.
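
A single local stress-strain evaluation with Neuber's rule can be sketched as below. Neuber's rule equates the product of local stress and local strain to (Kt·S)²/E for a nominal elastic stress S and stress concentration factor Kt. Everything else here is an assumption made for illustration and is not taken from the abstract: the material curve is a Ramberg-Osgood relation, strain = stress/E + (stress/K)^(1/n), only one monotonic loading is treated (not the cycle-by-cycle hysteresis tracing of the paper), and the name `neuber_local_stress` and all parameter values are hypothetical.

```python
def neuber_local_stress(nominal_stress, kt, E, K, n, iterations=200):
    """Solve Neuber's rule for local stress and strain at a notch.

    Assumptions: nominal_stress > 0, Ramberg-Osgood material curve with
    modulus E, strength coefficient K and hardening exponent n.
    Returns (local_stress, local_strain).
    """
    if nominal_stress <= 0:
        raise ValueError("nominal_stress must be positive")
    target = (kt * nominal_stress) ** 2 / E  # Neuber constant

    def strain(stress):
        return stress / E + (stress / K) ** (1.0 / n)

    # stress * strain(stress) increases with stress, is 0 at stress = 0
    # and is at least the target at the elastic solution kt * S,
    # so the root lies in [0, kt * S] and bisection converges.
    low, high = 0.0, kt * nominal_stress
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if mid * strain(mid) < target:
            low = mid
        else:
            high = mid
    local = 0.5 * (low + high)
    return local, strain(local)
```

The local stress found this way is always below the purely elastic estimate Kt·S, because part of the Neuber product is taken up by plastic strain.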

19.
Zhang Z, Zhang A, Xiao G. Analytical Chemistry, 2012, 84(11): 4942-4949
Protein hydrogen/deuterium exchange (HDX) followed by protease digestion and mass spectrometric (MS) analysis is accepted as a standard method for studying protein conformation and conformational dynamics. In this article, an improved HDX MS platform with fully automated data processing is described. The platform significantly reduces systematic and random errors in the measurement by introducing two types of corrections in HDX data analysis. First, a mixture of short peptides with fast HDX rates is introduced as internal standards to adjust the variations in the extent of back exchange from run to run. Second, a designed unique peptide (PPPI) with slow intrinsic HDX rate is employed as another internal standard to reflect the possible differences in protein intrinsic HDX rates when protein conformations at different solution conditions are compared. HDX data processing is achieved with a comprehensive HDX model to simulate the deuterium labeling and back exchange process. The HDX model is implemented into the in-house developed software MassAnalyzer and enables fully unattended analysis of the entire protein HDX MS data set starting from ion detection and peptide identification to final processed HDX output, typically within 1 day. The final output of the automated data processing is a set (or the average) of the most possible protection factors for each backbone amide hydrogen. The utility of the HDX MS platform is demonstrated by exploring the conformational transition of a monoclonal antibody by increasing concentrations of guanidine.

20.
Recommender systems are rapidly transforming the digital world into intelligent information hubs. The valuable context information associated with users’ prior transactions plays a vital role in determining user preferences for items and in rating prediction. It has been a hot research topic in collaborative filtering-based recommender systems for the last two decades. This paper presents a novel Context Based Rating Prediction (CBRP) model with a unique similarity scoring estimation method. The proposed algorithm computes a context score for each candidate user to construct a similarity pool for the given subject user-item pair, and chooses the most influential users to forecast the item ratings. The context scoring strategy has an inherent capability to incorporate multiple conditional factors to filter down to the most relevant recommendations. Compared with traditional similarity estimation methods, CBRP makes full use of neighboring collaborators’ choices under various conditions. We conduct experiments on three publicly available datasets to evaluate the proposed method with random user-item pairs and obtain a considerable improvement in prediction accuracy on the standard evaluation measures. We also evaluate prediction accuracy for every user-item pair in the system, and the results show that the proposed framework outperforms existing methods.
