Similar literature
1.
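Predicting a protein's three-dimensional structure from an amino acid sequence with low homology to known structures is called ab initio prediction, and it is one of the challenges in computational biology. Protein backbone prediction is a necessary precursor step to ab initio prediction. This paper applies a parallel ant colony optimization algorithm based on shared pheromones. Guided by existing energy functions, and exploiting the qualitative complementarity between different energy terms, the algorithm constructs the protein backbone structure with the lowest energy and uses clustering to select, from the set of candidate conformations, the conformation with the lowest free energy. The method was applied to the free-modeling targets released in CASP8/9. Of the 13 free-modeling targets in CASP8, the Model 1 predictions for 2 targets exceeded the best CASP8 results and 7 ranked in the top 10. Of the 29 free-modeling targets in CASP9, the best result in the candidate set ranked in the top 10 of the Server group for 20 targets, and Model 1 ranked in the top 10 for 11 targets. These results show that a parallel search guided by a fusion of several different energy functions can better simulate the folding behavior of native proteins. In addition, the algorithm serves as a vehicle for running different kinds of search strategies in parallel, which is also a novel approach to solving similar optimization problems with non-deterministic algorithms.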

2.
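Ab initio prediction is an important approach to protein structure modeling, and research on it helps in understanding protein function and thus in drug design and disease treatment. To improve prediction accuracy, this paper proposes a protein structure prediction algorithm based on residue-pair distance constraints derived from contact maps (CDPSP). Built on an evolutionary algorithm framework, CDPSP divides conformational space sampling into two stages: exploration and enhancement. In the exploration stage, a mutation and selection strategy based on residue-pair distances is designed: residue pairs are selected according to the contact probabilities in the contact map, and the regions neighboring the selected residue pairs are mutated by fragment assembly; the residue-pair distance is discretized into several ranges, each assigned an expected probability, and the expected probability determines whether a mutated conformation is selected, which increases population diversity. In the enhancement stage, a scoring metric based on contact map information is combined with an energy function to measure conformation quality and select better conformations, strengthening CDPSP's ability to sample near-native regions. To verify the performance of the proposed algorithm, it was tested on 10 FM-category target proteins from CASP12 and compared with several state-of-the-art algorithms. The experimental results show that CDPSP can predict protein three-dimensional structure models with high accuracy.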

3.
The prediction of secondary structure is an important topic in bioinformatics, even though the methods have matured and algorithm development is far less active than it was a decade ago. Accurate prediction is very useful to biologists in its own right, but it is also an essential component of tertiary structure prediction, which in contrast is far from solved and remains a highly active area of research. In addition, sequence comparison methods have more recently incorporated local structure tracks, and the extra information used by these methods has led to considerable improvements in fold recognition and alignment accuracy. In this paper, a novel method for protein secondary structure prediction is presented. Using the evolutionary information contained in amino acids' physicochemical properties, a position-specific scoring matrix generated by PSI-BLAST, and HMMER3 profiles as input to a hybrid back propagation system, secondary structure can be predicted with significantly increased accuracy. Based on the knowledge discovery theory based on inner cognitive mechanism (KDTICM), we have constructed a compound pyramid model (CPM), which is composed of four layers of intelligent interface and integrates several methods, such as a hybrid back propagation method (HBP), modified knowledge discovery in databases (KDD*), and a hybrid SVM method (HSVM). Experiments on three standard datasets (RS126, CB513 and CASP8) show that CPM produces higher Q3 and SOV scores than widely used schemes such as PSIPRED, PHD and Predator, as well as previously developed prediction methods. On the RS126 and CB513 datasets, its Q3 and SOV99 scores are considerably higher than the best reported scores. It was also tested on target proteins from the Critical Assessment of protein Structure Prediction experiment (CASP8), where it achieves better overall prediction accuracy than traditional methods, including the popular PSIPRED method. Available: .

4.
An analysis of different approaches to protein structure prediction is presented, based solely on the range of models submitted to the third Critical Assessment of Protein Structure Prediction (CASP3) conference. CASP conferences evaluate the current state of the art of protein structure prediction by comparing the blind prediction efforts of many groups for the same set of target sequences. Target sequences may be highly similar to those with known structure or may be totally (at least superficially) dissimilar in sequence. The techniques applied to those blind predictions (over 40 targets) range from detailed homology prediction to the detection of remote homologues well below the twilight zone of protein sequence similarity. For the CASP3 conference, we submitted 35 predictions of various levels of difficulty and complexity. Of the ten submitted homology targets, eight have been determined by experiment so far. The C-alpha RMSDs are 1.2-1.7, 2.3, and 4.6-17.9 Å for the three easy targets, two hard targets, and three very hard homology targets, respectively. Out of 18 fold recognition predictions available for analysis, we obtained six correct predictions, five near misses, three tough near misses and four far misses. Here we analyze the successes and failures of those predictions in an attempt to identify common problems and common achievements.
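
The C-alpha RMSD quoted above can be illustrated with a minimal sketch. It assumes the two structures are already superimposed and that their C-alpha atoms are paired one to one; the function name `ca_rmsd` is hypothetical, and the optimal superposition step that real assessments perform is deliberately left out.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation between paired 3D coordinates.

    Assumption: both coordinate lists are already superimposed and
    ordered so that index i in one list matches index i in the other.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have equal length")
    if not coords_a:
        raise ValueError("coordinate lists must be non-empty")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))
```

Given coordinates in Å, the result is also in Å, so it is directly comparable to the 1.2-17.9 Å range reported for the homology targets.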

5.
After the atomic coordinates themselves, the most important data in a homology model are the spatial reliability estimates associated with each of the atoms (atom annotation). Recent blind homology modeling predictions have demonstrated that principally correct sequence-structure alignments are achievable at sequence identities as low as 25% [Martin, A.C., MacArthur, M.W., Thornton, J.M., 1997. Assessment of comparative modeling in CASP2. Proteins Suppl(1), 14-28]. However, the locations and extent of spatial deviations in the backbone between correctly aligned homologous protein structures remained very poorly estimated, and these errors were the cause of errant loop predictions [Abagyan, R., Batalov, S., Cardozo, T., Totrov, M., Webber, J., Zhou, Y., 1997. Homology modeling with internal coordinate mechanics: deformation zone mapping and improvements of models via conformational search. Proteins Suppl(1), 29-37]. In order to derive accurate measures for local backbone deviations, we made a systematic study of static local backbone deviations between homologous pairs of protein structures. We found that 'through space' proximity to gaps and chain termini, local three-dimensional 'density', three-dimensional environment conservation, and the B-factor of the template contribute to local deviations in the backbone in addition to local sequence identity. Based on these findings, we have identified the meaningful ranges of values within which each of these parameters correlates with static local backbone deviation and produced a combined scoring function that greatly improves the estimation of local backbone deviations. The optimized function has more than twice the accuracy of local sequence identity or B-factor alone and was validated in a recent blind structure prediction experiment. This method may be used to evaluate the utility of a preliminary homology model for a particular biological investigation (e.g. drug design) or to provide an improved starting point for molecular mechanics loop prediction methods.

6.
Accurate protein secondary structure prediction (PSSP) is essential for identifying a protein's structural class, fold, and tertiary structure. Experimental methods for determining secondary structure offer higher precision, but at the cost of considerable expense and time. In this study, we propose an effective prediction model that combines 42-dimensional hybrid features with a convolutional neural network (CNN) and a bidirectional recurrent neural network (BRNN). The proposed model is assessed on four benchmark datasets (CB6133, CB513, CASP10, and CASP11) using the Q3, Q8, and segment overlap (Sov) metrics. It achieves Q3 accuracies of 85.4%, 85.4%, 83.7%, and 81.5%, and Q8 accuracies of 75.8%, 73.5%, 72.2%, and 70%, on the CB6133, CB513, CASP10, and CASP11 datasets respectively. Compared with popular existing models on the CB513 dataset, the proposed model improves Q3 and Q8 accuracy by at least 2.5% and 2.1% respectively. Further, the quality of the Q3 results is validated by structural class prediction and compared with PSI-PRED. The experiment showed that the quality of the Q3 results of the proposed model is higher than that of PSI-PRED.
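
For readers unfamiliar with the Q3 metric reported here, the following is a minimal sketch of how it is commonly computed: the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The function name `q3_accuracy` and the string encoding of labels are assumptions for illustration, not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues with a correctly predicted 3-state label.

    Assumption: each structure is a string over the alphabet H, E, C,
    with one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted label equals the observed one.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

Q8 can be computed by the same function if both strings use an eight-state alphabet instead; only the label set changes.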

7.
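A protein is a long chain formed by many amino acid residues linked in sequence. In its native state, a protein is not in a random, free state; rather, it spontaneously forms a specific spatial structure in order to perform its specific biological function. The main factors driving a protein to form a specific spatial structure are non-covalent interactions between residues, including hydrophobic interactions, electrostatic interactions, and van der Waals forces. Accurate prediction of long-range interactions between residues therefore helps in predicting protein spatial structure and, in turn, in understanding protein biological function. During protein evolution, interacting residue pairs show a "co-evolution" pattern: when one residue mutates, the residue it interacts with must also mutate correspondingly to preserve the interaction and thereby maintain the overall spatial structure and biological function. Based on this biological observation, researchers have developed a number of statistical models and algorithms to predict interactions between residue pairs. This review: 1) outlines the two basic categories of algorithms for predicting long-range residue interactions, namely unsupervised learning methods and supervised learning methods; 2) uses results from the CASP protein structure prediction competition to compare the performance of these algorithms objectively and analyzes the characteristics and strengths of each; 3) summarizes future development trends from the two perspectives of biological observation and statistical modeling.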
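
The co-evolution signal described above can be illustrated with mutual information between two columns of a multiple sequence alignment, a simple and classic unsupervised covariation score. This is only a sketch under that assumption: the review does not say which scores it covers, and the function name and toy columns are hypothetical.

```python
import math
from collections import Counter

def column_mutual_information(col_i: str, col_j: str) -> float:
    """Mutual information (in nats) between two alignment columns.

    Each column is a string with one residue letter per aligned sequence.
    High values mean the residues at the two positions vary together.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        mi += p_ab * math.log(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi
```

Two columns that always change together, such as `"AAVV"` and `"LLFF"`, score log 2, while columns that vary independently, such as `"AAVV"` and `"LFLF"`, score 0.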

8.
We present an approach for building protein backbones from alpha-carbon (Calpha) coordinates. The approach is analytical and based on information about favored regions in the Ramachandran map. The backbone construction consists of three parts: prediction of (phi, psi) angle pairs from the Calpha trace, generation of atomic coordinates with these (phi, psi) angles, and refinement by subsequent energy minimization. Tests on several known protein structures show that the root mean square deviations in reconstructed backbones are 0.25-0.48 Å for coordinates and 14-34 degrees for phi and psi angles. The results indicate that our method is among the most accurate methods proposed. The approach is also robust against errors in Calpha coordinates and is capable of providing equivalent or more reasonable models compared to other known methods. Furthermore, backbone structures were found to be built accurately by using the (phi, psi) angles from a different structure of the same protein. This suggests that the approach could be effective and useful in homology modeling studies.
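
The phi and psi angles used in this approach are backbone dihedral angles: phi is defined by the atoms C(i-1), N(i), Calpha(i), C(i), and psi by N(i), Calpha(i), C(i), N(i+1). The sketch below computes a dihedral angle from four points with the standard atan2 formula. It illustrates the angle definition only, not the paper's prediction of angles from the Calpha trace, and the function name is hypothetical.

```python
import math

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle, in degrees, defined by four 3D points."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1 = sub(p1, p0)  # first bond vector
    b2 = sub(p2, p1)  # central bond vector (the rotation axis)
    b3 = sub(p3, p2)  # last bond vector
    n1 = cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

A cis arrangement of the four points gives 0 degrees and a trans arrangement gives 180 degrees, which is a quick sanity check for any implementation.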

9.
One of the key elements in protein structure prediction is the ability to distinguish between good and bad candidate structures. This distinction is made by estimating the structure's energy. The energy function used in the best state-of-the-art automatic predictors competing in the most recent CASP (Critical Assessment of Techniques for Protein Structure Prediction) experiment is defined as a weighted sum of a set of energy terms designed by experts. We hypothesised that combining these terms more freely would improve prediction quality. To test this hypothesis, we designed a genetic programming (GP) algorithm to evolve the protein energy function. We compared the predictive power of the best evolved function with that of a linear combination of energy terms whose weights were optimised by the Nelder–Mead algorithm. The GP-based optimisation outperformed the optimised linear function. We have made the data used in our experiments publicly available in order to encourage others to investigate this challenging problem further using GP and other methods, and to attempt to improve on the results presented here.
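
The linear baseline mentioned above, an energy defined as a weighted sum of energy terms, can be sketched as follows. The term values, weights, and function names are hypothetical placeholders; the abstract does not list the actual terms or weights, and the weight optimisation step (Nelder–Mead in the paper) is not shown.

```python
def weighted_energy(terms, weights):
    """Linear energy: the sum of weight * term over all energy terms."""
    if len(terms) != len(weights):
        raise ValueError("need exactly one weight per energy term")
    return sum(w * t for w, t in zip(weights, terms))

def rank_candidates(candidates, weights):
    """Sort candidate names from lowest (best) to highest energy.

    `candidates` maps a hypothetical structure name to its list of
    energy term values.
    """
    return sorted(candidates,
                  key=lambda name: weighted_energy(candidates[name], weights))
```

A genetic programming approach replaces the fixed weighted sum with an evolved expression over the same terms, while the ranking step stays the same.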

10.
Accurate protein secondary structure prediction plays an important role in direct tertiary structure modeling, and can also significantly improve sequence analysis and sequence-structure threading for structure and function determination. Hence improving the accuracy of secondary structure prediction is essential for future developments throughout the field of protein research. In this article, we propose a mixed-modal support vector machine (SVM) method for predicting protein secondary structure. Using the evolutionary information contained in the physicochemical properties of each amino acid and a position-specific scoring matrix generated by a PSI-BLAST multiple sequence alignment as input for a mixed-modal SVM, secondary structure can be predicted with significantly increased accuracy. Using a Knowledge Discovery Theory based on the Inner Cognitive Mechanism (KDTICM) method, we have proposed a compound pyramid model, which is composed of three layers of intelligent interface that integrate a mixed-modal SVM (MMS) module, a modified Knowledge Discovery in Databases (KDD1) process, a mixed-modal back propagation neural network (MMBP) module and so on. Testing against data sets of non-redundant protein sequences returned values for the Q3 accuracy measure that ranged from 84.0% to 85.6%, while values for the SOV99 segment overlap measure ranged from 79.8% to 80.6%. When compared using a blind test dataset from the CASP8 meeting against currently available secondary structure prediction methods, our new approach shows superior accuracy. Availability: http://www.kdd.ustb.edu.cn/protein_Web/.

11.
Performance prediction is an important engineering tool that provides valuable feedback on design choices in program synthesis and machine architecture development. We present an analytic performance modeling approach aimed at minimizing prediction cost while providing a prediction accuracy that is sufficient to enable major code and data mapping decisions. Our approach is based on a performance simulation language called PAMELA. Apart from simulation, PAMELA features a symbolic analysis technique that enables PAMELA models to be compiled into symbolic performance models that trade prediction accuracy for the lowest possible solution cost. We demonstrate our approach through a large number of theoretical and practical modeling case studies, including six parallel programs and two distributed-memory machines. The average prediction error of our approach is less than 10 percent, while the average worst-case error is limited to 50 percent. It is shown that this accuracy is sufficient to correctly select the best coding or partitioning strategy. For programs expressed in a high-level, structured programming model, such as data-parallel programs, symbolic performance modeling can be entirely automated. We report on experiments with a PAMELA model generator built within a data-parallel compiler for distributed-memory machines. Our results show that with negligible program annotation, symbolic performance models are automatically compiled in seconds, while their solution cost is on the order of milliseconds.

12.
A current trend in research on intelligent systems is to specialize a general intelligent prediction system into an individualized intelligent prediction system applied in a specific field. Protein structure prediction is a challenging international problem. In this paper, we propose a new intelligent prediction system model, designed as a multi-layer compound pyramid model, for predicting protein secondary structure. The model comprises four independent intelligent interfaces and several knowledge discovery methods. Domain knowledge is used throughout the model, with the effective attributes chosen by Causal Cellular Automata. Furthermore, a high-purity structure database is constructed for training. On the RS126 dataset, the overall per-residue accuracy, Q3, reached 83.99%, while on the CB513 dataset, Q3 reached 85.58%. On the CASP8 sequences, the results are superior to those produced by other methods, such as Psipred, Jpred, APSSP2 and BehairPred. These results confirm that our method has a strong generalization ability, and that it provides a model for the construction of other intelligent systems.

13.
Information Fusion, 2009, 10(3): 217-232
Protein secondary structure prediction is still a challenging problem today. Although a number of prediction methods have been presented in the literature, the various prediction tools that are available on-line produce results whose quality is not always fully satisfactory. Therefore, a user has to know which predictor to use for a given protein to be analyzed. In this paper, we propose a server implementing a method to improve the accuracy of protein secondary structure prediction. The method is based on integrating the prediction results computed by some available on-line prediction tools to obtain a combined prediction of higher quality. Given an input protein p whose secondary structure has to be predicted, and a group of proteins F whose secondary structures are known, the server currently works according to a two-phase approach: (i) it selects a set of predictors that are good at predicting the secondary structure of proteins in F (and, therefore, supposedly, that of p as well), and (ii) it integrates the prediction results delivered for p by the selected team of prediction tools. Therefore, by exploiting our system, the user is relieved of the burden of selecting the most appropriate predictor for the given input protein while, at the same time, being assured that a prediction result at least as good as the best available one will be delivered. The correctness of the resulting prediction is measured using the EVA accuracy parameters used in several editions of CASP.
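
The two-phase approach described here can be sketched as below. Phase (i) ranks predictors by their per-residue accuracy on the reference proteins F and keeps the best k; phase (ii) combines their predictions for p. The abstract does not state the integration rule, so the per-residue majority vote is an assumption, and all function and predictor names are hypothetical.

```python
from collections import Counter

def select_predictors(reference_predictions, reference_truth, k):
    """Phase (i): keep the k predictors most accurate on the proteins in F.

    `reference_predictions` maps a predictor name to its list of predicted
    structure strings, in the same order as `reference_truth`.
    """
    def mean_accuracy(predictions):
        correct = total = 0
        for predicted, truth in zip(predictions, reference_truth):
            correct += sum(p == t for p, t in zip(predicted, truth))
            total += len(truth)
        return correct / total

    ranked = sorted(reference_predictions,
                    key=lambda name: mean_accuracy(reference_predictions[name]),
                    reverse=True)
    return ranked[:k]

def majority_vote(predictions):
    """Phase (ii), assumed rule: per-residue majority vote.

    Ties are broken in favor of the earliest-listed predictor.
    """
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        consensus.append(next(s for s in column if counts[s] == best))
    return "".join(consensus)
```

Listing the selected predictors in order of decreasing accuracy before voting means that ties fall to the predictor that performed best on F.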

14.
Operational optimization of ocean vessels, both off-line and in real time, is becoming increasingly important due to rising fuel costs and added environmental constraints. Accurate and efficient simulation models are needed to achieve maximum energy efficiency. In this paper a grey-box modeling approach for the simulation of ocean vessels is presented. The modeling approach combines conventional analysis models based on physical principles (a white-box model) with a feedforward neural network (a black-box model). Two different ways of combining these models are presented: in series and in parallel. The results of simulating several trips of a medium-sized container vessel show that the grey-box modeling approach, in both its serial and parallel forms, can significantly improve the prediction of the vessel's fuel consumption compared to a white-box model. However, the prediction of the vessel's speed is only improved slightly. Furthermore, the results give an indication of the potential advantages of grey-box models, namely extrapolation beyond a given training data set and the incorporation of physical phenomena that are not modeled in the white-box models. Finally, we discuss how to enhance the predictive ability of the grey-box models and how to update the neural network in real time.
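
The serial and parallel combinations can be sketched with plain functions. The abstract does not give the exact wiring, so this sketch assumes a common reading: in series, the black-box model receives the white-box output as an extra input; in parallel, the black-box model predicts a correction that is added to the white-box output. The toy models in the usage note are placeholders, not the paper's vessel models.

```python
def serial_grey_box(white_box, black_box):
    """Series wiring (assumed): black box refines the white-box output."""
    def model(x):
        return black_box(x, white_box(x))
    return model

def parallel_grey_box(white_box, black_box):
    """Parallel wiring (assumed): black box adds a correction term."""
    def model(x):
        return white_box(x) + black_box(x)
    return model
```

For example, with a toy white box `lambda x: 2 * x`, a serial black box `lambda x, y: y + 1` gives 7 at x = 3, and a parallel black box `lambda x: 0.5` gives 6.5.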

15.
To fully utilize all available information in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem. In this problem, we consider backbone threading and side-chain packing simultaneously during protein structure prediction. For a given query protein sequence and a template structure, our goal is to find a threading alignment between the query sequence and the template structure, along with a rotamer assignment for each side-chain of the query protein, which optimizes an energy function that combines a backbone threading energy and a side-chain packing energy. This highly computationally challenging problem is solved by first formulating it as a graph-based optimization problem. Various graph-theoretic techniques, which take advantage of a number of special properties of the graph representing this generalized threading problem, are employed to achieve the computational efficiency needed to make our algorithm practically useful. The overall framework of our algorithm is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of our problem. By using various additional heuristic techniques such as dead-end elimination, we have demonstrated that our algorithm, the first of its kind, can solve a generalized threading problem within a practically acceptable amount of time and space.

16.
To utilize fully all available information in protein structure prediction, including both backbone and side-chain structures, we present a novel algorithm for solving a generalized threading problem. In this problem we consider simultaneous backbone threading and side-chain packing during protein structure prediction. For a given query protein sequence and a template structure, our goal is to find a threading alignment between the query sequence and the template structure, along with a rotamer assignment for each side-chain of the query protein, which optimizes an energy function that combines a backbone threading energy and a side-chain packing energy. This highly computationally challenging problem is solved by first formulating it as a graph-based optimization problem. Various graph-theoretic techniques, which take advantage of a number of special properties of the graph representing this generalized threading problem, are employed to achieve the computational efficiency needed to make our algorithm practically useful. The overall framework of our algorithm is a dynamic programming algorithm implemented on an optimal tree decomposition of the graph representation of our problem. By using various additional heuristic techniques such as dead-end elimination, we have demonstrated that our algorithm, the first of its kind, can solve a generalized threading problem within a practically acceptable amount of time and space.

17.
Today's smart buildings are characterized not only by extensive power consumption but also by information-intensive operation and a massive number of operational parameters. With the development of computational capability and future 5G, the ACP theory (i.e., artificial systems, computational experiments, and parallel computing) will play a much more crucial role in the modeling and control of complex systems such as commercial and academic buildings. Making accurate predictions of energy consumption from a large number of operational parameters has become a crucial problem in smart buildings. Previous attempts have been made to predict energy consumption from historical building data. However, questions remain about parallel building consumption prediction mechanisms that use a large number of operational parameters. This article proposes a novel hybrid deep learning prediction approach that uses a long short-term memory network as an encoder and a gated recurrent unit as a decoder, in conjunction with ACP theory. The proposed approach is tested and validated on a real-world dataset, and its results outperform those of the traditional predictive models compared in this paper.

18.
Determining potential drug toxicity and side effects in the early stages of drug development is important in reducing the cost and time of drug discovery. In this work, we explore a computer method for predicting potential toxicity and side effect protein targets of a small molecule. A ligand-protein inverse docking approach is used for a computer-automated search of a protein cavity database to identify protein targets. This database is developed from protein 3D structures in the Protein Data Bank (PDB). Docking is conducted by a procedure involving multiple-conformer shape-matching alignment of a molecule to a cavity, followed by molecular-mechanics torsion optimization and energy minimization on both the molecule and the protein residues at the binding region. Potential protein targets are selected by evaluation of molecular mechanics energy and, where applicable, further analysis of the molecule's binding competitiveness against other ligands that bind to the same receptor site in at least one PDB entry. Our results on several drugs show that 83% of the experimentally known toxicity and side effect targets for these drugs are predicted. The computer search successfully predicted 38 and missed five experimentally confirmed or implicated protein targets with available structure and in which binding involves no covalent bond. There are an additional 30 predicted targets yet to be validated experimentally. Application of this computer approach can potentially facilitate the prediction of the toxicity and side effects of a drug or drug lead.

19.
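Oriented toward gas-path fault diagnosis of aircraft gas turbine engines, this paper proposes a Feature Attention Enhanced Long Short-Term Memory Network (FAE-LSTM) for predicting engine gas-path parameters. FAE-LSTM is a dynamic network with an encoder-decoder structure: a feature attention unit in the encoder first extracts dynamic features from the operating-condition sequence; a feature concatenation layer then fuses the encoder output sequence, the operating-condition sequence, and historical performance parameters; finally, the decoder produces the parameter prediction. FAE-LSTM builds a dynamic model of the engine in its healthy state from engine flight data, so that it can serve as the parameter prediction model in a residual-based fault diagnosis system. Simulation analyses of the network's predictive performance and mode of application show that, compared with other commonly used multivariate time-series prediction models, FAE-LSTM reduces long-term prediction error by at least 24.5%, and that a fault detection system using FAE-LSTM in a parallel configuration obtains more accurate detection results than one using a series-parallel configuration.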

20.
When a tsunami occurs at sea, predicting its arrival time is critical for evacuating people from coastal areas. Many tsunami-related problems remain to be solved in order to reduce the negative effects of this serious disaster. Numerical modeling of tsunami wave propagation is a computationally intensive problem whose calculations need to be accelerated by parallel processing. The method of splitting tsunami (MOST) is one of the well-known numerical solvers for tsunami modeling. We have developed a tsunami propagation code based on the MOST algorithm and implemented different parallel optimizations for GPU and FPGA. In our latest study, the best performance was obtained with an OpenCL kernel implementing the tsunami simulation on an AMD Radeon 280X GPU. This paper focuses on design and evaluation on an FPGA using OpenCL. After several kernel modifications, the performance of the FPGA design generated automatically by the Altera offline compiler approaches that of the GPU.
