Similar literature
1.
Rolf Backofen 《Constraints》2001,6(2-3):223-255
The protein structure prediction problem is one of the most important problems (if not the most important) in computational biology. This problem consists of finding the conformation of a protein with minimal energy. Because of the complexity of this problem, simplified models like Dill's HP-lattice model [15], [16] have become a major tool for investigating general properties of protein folding. Even for this simplified model, the structure prediction problem has been shown to be NP-complete [5], [7]. We describe a constraint formulation of the HP-model structure prediction problem, and present the basic constraints and search strategy. Of course, the simple formulation would not lead to an efficient algorithm. We therefore describe redundant constraints to prune the search tree. Furthermore, we need a bounding function for the energy of an HP-protein. We introduce a new lower bound based on partial knowledge about the final conformation (namely the distribution of H-monomers to layers).
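To make the HP-lattice energy concrete, here is a minimal sketch of the standard HP scoring rule: each pair of H-monomers that are lattice neighbours but not chain neighbours contributes -1. It is an illustration only, not the constraint formulation from the abstract. The function name `hp_energy` and the choice of a 2D square lattice (picked for brevity) are assumptions.

```python
def hp_energy(sequence, coords):
    """Energy of an HP conformation on a 2D square lattice (assumed lattice).

    sequence: string over {'H', 'P'}; coords: list of (x, y) integer tuples,
    one per monomer, in chain order.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive monomers must occupy neighbouring lattice sites.
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("chain is not connected on the lattice")

    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Looking only right and up visits every neighbouring pair exactly once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A four-monomer chain folded into a unit square: the two ends touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square))
```

A structure prediction search would minimise this value over all self-avoiding walks for a given sequence; the sketch only scores one given walk.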

2.
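With the arrival of the post-genomic era, the volume of protein data has grown sharply. To predict protein structure accurately, a deep learning method is proposed for the protein secondary structure classification problem. Features are extracted from protein sequences with a pseudo amino acid composition method built from approximate entropy, hydrophobicity patterns, and image features. The prediction model consists of a 5-layer deep Boltzmann machine (DBM) followed by a classification layer; the 5 DBM layers form 4 RBMs, and the classification layer is a softmax classifier. Both unsupervised and supervised learning are used as the training strategy. Compared with existing methods, the proposed method achieves higher accuracy than both the support vector machine (SVM) and the artificial neural network (ANN), which are among the better current approaches. Experimental results show that the proposed method is feasible and effective.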

3.
The explosive accumulation of protein sequences in the wake of large-scale sequencing projects is in sharp contrast to the much slower experimental determination of protein structures. Neural networks have been successfully applied to the prediction of protein structures, and the prediction accuracy continues to rise. This paper introduces the basic methods and technologies for predicting protein secondary structures with neural networks, and in particular expounds two aspects that have driven the rise in prediction accuracy: the improvement of neural network architecture and the addition of "evolutionary" information.

4.
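Predicting protein secondary structure is a hard, unsolved problem in bioinformatics today. Because the accuracy of secondary structure prediction plays a very important role in protein structure research, a protein secondary structure prediction algorithm based on a hybrid SVM method is proposed, building on KDTICM theory. The algorithm uses the physicochemical properties of proteins and the position-specific scoring matrix generated by PSI-SEARCH as the input to a two-layer SVM, which substantially improves prediction accuracy. Comparative experiments show that the prediction accuracy and generality of the new algorithm are clearly better than those of other typical current prediction methods.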

5.
Lü Zhipeng (吕志鹏), Huang Wenqi (黄文奇) 《计算机科学》 (Computer Science) 2005,32(11):148-149
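The protein structure prediction problem is one of the core problems in computational biology. Predicting spatial structure directly from the amino acid sequence by theoretical computation is an effective way to address it. A new neighborhood structure is constructed and a partially random trap-escaping strategy is adopted, yielding a new local search algorithm for this problem. Computational results show that the algorithm is more efficient than the traditional genetic algorithm and Monte Carlo methods. For an instance with chain length 50, it also found a new lowest-energy conformation not previously reported in the literature.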

6.
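De novo prediction is an important approach to protein structure modeling; research on it helps in understanding protein function and thereby supports drug design and disease treatment. To improve prediction accuracy, this paper proposes a protein structure prediction algorithm based on contact-map residue-pair distance constraints (CDPSP). Built on an evolutionary algorithm framework, CDPSP divides conformational space sampling into an exploration stage and an enhancement stage. In the exploration stage, a mutation and selection strategy based on residue-pair distances is designed: residue pairs are selected according to the contact probabilities in the contact map, and the neighboring regions of the selected pairs are mutated by fragment assembly; residue-pair distances are discretized into several regions, each assigned an expected probability, and the expected probability determines whether a mutated conformation is selected, which increases population diversity. In the enhancement stage, a scoring metric based on contact-map information is combined with the energy function to assess conformation quality and select better conformations, strengthening CDPSP's sampling of the near-native region. To evaluate the algorithm, it was tested on 10 FM-group target proteins from CASP12 and compared with several state-of-the-art algorithms. Experimental results show that CDPSP can predict protein three-dimensional structure models with relatively high accuracy.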

7.
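Research Progress in Protein Structure Prediction Methods   Total citations: 11, self-citations: 0, citations by others: 11
Researchers are currently working to develop new methods for protein structure prediction. This paper introduces the methods of protein structure prediction and their progress, reviews several methods in detail, briefly describes the different stages of protein structure prediction, and points out some remaining difficulties in the field.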

8.
Protein Structure from Contact Maps: A Case-Based Reasoning Approach   Total citations: 1, self-citations: 0, citations by others: 1
Determining the three-dimensional structure of a protein is an important step in understanding biological function. Despite advances in experimental methods (crystallography and NMR) and protein structure prediction techniques, the gap between the number of known protein sequences and determined structures continues to grow. Approaches to protein structure prediction vary from those that apply physical principles to those that consider known amino acid sequences and previously determined protein structures. In this paper we consider a two-step approach to structure prediction: (1) predict contacts between amino acids using sequence data; (2) predict protein structure using the predicted contact maps. Our focus is on the second step of this approach. In particular, we apply a case-based reasoning framework to determine the alignment of secondary structures based on previous experiences stored in a case base, along with detailed knowledge of the chemical and physical properties of proteins. Case-based reasoning is founded on the premise that similar problems have similar solutions. Our hypothesis is that we can use previously determined structures and their contact maps to predict the structure for novel proteins from their contact maps. The paper presents an overview of contact maps along with the general principles behind our methodology of case-based reasoning. We discuss details of the implementation of our system and present empirical results using contact maps retrieved from the Protein Data Bank. Funding provided by: The Natural Science and Engineering Research Council (Ottawa); Institute for Robotics and Intelligent Systems (Ottawa); Protein Engineering Network Center of Excellence (Edmonton)
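For readers unfamiliar with contact maps, the sketch below shows one common way to derive a binary contact map from known 3D coordinates: two residues are marked as in contact when their representative atoms lie within a distance cutoff. The function name `contact_map`, the one-point-per-residue input, and the 8 Å default cutoff are assumptions for illustration; the abstract does not state which definition the authors use.

```python
import math


def contact_map(coords, threshold=8.0):
    """Binary contact map from one 3D point per residue.

    threshold is a distance cutoff in the same unit as coords
    (8.0 is an assumed, commonly quoted value, not taken from the paper).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                # The map is symmetric; the diagonal is left at 0.
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Three residues: the first two are 5 units apart, the third is far away.
example = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (30.0, 0.0, 0.0)]
for row in contact_map(example):
    print(row)
```

Going the other way, from a contact map back to a structure, is the hard step that the abstract addresses; this sketch covers only the forward direction.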

9.
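To address the problem of searching the high-dimensional conformational space of proteins, a replica-exchange-based local enhancement differential evolution method for de novo protein structure prediction (RLDE) is proposed. First, the knowledge-based Rosetta coarse-grained energy model is used to significantly reduce the number of optimization variables in the conformational space. Second, a fragment assembly technique based on fragment library knowledge is introduced to further shrink the conformational search space and effectively avoid entropy effects during the search. In addition, a conformation population is maintained at each replica layer; the population is updated by differential evolution and then locally enhanced by a Monte Carlo algorithm, yielding the global optimum and some local optima. In summary, RLDE uses the strong global search capability of differential evolution to search the conformational space effectively; it relies on the local search performance of the Monte Carlo algorithm to sample local minimum regions more thoroughly; and the replica exchange strategy preserves population diversity across replica layers while improving the algorithm's ability to escape local minima, further strengthening the search. Tests on 15 target proteins show that the proposed method samples the conformational space effectively and obtains high-accuracy near-native protein conformations.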
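The replica exchange step mentioned above can be illustrated with the textbook swap criterion: two replicas at different temperatures exchange conformations with probability min(1, exp((1/T_i - 1/T_j)(E_i - E_j))). The sketch below is that generic criterion, not the RLDE implementation; the function names are hypothetical and the Boltzmann constant is assumed to be folded into the temperature.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Acceptance probability for exchanging replicas i and j.

    Temperatures are in energy units (Boltzmann constant assumed to be 1).
    """
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the exchange is accepted for one random draw."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


# The cold replica (T=1) holds the worse conformation, so the swap always passes.
print(swap_probability(5.0, 1.0, 1.0, 2.0))
```

Swapping lets a conformation trapped at low temperature move to a hotter replica where barriers are easier to cross, which is the escape mechanism the abstract credits to replica exchange.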

10.
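Experiments were run on 7 data sets with multiple combinations of 3 clustering algorithms and 3 similarity metrics to assess how these factors affect clustering performance. To make the clustering parameters easier to determine, a cluster center selection algorithm for protein structure prediction is proposed. The results show that, among the 3 similarity metrics, RMSD gives the best clustering results, and among the 3 clustering algorithms, SPICKER performs best, followed by the AP clustering algorithm.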
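As a reminder of what the RMSD metric measures, the following minimal sketch computes the root-mean-square deviation between two conformations with matched points. It assumes the two structures have already been superimposed; real clustering pipelines normally perform an optimal superposition first, which is omitted here. The function name `rmsd` is an assumption.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized point lists.

    Assumes the structures are already superimposed (no alignment is done).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    if not coords_a:
        raise ValueError("structures must not be empty")
    squared = sum(
        sum((a - b) ** 2 for a, b in zip(point_a, point_b))
        for point_a, point_b in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Every point is shifted by one unit along x, so the deviation is one unit.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(rmsd(model_a, model_b))
```

A clustering algorithm would evaluate this value for pairs of candidate structures and group those whose pairwise RMSD is small.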

11.
Intrinsically disordered regions in proteins are relatively frequent and important for our understanding of molecular recognition and assembly, and protein structure and function. From an algorithmic standpoint, flagging large disordered regions is also important for ab initio protein structure prediction methods. Here we first extract a curated, non-redundant, data set of protein disordered regions from the Protein Data Bank and compute relevant statistics on the length and location of these regions. We then develop an ab initio predictor of disordered regions called DISpro which uses evolutionary information in the form of profiles, predicted secondary structure and relative solvent accessibility, and ensembles of 1D-recursive neural networks. DISpro is trained and cross validated using the curated data set. The experimental results show that DISpro achieves an accuracy of 92.8% with a false positive rate of 5%. DISpro is a member of the SCRATCH suite of protein data mining tools available through

12.
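Choosing suitable performance metrics for protein structure prediction algorithms is an important issue that directly affects how the algorithms are measured and compared. This paper analyzes and compares the current evaluation metrics, summarizes their respective strengths and weaknesses, examines how they relate to and differ from one another, and, in the context of neural network modeling, proposes the scope of application and usage principles for each metric.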

13.
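Protein Secondary Structure Prediction Based on a Cascaded Neural Network   Total citations: 3, self-citations: 1, citations by others: 3
To improve the accuracy of protein secondary structure prediction, a cascaded neural network model composed of two network layers is proposed. The first layer is a network model built from 5 mutually diverse subnetworks, and the input encoding of the second layer is improved. In tests on 36 proteins from PDBSelect25, totaling 6,122 residues, the model predicts protein secondary structure effectively: its accuracy is 5.31%, 1.21%, and 0.92% higher than that of the SNN, DSC, and PREDSATOR methods, respectively, and the average prediction accuracy rises to 69.61%.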

14.
Bioinformatics aims at applying computer science methods to the wealth of data collected in a variety of experiments in life sciences (e.g. cell and molecular biology, biochemistry, medicine, etc.) in order to help analysing such data and eliciting new knowledge from it. In addition to string processing, bioinformatics is often identified with machine learning used for mining the large banks of bio-data available in electronic format, namely in a number of web servers. Nevertheless, there are opportunities of applying other computational techniques in some bioinformatics applications. In this paper, we report the application of constraint programming to address two structural bioinformatics problems, protein structure prediction and protein interaction (docking). The efficient application of constraint programming requires innovative modelling of these problems, as well as the development of advanced propagation techniques (e.g. global reasoning and propagation), which were adopted in Chemera, a system that is currently used to support biochemists in their research.

15.
Model-based diagnosis and constraint-based reasoning are well-known generic paradigms for which the most difficult task lies in the construction of the models used. We consider the problem of localizing and correcting the errors in a model. We present a method to debug a model. To help the debugging task, we propose to use the model-based diagnosis solver. This method has been used in a real application: the development of a model of a railway signalling system.

16.
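The three-dimensional structure of a protein determines its specific biological function, and it is of major scientific importance for protein function research, disease diagnosis and treatment, and the development of innovative drugs. Using computational techniques to predict three-dimensional structure from the amino acid sequence is an effective way to obtain it. Fragment assembly is a widely used protein structure prediction technique that converts the continuous conformational space optimization problem into a discrete combinatorial optimization problem over experimental fragments, thereby effectively reducing the conformational search space. This paper first introduces the fragment assembly technique; it then summarizes the development of fragment-assembly-based protein structure prediction and briefly describes some representative methods; next, it introduces the databases and evaluation metrics commonly used in protein structure prediction research and compares the performance of different prediction methods; finally, it analyzes the open challenges facing current fragment-assembly-based methods and discusses future research directions in the field.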

17.
We derive fast algorithms for the following problem: given a set of n points on the real line and two parameters s and p, find s disjoint intervals of maximum total length that contain at most p of the given points. Our main contribution consists of algorithms whose time bounds improve upon a straightforward dynamic programming algorithm, in the relevant case that input size n is much bigger than parameters s and p. These results are achieved by selecting a few candidate intervals that are provably sufficient for building an optimal solution via dynamic programming. As a byproduct of this idea we improve an algorithm for a similar subsequence problem of Chen et al. [Disjoint segments with maximum density, in: International Workshop on Bioinformatics Research and Applications IWBRA 2005, (within ICCS 2005), Lecture Notes in Computer Science, vol. 3515, Springer, Berlin, pp. 845–850]. The problems are motivated by the search for significant patterns in biological data. Finally, we propose several heuristics that further reduce the time complexity in typical instances. One of them leads to an apparently open subsequence sum problem of independent interest.

18.
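A given amino acid sequence can have only one protein structure, so in theoretical protein structure prediction the keys are to define the energy function correctly and to choose an accurate computational search algorithm for finding the energy minimum. On this basis, assuming that the distribution of distances between residue pairs and the distribution of dihedral angles obey the Boltzmann law, this paper proposes an abstract continuous physical-mathematical model of protein three-dimensional structure. A tabu search algorithm is then applied to compute the main-chain trace of bovine insulin B(D), and the global optimum of lowest energy for the amino acid sequence is computed and compared.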

19.
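Predicting protein secondary structure from the amino acid sequence is an important step toward understanding protein structure and function. This paper explores a learning and prediction model for protein secondary structure based on Spiking neural networks. Learning with a single neural network yields no clear gains, whereas a cascaded neural network, through structure-to-structure learning, improves learning accuracy considerably.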

20.
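Predicting protein secondary structure from the amino acid sequence is an important step toward understanding protein structure and function. This paper explores a learning and prediction model for protein secondary structure based on Spiking neural networks. Learning with a single neural network yields no clear gains, whereas a cascaded neural network, through structure-to-structure learning, improves learning accuracy considerably.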
