Found 20 similar documents; search took 15 ms
1.
The Protein Structure Prediction Problem: A Constraint Optimization Approach using a New Lower Bound (cited by: 1; self-citations: 0; citations by others: 1)
Rolf Backofen 《Constraints》2001,6(2-3):223-255
The protein structure prediction problem is one of the most (if not the most) important problems in computational biology. This problem consists of finding the conformation of a protein with minimal energy. Because of the complexity of this problem, simplified models like Dill's HP-lattice model [15], [16] have become a major tool for investigating general properties of protein folding. Even for this simplified model, the structure prediction problem has been shown to be NP-complete [5], [7]. We describe a constraint formulation of the HP-model structure prediction problem, and present the basic constraints and search strategy. Of course, the simple formulation would not lead to an efficient algorithm. We therefore describe redundant constraints to prune the search tree. Furthermore, we need a bounding function for the energy of an HP-protein. We introduce a new lower bound based on partial knowledge about the final conformation (namely the distribution of H-monomers to layers).
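As an illustration of the energy that the HP model minimizes, here is a minimal Python sketch. It assumes the simple cubic lattice and the usual scoring of -1 for each pair of H-monomers that are lattice neighbours but not consecutive in the chain. The function name `hp_energy` and the input format are hypothetical, and this is not the constraint formulation or the lower bound described in the abstract.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def manhattan(a: Coord, b: Coord) -> int:
    """Manhattan distance between two lattice points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Energy of a conformation on the cubic lattice under the HP model.

    Assumption: each H-H pair that is adjacent on the lattice but not
    consecutive in the chain contributes -1; all other pairs contribute 0.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    # Consecutive monomers must occupy neighbouring lattice points.
    for a, b in zip(coords, coords[1:]):
        if manhattan(a, b) != 1:
            raise ValueError("chain is not connected on the lattice")

    energy = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # skip chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                if manhattan(coords[i], coords[j]) == 1:
                    energy -= 1
    return energy


# A four-monomer chain folded into a unit square: the two end monomers touch.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(hp_energy("HPPH", square))
```

A search procedure, constraint-based or otherwise, would look for the self-avoiding walk that minimizes this value for a given sequence.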
2.
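With the arrival of the post-genomic era, the volume of protein data has grown rapidly. To predict protein structure accurately, a deep learning method is proposed for the protein secondary structure classification problem. Features are extracted from protein sequences using a pseudo amino acid composition made up of approximate entropy, hydrophobic patterns, and image features. The prediction model uses a 5-layer deep Boltzmann machine (DBM) plus a classification layer; the 5 DBM layers form 4 RBMs, and the classification layer is a softmax classifier. Both unsupervised and supervised learning are used as training strategies for the model. Compared with existing prediction methods, the proposed method achieves higher accuracy than both support vector machines (SVM) and artificial neural networks (ANN), which are currently among the better approaches. Experimental results show that the proposed improved method is feasible and effective.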
3.
The explosive accumulation of protein sequences in the wake of large-scale sequencing projects is in sharp contrast to the much slower experimental determination of protein structures. Neural networks have been successfully applied to the prediction of protein structures, and prediction accuracy continues to rise. This paper introduces the basic methods and technologies for predicting protein secondary structure with neural networks, and expounds in particular on the two aspects that have driven the rise in prediction accuracy: improvements to neural network architecture and the addition of "evolutionary" information.
4.
5.
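The protein structure prediction problem is one of the core problems in computational biology. Predicting the spatial structure of a protein directly from its amino acid sequence by theoretical computation is an effective way to address it. A new neighborhood structure is constructed and a partially random strategy for escaping local minima is adopted, yielding a new local search algorithm for this problem. Computational results show that the algorithm is more efficient than the traditional genetic algorithm and the Monte Carlo method. For an instance with chain length 50, a new lowest-energy conformation not reported in the literature was also found.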
6.
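Ab initio prediction is an important approach to protein structure modeling; research on it helps us understand protein function and thereby supports drug design and disease treatment. To improve prediction accuracy, this paper proposes a protein structure prediction algorithm based on contact-map residue-pair distance constraints (CDPSP). Built on an evolutionary algorithm framework, CDPSP divides conformational space sampling into an exploration stage and an enhancement stage. In the exploration stage, a mutation and selection strategy based on residue-pair distance is designed: residue pairs are selected according to the contact probabilities in the contact map, and the neighboring regions of the selected pairs are mutated by fragment assembly; the residue-pair distance is discretized into several regions, each assigned an expected probability, and the expected probability determines whether a mutated conformation is selected, which increases population diversity. In the enhancement stage, a scoring metric based on contact-map information is combined with the energy function to assess conformation quality, so that better conformations are selected and CDPSP's ability to sample near-native regions is strengthened. To verify the performance of the proposed algorithm, it was tested on 10 FM-group target proteins from CASP12 and compared with several state-of-the-art algorithms. Experimental results show that CDPSP can predict three-dimensional protein structure models with relatively high accuracy.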
7.
Research Progress in Protein Structure Prediction Methods (cited by: 11; self-citations: 0; citations by others: 11)
殷志祥 《计算机工程与应用》2004,40(20):54-57
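Considerable effort is currently going into developing new methods for protein structure prediction. This paper introduces the methods of protein structure prediction and the progress made. It reviews several methods in detail, briefly describes the different stages of protein structure prediction, and points out some of the difficulties that remain in the field.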
8.
Determining the three-dimensional structure of a protein is an important step in understanding biological function. Despite
advances in experimental methods (crystallography and NMR) and protein structure prediction techniques, the gap between the
number of known protein sequences and determined structures continues to grow.
Approaches to protein structure prediction vary from those that apply physical principles to those that consider known amino
acid sequences and previously determined protein structures. In this paper we consider a two-step approach to structure prediction:
(1) predict contacts between amino acids using sequence data; (2) predict protein structure using the predicted contact maps.
Our focus is on the second step of this approach. In particular, we apply a case-based reasoning framework to determine the
alignment of secondary structures based on previous experiences stored in a case base, along with detailed knowledge of the
chemical and physical properties of proteins. Case-based reasoning is founded on the premise that similar problems have similar
solutions. Our hypothesis is that we can use previously determined structures and their contact maps to predict the structure
for novel proteins from their contact maps.
The paper presents an overview of contact maps along with the general principles behind our methodology of case-based reasoning.
We discuss details of the implementation of our system and present empirical results using contact maps retrieved from the
Protein Data Bank.
Funding provided by: The Natural Science and Engineering Research Council (Ottawa); Institute for Robotics and Intelligent
Systems (Ottawa); Protein Engineering Network Center of Excellence (Edmonton)
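To make the notion of a contact map concrete, the following Python sketch builds a binary contact map from a list of 3D residue coordinates. The 8.0 distance threshold, the use of one representative point per residue, and the function name `contact_map` are assumptions chosen for illustration; the abstract does not state which definition the authors use.

```python
import math
from typing import List, Sequence


def contact_map(coords: Sequence[Sequence[float]],
                threshold: float = 8.0) -> List[List[int]]:
    """Return a symmetric 0/1 matrix marking residue pairs in contact.

    Assumption: two distinct residues are "in contact" when the Euclidean
    distance between their representative points is at most `threshold`.
    The diagonal is left at 0.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Three residues on a line: the first two are close, the third is far away.
example = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
for row in contact_map(example):
    print(row)
```

The second step of the approach described above works in the opposite direction, recovering a structure from such a matrix, which is the harder problem.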
9.
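To address the problem of searching the high-dimensional conformational space of proteins, a replica-exchange-based, locally enhanced differential evolution method for ab initio protein structure prediction (RLDE) is proposed. First, the knowledge-based Rosetta coarse-grained energy model is used to significantly reduce the number of optimization variables in conformational space. Second, fragment assembly based on fragment-library knowledge is introduced to further shrink the conformational search space and effectively avoid entropy effects during the search. In addition, a population of conformations is maintained in each replica layer; the population is updated with a differential evolution algorithm and then locally enhanced with a Monte Carlo algorithm, yielding the global optimum and some local optima. In summary, RLDE uses the strong global search ability of differential evolution to search conformational space effectively at the global level, and uses the local search performance of Monte Carlo to sample local-minimum regions of conformational space more thoroughly. The replica exchange strategy preserves population diversity within the replica layers and strengthens the algorithm's ability to escape local minima, further improving its search of conformational space. Test results on 15 target proteins show that the proposed method samples conformational space effectively and obtains high-accuracy, near-native protein conformations.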
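The replica exchange step mentioned above can be sketched with the standard Metropolis swap criterion used in parallel tempering. This is a generic textbook form, assuming energies in units where the Boltzmann constant is 1; the function name `swap_probability` is hypothetical, and the abstract does not say which exchange rule RLDE itself uses.

```python
import math


def swap_probability(energy_i: float, energy_j: float,
                     temp_i: float, temp_j: float) -> float:
    """Probability of accepting a swap between two replicas.

    Standard parallel-tempering rule (k_B = 1, an assumption):
        P = min(1, exp((1/T_i - 1/T_j) * (E_i - E_j)))
    """
    delta = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    if delta >= 0.0:
        return 1.0
    return math.exp(delta)


# The colder replica (T = 1) holds the higher-energy conformation,
# so handing it to the hotter replica is always accepted.
print(swap_probability(5.0, 3.0, 1.0, 2.0))
# The reverse situation is accepted only with probability exp(-1).
print(swap_probability(3.0, 5.0, 1.0, 2.0))
```

In a full sampler, a uniform random number would be compared against this probability to decide whether two neighbouring temperature layers exchange conformations.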
10.
11.
Intrinsically disordered regions in proteins are relatively frequent and important for our understanding of molecular recognition
and assembly, and protein structure and function. From an algorithmic standpoint, flagging large disordered regions is also
important for ab initio protein structure prediction methods. Here we first extract a curated, non-redundant, data set of protein disordered regions
from the Protein Data Bank and compute relevant statistics on the length and location of these regions. We then develop an
ab initio predictor of disordered regions called DISpro which uses evolutionary information in the form of profiles, predicted secondary
structure and relative solvent accessibility, and ensembles of 1D-recursive neural networks. DISpro is trained and cross validated
using the curated data set. The experimental results show that DISpro achieves an accuracy of 92.8% with a false positive
rate of 5%. DISpro is a member of the SCRATCH suite of protein data mining tools available through
12.
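Choosing appropriate performance metrics for protein structure prediction algorithms is an important issue that directly affects how the algorithms are measured and compared. This paper analyzes and compares the metrics currently in use, summarizes their respective strengths and weaknesses, examines how they relate to and differ from one another, and, in the context of neural network modeling, proposes the scope of application and usage principles for each metric.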
13.
14.
Bioinformatics aims at applying computer science methods to the wealth of data collected in a variety of experiments in life
sciences (e.g. cell and molecular biology, biochemistry, medicine, etc.) in order to help analysing such data and eliciting
new knowledge from it. In addition to string processing, bioinformatics is often identified with machine learning used for
mining the large banks of bio-data available in electronic format, namely in a number of web servers. Nevertheless, there
are opportunities of applying other computational techniques in some bioinformatics applications. In this paper, we report
the application of constraint programming to address two structural bioinformatics problems, protein structure prediction
and protein interaction (docking). The efficient application of constraint programming requires innovative modelling of these
problems, as well as the development of advanced propagation techniques (e.g. global reasoning and propagation), which were
adopted in Chemera, a system that is currently used to support biochemists in their research.
15.
Model-based diagnosis and constraint-based reasoning are well-known generic paradigms for which the most difficult task lies
in the construction of the models used. We consider the problem of localizing and correcting the errors in a model. We present
a method to debug a model. To help the debugging task, we propose to use the model-based diagnosis solver. This method has
been used in a real application: the development of a model of a railway signalling system.
16.
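The three-dimensional structure of a protein determines its specific biological function, and it is of great scientific significance for protein function research, disease diagnosis and treatment, and the development of innovative drugs. Using computational techniques to predict three-dimensional structure from the amino acid sequence is an effective way to obtain it. Fragment assembly is a widely used protein structure prediction technique: it converts the continuous conformational space optimization problem into a discrete combinatorial optimization problem over experimental fragments, thereby effectively reducing the conformational search space. This paper first introduces fragment assembly; next, it summarizes the development of fragment-assembly-based protein structure prediction and briefly describes some representative methods; it then introduces the databases and evaluation metrics commonly used in protein structure prediction research and compares the performance of different prediction methods; finally, it analyzes the open challenges facing current fragment-assembly-based methods and discusses future research directions in the field.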
17.
We derive fast algorithms for the following problem: given a set of n points on the real line and two parameters s and p, find s disjoint intervals of maximum total length that contain at most p of the given points. Our main contribution consists of algorithms whose time bounds improve upon a straightforward dynamic programming algorithm, in the relevant case that input size n is much bigger than parameters s and p. These results are achieved by selecting a few candidate intervals that are provably sufficient for building an optimal solution via dynamic programming. As a byproduct of this idea we improve an algorithm for a similar subsequence problem of Chen et al. [Disjoint segments with maximum density, in: International Workshop on Bioinformatics Research and Applications IWBRA 2005, (within ICCS 2005), Lecture Notes in Computer Science, vol. 3515, Springer, Berlin, pp. 845–850]. The problems are motivated by the search for significant patterns in biological data. Finally, we propose several heuristics that further reduce the time complexity in typical instances. One of them leads to an apparently open subsequence sum problem of independent interest.
18.
WANG Jian 《数字社区&智能家居》2008,(27)
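A given amino acid sequence can have only one protein structure, so in theoretical protein prediction the keys to structure prediction are defining the energy function correctly and choosing an accurate computer search algorithm to find the energy minimum. On this basis, assuming that the distribution of pairwise inter-residue distances and the distribution of dihedral angles follow the Boltzmann law, this paper proposes an abstract, continuous physical-mathematical model of protein three-dimensional structure. The tabu search algorithm is then applied to compute the main-chain trace of bovine insulin B(D), and the global optimum of lowest energy for the amino acid sequence is computed and compared.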
19.
沈虹 《数字社区&智能家居》2007,(11):683-684
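Predicting protein secondary structure from the amino acid sequence is an important step toward understanding protein structure and function. This paper explores a learning and prediction model for protein secondary structure based on spiking neural networks. Learning with a single neural network yields no clear improvement, whereas cascaded neural networks, through structure-to-structure learning, substantially improve learning accuracy.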
20.
沈虹 《数字社区&智能家居》2007,(21)
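Predicting protein secondary structure from the amino acid sequence is an important step toward understanding protein structure and function. This paper explores a learning and prediction model for protein secondary structure based on spiking neural networks. Learning with a single neural network yields no clear improvement, whereas cascaded neural networks, through structure-to-structure learning, substantially improve learning accuracy.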