Similar literature
 Found 18 similar documents (search time: 125 ms)
1.
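Understanding and predicting fluctuations of a protein's native conformation is very important for applications such as protein-protein docking and design. However, many backbone-flexibility methods produce relatively large backbone fluctuations. The Backrub model applies small perturbations to the backbone, consistent with the subtle conformational changes observed in high-resolution crystal structures. This paper proposes a Backrub-based model that perturbs the backbone and side chains in parallel and can simulate equivalent states of the native conformation. This parallel perturbation is closer to how protein conformations move in reality and reproduces experimental data better. In predictions for 10 point-mutation cases, compared with conformations produced by a serial random perturbation model, the parallel model not only generated conformations faster but also improved side-chain prediction accuracy.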

2.
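Fragment assembly is an important class of methods for ab initio prediction of protein three-dimensional structure. The quality of existing fragment libraries based on sequence similarity limits prediction accuracy for low-homology targets, so finding known protein structure fragments that fit the native structure more closely, in order to build a high-quality fragment library, is an important task for fragment assembly methods. This paper uses three-dimensional structural similarity in the SCOP database to predict the SCOP fold, extracts information from known protein structures that share the predicted fold, and generates a database that stores residue information (the Vall library). The residue fragments into which the target protein sequence is split are then scored against the Vall library to generate a new fragment library, which can be used in a parallel ant colony algorithm for backbone prediction. The method was compared with the sequence-based fragment library of the protein structure prediction program RosettaAbinitio; the experimental results show that the fragment library produced by this method can find protein structures closer to the native conformation.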

3.
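A genetic algorithm with perturbation mutation for protein side-chain placement   Total citations: 3 (self-citations: 0, citations by others: 3)
Building on a side-chain rotamer library, we propose a genetic algorithm with perturbation mutation for protein side-chain placement. It applies perturbation mutation alongside conventional single-point mutation, and therefore has characteristics of both the traditional genetic algorithm and theoretical search simulation methods. Using a root-mean-square-deviation energy function and the initial energy function respectively, we computed side-chain conformations for single proteins and for the binding regions of protein-protein complexes. The results show that the method outperforms the traditional genetic algorithm and can accurately predict protein side-chain conformations.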

4.
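Constructing the feature vector is a key problem in protein secondary structure prediction. Existing methods usually generate the PSSM using only the BLOSUM62 matrix and give little consideration to the amino acid residue mutations that occur during protein evolution. This paper proposes constructing protein feature vectors from multiple evolutionary matrices, fusing PSSMs that correspond to different evolutionary times. The resulting vector reflects not only the positional information of amino acids in the sequence but also the effect of mutations at amino acid sites during sequence evolution. Feature vectors are built by combining matrices of different degrees of evolution; logistic regression, random forest, and multi-class support vector machines are used as predictors; parameters are optimized by grid search and cross-validation; and several groups of experiments were run on the public datasets RS126, CB513, and 25PDB. The comparative results show that the proposed feature vector construction based on multiple evolutionary matrices effectively improves the accuracy of protein secondary structure prediction.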

5.
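In protein spatial structure prediction, determining the disulfide bonds can greatly reduce the conformational search space. To improve the accuracy of disulfide bond prediction, the secondary-structure propensities of disulfide-bonded cysteines and their surrounding amino acid residues were analyzed, and protein secondary structure information was added to the input encoding of a BP neural network prediction model. The study used 252 protein sequences selected from the SWISS-PROT database, randomly divided into 4 equal groups, and prediction accuracy was assessed with 4-fold cross-validation. Every accuracy measure improved markedly compared with the results obtained before secondary structure information was added. The results show that an encoding that incorporates protein secondary structure information is feasible and effective.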
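
The evaluation protocol in the abstract above (252 sequences, 4 random equal groups, 4-fold cross-validation) can be sketched as follows. This is only an illustration of the splitting step; the function names and the fixed random seed are assumptions, and the BP network itself is not reproduced here.

```python
import random

def k_fold_splits(n_items, k, seed=0):
    """Shuffle indices 0..n_items-1 and deal them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice for reproducibility
    return [indices[i::k] for i in range(k)]

def train_test_rounds(folds):
    """Yield (train, test) index lists, holding out each fold exactly once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

folds = k_fold_splits(252, 4)
for train, test in train_test_rounds(folds):
    # Each round trains on 189 sequences and tests on the remaining 63.
    print(len(train), len(test))
```

Averaging the accuracy over the four rounds gives the cross-validated estimate; every sequence is used for testing exactly once.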

6.
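Accurate prediction of protein function helps advance biomedicine. The rapid development of high-throughput sequencing has sped up the acquisition of protein sequences, producing a large number of unannotated proteins, and newly sequenced sequences lack structural and other biological information. To address this problem, a protein function prediction model based on sequences and combined graph convolutional networks is proposed (Protein Function Prediction using Sequences and Combined Graph Convolutional Networks, PFP-SCGCN). First, a deep learning method captures multi-dimensional feature information from the protein sequence. Next, evolutionary coupling information and amino acid residue communities are extracted from the sequence through multiple sequence alignment. These are then used to generate two adjacency matrices with different degrees of connectivity between the amino acids of the sequence, and the two adjacency matrices are fed, together with the sequence feature information, into the combined graph convolutional network for information fusion. Finally, several fully connected layers produce the protein function class. The paper also identifies protein functional sites by analyzing specific network layers of PFP-SCGCN, which can help infer the important amino acids in new sequences. The results show that the function prediction accuracy of PFP-SCGCN is far higher than that of the compared methods, that the model is fairly robust, and that it identifies functional sites fairly accurately.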

7.
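The three-dimensional structure of a protein determines its specific biological function, and it is of great scientific importance for protein function research, disease diagnosis and treatment, and innovative drug development. Using computational techniques to predict the three-dimensional structure from the amino acid sequence is an effective way to obtain it. Fragment assembly is a widely used protein structure prediction technique: it converts the continuous conformational-space optimization problem into a discrete combinatorial optimization problem over experimental fragments, which effectively reduces the conformational search space. This paper first introduces the fragment assembly technique. It then summarizes the development of fragment-assembly-based protein structure prediction and briefly describes some representative methods. Next, it introduces the databases and evaluation metrics commonly used in protein structure prediction research and compares the performance of different prediction methods. Finally, it analyzes the challenges facing current fragment-assembly-based protein structure prediction methods and looks ahead to future research directions in the field.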

8.
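Ab initio prediction is an important approach to protein structure modeling, and research on it helps in understanding protein function and, in turn, in drug design and disease treatment. To improve prediction accuracy, this paper proposes a protein structure prediction algorithm based on residue-pair distance constraints from contact maps (CDPSP). Built on an evolutionary algorithm framework, CDPSP divides conformational space sampling into an exploration stage and an enhancement stage. In the exploration stage, a mutation and selection strategy based on residue-pair distance is designed: residue pairs are selected according to the contact probabilities in the contact map, and the region around the selected pairs is mutated by fragment assembly; the residue-pair distance is discretized into several regions, each assigned an expected probability, and the expected probability determines whether the mutated conformation is selected, which increases population diversity. In the enhancement stage, a scoring metric based on contact-map information is combined with an energy function to assess conformation quality and select the better conformations, which strengthens CDPSP's sampling of the near-native region. To validate the algorithm, it was tested on 10 FM-group target proteins from CASP12 and compared with several state-of-the-art algorithms. The experimental results show that CDPSP can predict protein three-dimensional structure models of relatively high accuracy.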

9.
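The protein folding problem is to predict a protein's conformation from its amino acid sequence, and it is a prominent problem in bioinformatics. This work studies the two-dimensional HP lattice model, a representative simplified model for simulating protein folding, and applies an ant colony algorithm to the two-dimensional HP protein folding problem. In addition, an improved pull-move method is introduced into the local search mechanism as an effective way of improving protein conformations. The experimental results show that, for longer amino acid sequences, the improved ant colony algorithm with pull moves (ACO+) obtains lower-energy conformations than ACO, which demonstrates that the proposed improved ant colony algorithm is an effective method for predicting protein structure.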
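
For readers unfamiliar with the two-dimensional HP lattice model mentioned above, the sketch below scores a conformation using the standard HP convention of -1 for every pair of H residues that are lattice neighbours but not consecutive in the chain. The move encoding (`U`, `D`, `L`, `R`) and the function name are assumptions made for this illustration; the ant colony search and the pull moves from the paper are not reproduced.

```python
def hp_energy(sequence, moves):
    """Energy of a 2D HP conformation: -1 per non-bonded H-H lattice contact.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    moves: string over 'U', 'D', 'L', 'R' of length len(sequence) - 1
           (hypothetical encoding of the path on the square lattice).
    Returns None when the path revisits a lattice site (not self-avoiding).
    """
    step = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
    positions = [(0, 0)]
    for move in moves:
        dx, dy = step[move]
        x, y = positions[-1]
        positions.append((x + dx, y + dy))
    if len(set(positions)) != len(positions):
        return None
    index_at = {pos: i for i, pos in enumerate(positions)}
    energy = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy

# A four-residue chain folded into a square brings its two ends into contact.
print(hp_energy("HPPH", "URD"))
```

A search method such as ACO would call a function like this on each candidate conformation and prefer the lowest value.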

10.
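Because interactions between different types of amino acids affect protein structure prediction differently, this paper combines convolutional neural network and long short-term memory network models into a convolutional long short-term memory neural network and applies it to 8-class protein secondary structure prediction. The protein sequence is first represented using the class information of the amino acid sequence and the evolutionary information of amino acid structure, and convolution is used to extract local correlation features between amino acid residues. A bidirectional long short-term memory network then extracts long-range interactions between residues within the sequence. Finally, the extracted local correlation features and long-range interactions are used to predict the 8-class secondary structure. Experiments show that, compared with baseline methods, the model improves 8-class secondary structure prediction accuracy and scales well.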

11.
Protein structure prediction is currently one of the main open challenges in Bioinformatics. The protein contact map is a useful, and commonly used, representation for protein 3D structure and represents binary proximities (contact or non-contact) between each pair of amino acids of a protein. In this work, we propose a multi-objective evolutionary approach for contact map prediction based on physico-chemical properties of amino acids. The evolutionary algorithm produces a set of decision rules that identifies contacts between amino acids. The rules obtained by the algorithm impose a set of conditions based on amino acid properties to predict contacts. We present results obtained by our approach on four different protein data sets. A statistical study was also performed to extract valid conclusions from the set of prediction rules generated by our algorithm. Results obtained confirm the validity of our proposal.
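
To make the contact-map representation concrete, here is a minimal sketch that derives a binary map from known residue coordinates. The 8 Å distance threshold, the use of one representative atom per residue, and the function name are assumptions for illustration; the abstract does not state which contact definition the authors use, and the sketch builds a reference map from a structure rather than predicting one.

```python
import math

def contact_map(coords, threshold=8.0, min_separation=1):
    """Binary contact map from one 3D point per residue.

    coords: list of (x, y, z) tuples, e.g. C-alpha positions in angstroms.
    threshold: distance cutoff for a contact (assumed value, not from the paper).
    min_separation: smallest sequence separation |i - j| that is considered.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1  # the map is symmetric
    return cmap

# Four residues spaced 3.8 angstroms apart on a straight line.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
for row in contact_map(chain):
    print(row)
```

A predicted map can then be compared entry by entry against a reference map computed this way.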

12.
Precise prediction of protein secondary structures from the associated amino acid sequence is of great importance in bioinformatics and yet a challenging task for machine learning algorithms. As a major step toward predicting the ultimate three dimensional structures, the secondary structure assignment specifies the protein function. Considering a multilayer perceptron neural network, pruned for optimum size of hidden layers, as the reference network, advanced kinds of recurrent neural network (RNN) are devised in this article to enhance the secondary structure prediction. To better model the strong correlations between secondary structure elements, types of modular reciprocal recurrent neural networks (MRR-NN) are examined. Additionally, to take into account the long-range interactions between amino acids in formation of the secondary structure, bidirectional RNN are investigated. A multilayer bidirectional recurrent neural network (MBR-NN) is finally applied to capture the predominant long-term dependencies. Eventually, a modular prediction system based on the interactive combination of the MRR-NN and MBR-NN boosts the percentage accuracy (Q3) up to 76.91% and augments the segment overlap (SOV) up to 68.13% when tested on the PSIPRED dataset. The coupling effects of the secondary structure types as well as the sequential information of amino acids along the protein chain can be well cast by the integration of the MRR-NN and the MBR-NN.
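
The Q3 figure quoted above is the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. A minimal sketch follows, assuming the states are written as the letters `H`, `E`, and `C`; the function name is hypothetical, and the more involved SOV measure is not shown.

```python
def q3(predicted, observed):
    """Percentage of positions where the predicted three-state label matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Seven of the eight residues agree.
print(q3("HHHEECCC", "HHHEEECC"))  # 87.5
```

Because Q3 counts residues independently, it does not reward getting whole segments right, which is why segment-based scores such as SOV are usually reported alongside it.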

13.
Recognition of protein folding patterns is an important step in protein structure and function predictions. The traditional sequence similarity-based approach fails to yield convincing predictions when proteins have low sequence identities, while the taxonometric approach is a reliable alternative. From a pattern recognition perspective, protein fold recognition involves a large number of classes with only a small number of training samples, and multiple heterogeneous feature groups derived from different propensities of amino acids. This raises the need for a classification method that is able to handle the data complexity with a high prediction accuracy for practical applications. To this end, a novel ensemble classifier, called MarFold, is proposed in this paper which combines three margin-based classifiers for protein fold recognition. The effectiveness of our method is demonstrated with the benchmark D-B dataset with 27 classes. The overall prediction accuracy obtained by MarFold is 71.7%, which surpasses the existing fold recognition methods by 3.1–15.7%. Moreover, one component classifier for MarFold, called ALH, has obtained a prediction accuracy of 65.5%, which is 4.7–9.5% higher than the prediction accuracies for the published methods using single classifiers. Additionally, the feature set of pairwise frequency information about the amino acids, which is adopted by MarFold, is found to be important for discriminating folding patterns. These results imply that the MarFold method and its operation engine ALH might become useful vehicles for protein fold recognition, as well as other bioinformatics tasks. The MarFold method and the datasets can be obtained from: (http://www-staff.it.uts.edu.au/~lbcao/publication/MarFold.7z).

14.
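Protein secondary structure type prediction based on cellular automata images   Total citations: 1 (self-citations: 0, citations by others: 1)
Protein structure prediction is an important task of the post-genomic era, and protein secondary structure prediction is a key step in it. A numeric coding model of amino acids is used to generate a Cellular Automata Image (CAI) of the protein sequence, and a method is proposed for extracting image texture features based on the Gray Level Co-occurrence Matrix (GLCM). Prediction is performed with an augmented covariance algorithm. The simulation results show good classification performance, with a prediction success rate of 94.61% in the jackknife test.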
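
As background for the GLCM step named in the abstract above, the sketch below counts gray-level co-occurrences for one pixel offset and derives the contrast feature from them. The tiny test image, the horizontal offset, and the choice of contrast as the example feature are assumptions; the abstract does not say which offsets or texture features the authors use, and generating the cellular automata image itself is not shown.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count pairs (a, b) where gray level b lies at offset (dx, dy) from a.

    image: list of rows, each a list of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def contrast(counts):
    """Contrast texture feature: sum of (i - j)^2 weighted by normalized counts."""
    total = sum(sum(row) for row in counts)
    weighted = sum((i - j) ** 2 * n
                   for i, row in enumerate(counts)
                   for j, n in enumerate(row))
    return weighted / total

binary_image = [[0, 0, 1],
                [1, 1, 0]]
matrix = glcm(binary_image, levels=2)
print(matrix)            # [[1, 1], [1, 1]]
print(contrast(matrix))  # 0.5
```

Features computed this way from several offsets would form the texture vector passed to a classifier.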

15.
Daniel Varela, José Santos. Natural Computing, 2019, 18(2): 275-284

This paper proposes to model protein folding as an emergent process, using machine learning to infer the folding model only from information about known protein structures. Using the face-centered cubic lattice for protein conformation representation, the dynamic nature of protein folding is captured with an evolved neural cellular automaton that defines the amino acid moves along the protein chain and across time. The results of the final folded conformations are compared, using different protein benchmarks, with other methods used in the traditional protein structure prediction problem, highlighting the capabilities and problems found with this modeling.


16.
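The polyproline II helix is a special and rare protein secondary structure. To save the time and cost of determining this structure experimentally, this paper designs a deep learning algorithm based on a convolutional neural network to predict polyproline II helices. First, the protein sequence information is feature-encoded to produce a feature matrix; the encodings include amino acid orthogonal codes, amino acid physicochemical properties, and position-specific scoring matrices. Next, the normalized feature matrix is fed into the convolutional neural network, which automatically extracts local deep features of the protein sequence and outputs the polyproline II helix prediction. The experimental results show that the algorithm performs markedly better than six traditional machine learning algorithms such as support vector machines.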
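
Of the three encodings listed above, the orthogonal (one-hot) code is the simplest to illustrate. The sketch below assumes the 20 standard amino acids in alphabetical one-letter order and maps any other symbol to an all-zero row; both choices, and the function name, are assumptions rather than details taken from the paper.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed column order: 20 standard residues

def one_hot(sequence):
    """Encode a protein sequence as a len(sequence) x 20 orthogonal matrix."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    matrix = []
    for aa in sequence:
        row = [0] * len(AMINO_ACIDS)
        if aa in index:  # unknown symbols such as 'X' stay all-zero
            row[index[aa]] = 1
        matrix.append(row)
    return matrix

encoded = one_hot("PPG")
print(len(encoded), len(encoded[0]))  # 3 20
```

In a full pipeline, columns for physicochemical properties and PSSM scores would be concatenated to each row before normalization.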

17.
Protein structure prediction (PSP) is an open problem with many useful applications in disciplines such as medicine, biology and biochemistry. As this problem presents a vast search space and the analysis of each protein structure requires a significant amount of computing time, it is necessary to take advantage of high-performance parallel computing platforms as well as to define efficient search procedures in the space of possible protein conformations. In this paper we compare two parallel procedures for PSP which are based on different multi-objective optimization approaches, i.e. PAES (Knowles and Corne in Proc. Congr. Evol. Comput. 1:98–105, 1999) and NSGA2 (Deb et al. in IEEE Trans. Evol. Comput. 6:182–197, 2002). Although both procedures include techniques to take advantage of known protein structures and strategies to simplify the search space through the so-called rotamer library and adaptive mutation operators, they present different profiles with respect to their implicit parallelism.

18.
J. Atkins, W. E. Hart. Algorithmica, 1999, 25(2-3): 279-294
We describe a proof of NP-hardness for a lattice protein folding model whose instances contain protein sequences defined with a fixed, finite alphabet that contains 12 amino acid types. This lattice model represents a protein's conformation as a self-avoiding path that is embedded on the three-dimensional cubic lattice. A contact potential is used to determine the energy of a sequence in a given conformation; a pair of amino acids contributes to the conformational energy only if they are adjacent on the lattice. This result overcomes a significant weakness of previous intractability results, which do not examine protein folding models that have a finite alphabet of amino acids together with physically interesting conformations. Received June 1, 1997; revised March 13, 1998.
