Similar articles
 Found 17 similar articles (search time: 140 ms)
1.
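Research on protein-protein interaction sites has important applications in protein function analysis and drug design. Taking the amino acid residues of a protein as the objects of study, this article uses three residue features (solvent-accessible surface area, evolutionary conservation score, and sequence information entropy) as the feature set and builds a Bayesian classifier for predicting protein-protein interaction sites. The method combines the fact that residue feature data sets frequently contain missing values with the strength of Bayesian networks in handling uncertain data. Experiments on a benchmark set of 71 proteins indicate that the classifier's predictions are effective.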
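To make the missing-data point concrete, here is a minimal sketch of a classifier that simply skips absent feature values when scoring a residue. It is a Gaussian naive Bayes model, which is a deliberately simpler stand-in for the Bayesian network described in the abstract, not the authors' method. The function names, the `None` convention for missing values, and the variance smoothing constant are all assumptions made for illustration.

```python
import math

def fit_gaussian_nb(rows, labels, smoothing=1e-3):
    """Fit per-class priors and per-feature (mean, variance) pairs.

    rows: list of feature lists; None marks a missing value (assumed convention).
    """
    model = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        stats = []
        for j in range(len(rows[0])):
            # Estimate each feature only from the values that are present.
            vals = [r[j] for r in members if r[j] is not None]
            if not vals:
                stats.append((0.0, 1.0))  # no data: fall back to a broad default
                continue
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals) + smoothing
            stats.append((mean, var))
        model[label] = (len(members) / len(rows), stats)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior, ignoring missing features."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            if value is None:
                continue  # a missing feature contributes nothing to the score
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Each row would hold the three features named in the abstract (surface area, conservation score, entropy), with `None` wherever a value could not be computed.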

2.
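Identifying interface residues in protein-protein interactions has wide application in areas such as drug design and the study of metabolism. Taking into account the naive Bayes classifier's requirement that attributes be conditionally independent, a feature model of protein-protein interaction was built from protein sequence profiles and solvent-accessible surface area. On a representative data set of protein heterocomplexes, the method achieved an accuracy of 68.1%, a correlation coefficient of 0.201, a specificity of 40.2%, and a sensitivity of 49.9%, which is better than other methods and far better than random prediction. A three-dimensional visualization of the results further supports the effectiveness of the method.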
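The four reported figures can all be derived from a binary confusion matrix. The sketch below shows one common way to compute them. The abstract does not define its terms, so two points are assumptions: the correlation coefficient is taken to be the Matthews correlation coefficient, and both readings of "specificity" are returned, because some interface-prediction papers use the word for TP/(TP+FP) rather than the conventional TN/(TN+FP). The function name is hypothetical.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute common evaluation measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # recall on interface residues
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # conventional definition
    precision = tp / (tp + fp) if (tp + fp) else 0.0     # called "specificity" in some papers
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0  # Matthews correlation coefficient
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }
```

A correlation coefficient of 0 corresponds to random prediction, which is why a value such as 0.201 can be described as better than chance even when accuracy looks modest.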

3.
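A Bayesian network learning method for small data sets and its application   Cited: 1 (self-citations: 1, citations by others: 0)
An effective algorithm, FCLBN, is proposed for learning Bayesian networks from small data sets. FCLBN uses the bootstrap method to resample the given small data set, uses the Bayesian networks learned on the resampled data sets to estimate high-confidence features of the Bayesian network on the original data set, and then uses these features to guide the Bayesian network search on the original data set. The effectiveness of FCLBN was verified on standard data sets, and FCLBN was applied to predicting protein localization in yeast cells. The experimental results show that FCLBN can learn good network models from small data sets.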
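The bootstrap step can be sketched as follows: resample the data with replacement, run a structure learner on each resample, and record how often each structural feature (here, an edge) appears. This is only an illustration of the general idea, not the FCLBN algorithm itself. `bootstrap_edge_confidence` and `toy_learn_edges` are hypothetical names, and the toy learner (which links two binary variables when they agree often enough) stands in for a real Bayesian network learner.

```python
import random
from collections import Counter

def bootstrap_edge_confidence(rows, learn_edges, n_resamples=100, seed=0):
    """Estimate how often each edge is learned across bootstrap resamples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_resamples):
        # Draw a resample of the same size as the data, with replacement.
        sample = [rng.choice(rows) for _ in rows]
        counts.update(set(learn_edges(sample)))
    return {edge: c / n_resamples for edge, c in counts.items()}

def toy_learn_edges(rows, min_agreement=0.8):
    """Placeholder learner: link variables i and j if they agree in most rows."""
    n_vars = len(rows[0])
    edges = []
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            agreement = sum(r[i] == r[j] for r in rows) / len(rows)
            if agreement >= min_agreement:
                edges.append((i, j))
    return edges
```

Edges whose confidence exceeds a chosen threshold could then be treated as constraints or priors in a search on the original data, which is the role the abstract assigns to the high-confidence features.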

4.
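Protein-ligand interactions are ubiquitous and indispensable in life processes and play a very important role in biomolecular recognition and signal transduction. Identifying the binding residues involved in protein-ligand interactions is of great scientific significance for protein function research and for drug design and screening, and computational methods are an important means of predicting these binding residues. This paper first gives a general definition of protein-ligand binding residues. Second, it proposes a taxonomy of methods for predicting protein-ligand binding residues and briefly describes some representative methods. Third, it presents the databases and evaluation metrics commonly used in this research and compares the performance of representative prediction methods through experiments on relevant data sets. Finally, it analyzes several challenging problems and forecasts future research directions in the field, in the hope of providing a reference for related research.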

5.
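Predicting the subcellular localization of apoptosis proteins is an important way to study their biological functions and an important area of bioinformatics research. Improving the accuracy and practicality of prediction models is the focus of this research. This study proposes an ensemble classification algorithm that uses the fuzzy K-nearest-neighbor classifier as its base classifier. The basic feature set of a protein sequence is represented by the dipeptide composition at different intervals within the sequence, and binary particle swarm optimization is used as the feature selection method to extract effective protein sequence features. The selected features serve as the input vector of each base classifier in the ensemble. In jackknife tests on two commonly used data sets, the algorithm achieved a prediction accuracy of 91.5% on the CL317 data set and 88.0% on the ZW225 data set. Compared with previously reported prediction algorithms for apoptosis protein subcellular localization that use the same data sets, the method achieved better accuracy.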
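The "dipeptide composition at different intervals" feature can be illustrated with a short sketch. For a gap g, it counts every pair of residues separated by g intervening positions and normalizes the counts into a 400-dimensional vector. The exact definition used in the paper is not given in the abstract, so the normalization, the handling of non-standard residues, and the names `DIPEPTIDES` and `gapped_dipeptide_composition` are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order that defines the vector layout.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def gapped_dipeptide_composition(sequence, gap=0):
    """Return the frequency of each residue pair separated by `gap` positions."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    step = gap + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in DIPEPTIDES]
```

Concatenating the vectors for several gap values (for example `gap=0, 1, 2`) would give the kind of large candidate feature set from which a feature selection step then picks a subset.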

6.
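Protein-protein interaction sites play a very important role in cellular processes. Although high-throughput methods have been very successful in discovering protein-protein interaction sites, computational methods are still needed to help predict the interaction sites studied in experiments. This paper proposes two vector representations, based on residue sequence profiles, evolutionary rate, and hydrophobicity, for predicting interaction sites in protein heterocomplexes, and performs the prediction with a support vector machine. Of the two, the newly proposed vector representation achieves better prediction performance. The data set consists of 66 protein chains from heterocomplexes.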

7.
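The subcellular localization of a protein is closely related to its function. Correct localization of proteins within the cell is a prerequisite for the highly ordered operation of the cellular system. Studying the mechanisms and rules of protein localization and predicting protein subcellular localization are of great significance for understanding the properties and functions of proteins, understanding interactions between proteins, and exploring the laws and mysteries of life. Machine-learning-based prediction of protein subcellular localization is one of the active topics in bioinformatics. From three perspectives (data set construction, characterization of protein sequence features, and prediction algorithms), this paper summarizes and reviews the applications and achievements of machine learning methods in protein subcellular localization research over the past decade or so, analyzes the problems and challenges these methods face, and points out the main directions for research on protein subcellular localization.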

8.
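CHEN Songfeng, FAN Ming. 《计算机科学》 (Computer Science), 2010, 37(8): 236-239, 256
A new method, PCABoost, is proposed for building an ensemble classifier from Bayesian base classifiers. When creating training samples, the method randomly partitions the feature set into K subsets, applies PCA to obtain the principal components of each subset, forms a new feature space from them, and maps all of the training data into the new feature space to serve as a new training set. Different transformations generate different feature spaces and therefore several distinct training sets. On each new training set, AdaBoost is used to build a group of progressively boosted Bayesian classifiers (a classifier group), which yields several distinct classifier groups. Each classifier group produces a prediction by weighted voting among its members, and the group predictions are then combined by voting to produce the classification result of the ensemble, giving a two-layer ensemble classifier. Experiments were carried out on 30 data sets randomly selected from the UCI standard data sets. The results show that the algorithm not only significantly improves the classification performance of Bayesian classifiers but also achieves higher classification accuracy than ensemble methods such as Rotation Forest and AdaBoost on most of the data sets.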
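Two pieces of this pipeline are easy to sketch without any numerical library: the random partition of feature indices into K subsets, and the two-layer combination of predictions (weighted voting inside each classifier group, then plain voting across groups). The sketch below omits the PCA transformation and the AdaBoost training entirely. The function names and the tie-breaking behaviour (first label encountered wins) are assumptions, not details from the paper.

```python
import random
from collections import Counter, defaultdict

def random_feature_subsets(n_features, k, seed=0):
    """Randomly split feature indices 0..n_features-1 into k disjoint subsets."""
    indices = list(range(n_features))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight within one classifier group."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)

def two_level_vote(groups):
    """Combine groups of (predictions, weights) into one final label.

    Layer 1: weighted vote inside each group.
    Layer 2: unweighted majority vote over the group-level labels.
    """
    group_labels = [weighted_vote(preds, weights) for preds, weights in groups]
    return Counter(group_labels).most_common(1)[0][0]
```

In an AdaBoost-style group, the weights would typically be the per-classifier coefficients learned during boosting; here they are just numbers supplied by the caller.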

9.
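Proteins are the main functional molecules in the cell and the material basis of life, and they carry out their functions through interactions with one another. At a protein-protein interaction interface, only a small number of residues, known as "hot spots", contribute most of the binding free energy of the interaction. How to identify these hot spots is currently an active research problem in bioinformatics. In machine-learning-based hot spot identification, the choice of feature selection method has a very large impact on the performance of the identification model. This article gives a comprehensive analysis of the current state of research on feature selection methods for protein hot spot identification and points out remaining problems as well as ideas and directions for future improvement, laying a foundation for improving the accuracy of protein hot spot prediction.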

10.
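Predicting protein-protein interaction sites provides important clues for understanding protein function and for drug design. The various features of a protein provide a large amount of useful information for predicting interaction sites, in particular evolutionary information and the sequence and spatial neighborhoods of residues. Different protein features also contribute differently to interactions between proteins. By extracting protein sequence profiles, conservation, and residue entropy, this work proposes a feature fusion technique for studying protein-protein interaction sites. Three predictors were built with SVMs to validate the different features separately, and the experimental results demonstrate the effectiveness and correctness of the feature fusion approach.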

11.
In this paper, we describe a machine learning approach for sequence-based prediction of protein-protein interaction sites. A support vector machine (SVM) classifier was trained to predict whether or not a surface residue is an interface residue (i.e., is located in the protein-protein interaction surface), based on the identity of the target residue and its ten sequence neighbors. Separate classifiers were trained on proteins from two categories of complexes, antibody-antigen and protease-inhibitor. The effectiveness of each classifier was evaluated using leave-one-out (jack-knife) cross-validation. Interface and non-interface residues were classified with relatively high sensitivity (82.3% and 78.5%) and specificity (81.0% and 77.6%) for proteins in the antigen-antibody and protease-inhibitor complexes, respectively. The correlation between predicted and actual labels was 0.430 and 0.462, indicating that the method performs substantially better than chance (zero correlation). Combined with recently developed methods for identification of surface residues from sequence information, this offers a promising approach to predict residues involved in protein-protein interactions from sequence information alone.
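An input built from "the target residue and its ten sequence neighbors" can be sketched as a sliding window that is one-hot encoded position by position. The sketch assumes the ten neighbors are five on each side of the target (an 11-residue window) and that positions falling outside the sequence are encoded as all zeros. Neither detail is stated in the abstract, and `encode_window` is a hypothetical name.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def encode_window(sequence, center, flank=5):
    """One-hot encode the residue at `center` plus `flank` neighbors on each side.

    Returns a flat list of (2 * flank + 1) * 20 zeros and ones.
    Positions outside the sequence, or non-standard residues, stay all-zero.
    """
    vector = []
    for pos in range(center - flank, center + flank + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                one_hot[idx] = 1
        vector.extend(one_hot)
    return vector
```

Each surface residue would yield one such 220-dimensional vector, paired with a label saying whether it lies in the interface, and those pairs would form the training set for a classifier such as an SVM.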

12.
The three-dimensional structure of racE was modeled using several homologous small G proteins, and the best model obtained using the human rhoA as modeling template is reported. The three-dimensional fold of the racE model is remarkably similar to the cellular form of human ras p21 crystal structure. Its secondary structure consists of six alpha-helices, six beta-strands and three 3(10) helices. The model retains its secondary structure after a 300 K, 300 ps molecular dynamics (MD) simulation. Important domains of the protein include its effector loop (residues 34-46), the insertion domain (residues 121-136), and the polybasic motif (between 210 and 220) not modeled in the current structure. The effector loop is inherently flexible and the structure docked with GDP exhibits the effector loop moving significantly closer to the nucleotide binding pocket, forming a tighter complex with the bound GDP. The mobility of the effector loop is conferred by a single residue 'hinge' point at residue 34Asp, also allowing the Switch I region, immediately preceding the effector loop, to be equally mobile. In comparison, the Switch II region shows average mobility. The insertion domain is highly flexible, with the insertion taking the form of a helical domain, with several charged residues forming a complex charged interface over the entire insertion region. While the GDP moiety is loosely held in the active site, the metal cation is extensively co-ordinated. The critical residue 38Thr exhibits high mobility, and is seen interacting directly with the metal ion at a distance of 2.64 Å, and indirectly via an intervening water molecule. 64Gln, a key residue involved in GTP hydrolysis in ras, is seen facing the beta-phosphate group and the metal ion. Certain residues (i.e. 51Asn, 38Thr and 65Glu) exhibit unique characteristics and these residues, together with 158Val, may play important roles in the maintenance of the protein's integrity and function.
There is strong consensus of secondary structural elements between models generated using various templates, such as h-rac1, h-rhoA and h-cdc42 bound to RhoGDI, all sharing only 50-55% sequence identity with racE, which suggests that this model is in all probability an accurate prediction of the true tertiary structure of racE.

13.
In this paper, a machine learning approach known as the support vector machine (SVM) is employed to predict the distance between an antibody’s interface residue and the antigen in an antigen–antibody complex. The heavy chains, light chains and the corresponding antigens of 37 antibodies are extracted from the antibody–antigen complexes in the Protein Data Bank. According to different distance ranges, sequence patch sizes and antigen classes, a number of computational experiments are conducted to describe the distance between the antibody’s interface residue and the antigen using antibody sequence information. The high prediction accuracy of both self-consistent and cross-validation tests indicates that the sequential information discovered from antibody structure contributes much to predicting the distance between the antibody’s interface residue and the antigen. Furthermore, the antigen class is predicted by SVM from residue composition information belonging to different distance ranges, which shows some potential significance.

14.
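Research on protein secondary structure prediction methods   Cited: 2 (self-citations: 2, citations by others: 0)
To improve the accuracy of protein secondary structure prediction, a new network model and encoding method are proposed. First, the global search capability of gene expression programming (GEP) is used to evolve the structure and the connection weights of a neural network at the same time. Second, the encoding of the neural network's input layer is improved by adding the hydrophobic environment of each amino acid residue. Tests on 36 proteins from PDBSelect25, comprising 6,122 residues in total, show that the proposed network model and encoding method can effectively improve the accuracy of protein secondary structure prediction.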

15.
Porphyromonas gingivalis is a major periodontitis-causing pathogen. P. gingivalis secretes a cysteine protease termed RgpB, which is specific for Arg-Xaa bonds in substrates. Recently, a nanobody-based assay was used to demonstrate that RgpB could represent a novel diagnostic target, thereby simplifying P. gingivalis detection. The nanobody, VHH7, had a high binding affinity and was specific for RgpB when tested against the highly identical RgpA. In this study a homology model of VHH7 was built. The complementarity determining regions (CDR) comprising the paratope residues responsible for RgpB binding were identified and used as input to the docking. Furthermore, residues likely involved in the RgpB epitope were identified based upon RgpB:RgpA alignment and analysis of residue surface accessibility. CDR residues and putative RgpB epitope residues were used as input to an information-driven flexible docking approach using the HADDOCK server. Analysis of the VHH7:RgpB model demonstrated that the epitope was found in the immunoglobulin-like domain, and residue pairs located at the molecular paratope:epitope interface that are important for complex stability were identified. Collectively, the VHH7 homology model and VHH7:RgpB docking supply knowledge of the residues involved in the high affinity interaction. This information could prove valuable in the design of an antibody-drug conjugate for specific RgpB targeting.

16.
Intersurf: dynamic interface between proteins   Cited: 1 (self-citations: 0, citations by others: 1)
Protein docking is a fundamental biological process that links two proteins. This link is typically defined by an interaction between two large zones of the protein boundaries. Visualizing such an interface is useful to understand the process thanks to 3D protein structures, to estimate the quality of docking simulation results, and to classify interactions in order to predict docking affinity between classes of interacting zones. Since the interface may be defined by a surface that separates the two proteins, it is possible to create a map of interaction that allows comparisons to be performed in 2D. This paper presents a very fast algorithm that extracts an interface surface and creates a valid and low-distorted interaction map. Another benefit of our approach is that a pre-computed part of the algorithm enables the surface to be updated in real-time while residues are moved.

17.
Hfq is an abundant RNA-binding bacterial protein that was first identified in E. coli as a required host factor for phage Qβ RNA replication. The pleiotropic phenotype resulting from the deletion of Hfq indicates the importance of this protein. Two RNA-binding sites have been characterized: the proximal site, which binds sRNA and mRNA, and the distal site, which binds poly(A) tails. Previous studies mainly focused on the key residues in the proximal site of the protein. A recent mutation study in E. coli Hfq showed that a distal residue, Val43, is important for the protein function. Interestingly, when we analyzed the sequence and structure of Staphylococcus aureus Hfq using the CONSEQ server, the results indicated that more functional residues were located far from the nucleotide-binding portion (NBP). From the analysis, seven individual residues (Asp9, Leu12, Glu13, Lys16, Gln31, Gly34 and Asp40) were selected to investigate, using molecular dynamics simulations, the conformational changes in the Hfq–RNA complex caused by point mutations of those residues. Results showed a significant effect on Asn28, which is an already known, highly conserved, functionally important residue. Mutants D9A, E13A and K16A showed effects on base stacking along with an increase in RNA pore diameter, which is required for the threading of RNA through the pore for the post-translational modification. Further, the result of protein stability analysis by the CUPSAT server showed a destabilizing effect in most mutants. From this study we characterized a series of important residues located far from the NBP and provide some clues that those residues may affect sRNA binding in Hfq.
