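Co-training prediction method for protein secondary structure*
Cite this article: LIU Jun, XIONG Zhong-yang, WANG Yin-hui. Co-training prediction method for protein secondary structure*[J]. Application Research of Computers, 2011, 28(5): 1688-1691.
Authors: LIU Jun  XIONG Zhong-yang  WANG Yin-hui
Affiliations: 1. College of Computer Science, Chongqing University, Chongqing 400030; College of Science & Technology, Chongqing Radio & Television University, Chongqing 400052
2. College of Computer Science, Chongqing University, Chongqing 400030
Funding: China Postdoctoral Science Foundation (20070420711); Natural Science Foundation of the Chongqing Science and Technology Commission (2007BB2372)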
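Abstract: Machine learning methods for protein secondary structure prediction tend to ignore the hydrophobicity of amino acids and the long-range interactions between them, and their accuracy is limited. This paper examines the problem through comparative experiments. Each amino acid in a protein is replaced with its corresponding hydrophobic energy value, giving a sequence of hydrophobic energy values. The experimental results show that a BP network trained on long hydrophobic energy value sequences predicts the E structure well, since that structure is dominated by long-range interactions. Because the Profile encoding features and the hydrophobic energy value features are independent, redundant views, a Cotraining algorithm is proposed based on the idea of co-training. Its main steps are to train an SVM classifier in the Profile feature space and a BP neural network classifier in the hydrophobicity feature space, which then jointly predict the secondary structure of each amino acid.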
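The substitution step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the hydrophobic energy scale it uses, so the Kyte-Doolittle hydropathy index stands in for it, and the function names `hydrophobic_sequence` and `sliding_windows`, the window half-width, and the zero padding are all hypothetical choices rather than the paper's settings.

```python
# Kyte-Doolittle hydropathy index per residue.
# ASSUMPTION: a stand-in scale; the paper's own hydrophobic energy
# values are not given in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_sequence(protein):
    """Replace every residue letter with its hydrophobicity value."""
    return [KYTE_DOOLITTLE[residue] for residue in protein]


def sliding_windows(values, half_width, pad=0.0):
    """Build one fixed-length window per residue, centered on that residue.

    The sequence ends are padded with `pad` so that every residue gets a
    window of length 2 * half_width + 1. A larger half_width lets the
    classifier see residues that are farther apart in the sequence.
    """
    padded = [pad] * half_width + list(values) + [pad] * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(values))]


if __name__ == "__main__":
    values = hydrophobic_sequence("MKVLAAGIV")
    # Each window would be one input vector for the hydrophobicity-view classifier.
    for window in sliding_windows(values, half_width=2):
        print(window)
```

Each window would serve as one input vector for the classifier trained in the hydrophobicity view; the long windows the abstract refers to correspond to a larger `half_width`.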

Keywords: co-training    protein    secondary structure prediction    support vector machine    neural network
Received: 2010/10/29
Revised: 2011/4/25

Protein secondary structure co-training prediction method
LIU Jun, XIONG Zhong-yang, WANG Yin-hui. Protein secondary structure co-training prediction method[J]. Application Research of Computers, 2011, 28(5): 1688-1691.
Authors:LIU Jun  XIONG Zhong-yang  WANG Yin-hui
Affiliation: LIU Jun (1, 2), XIONG Zhong-yang (1), WANG Yin-hui (1) (1. College of Computer Science, Chongqing University, Chongqing 400030, China; 2. College of Science & Technology, Chongqing Radio & Television University, Chongqing 400052, China)
Abstract: Machine-learning-based methods for protein secondary structure prediction suffer from low accuracy because they ignore the hydrophobic properties of amino acids and the interactions between amino acids that lie far apart in the sequence. A sequence of hydrophobic values can be built by replacing each amino acid with its hydrophobic value. Experiments show that a BP neural network trained on long hydrophobic value sequences performs well in predicting the E structure, which is governed mainly by long-range interactions between amino acids. Because the Profile space and the hydrophobic energy value space are both sufficient and redundant views, this paper proposes a Co-training algorithm. The algorithm uses two classifiers: an SVM classifier trained in the Profile space and a BP neural network classifier trained in the hydrophobic value space. Each predicts the secondary structure of an amino acid independently. If the two classifiers disagree on an amino acid, an arbitration rule proposed in this paper, based on an active selection strategy, makes the final decision. Suspected samples and creditable samples are defined according to the characteristics of the classifiers and the spaces in order to arbitrate the disputed predictions. The experimental results show that the proposed algorithm achieves higher prediction accuracy than existing algorithms on both the E structure, which is governed mainly by long-range interactions, and the H structure, which is governed mainly by short-range interactions.
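The point at which two independently trained classifiers disagree and a rule has to choose between them can be sketched as below. The abstract does not spell out the paper's arbitration rule or its definitions of suspected and creditable samples, so this sketch substitutes a plain highest-confidence rule as a placeholder. The function names `arbitrate` and `combine`, the `(label, confidence)` input format, and the tie-break in favor of the SVM are assumptions made for illustration only.

```python
def arbitrate(svm_pred, bp_pred):
    """Pick a final label for one residue from two (label, confidence) pairs.

    ASSUMPTION: a plain "more confident classifier wins" rule, with ties
    going to the SVM. It is a placeholder, not the paper's arbitration rule.
    """
    svm_label, svm_conf = svm_pred
    bp_label, bp_conf = bp_pred
    if svm_label == bp_label:
        return svm_label  # both views agree, nothing to arbitrate
    return svm_label if svm_conf >= bp_conf else bp_label


def combine(svm_preds, bp_preds):
    """Merge per-residue predictions from both views into one H/E/C string."""
    if len(svm_preds) != len(bp_preds):
        raise ValueError("both classifiers must predict every residue")
    return "".join(arbitrate(s, b) for s, b in zip(svm_preds, bp_preds))


if __name__ == "__main__":
    # Hypothetical outputs for three residues: (predicted class, confidence).
    svm_preds = [("H", 0.9), ("E", 0.4), ("C", 0.6)]  # Profile view
    bp_preds = [("H", 0.7), ("C", 0.8), ("E", 0.6)]   # hydrophobic view
    print(combine(svm_preds, bp_preds))
```

In this placeholder the second residue follows the BP network because it is more confident, and the third falls back to the SVM on a tie. A rule built on suspected and creditable samples would replace the body of `arbitrate`.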
Keywords: co-training  protein  secondary structure prediction  SVM  neural network
This article is indexed in CNKI, Wanfang Data, and other databases.