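A protein complex recognition algorithm based on graph embedding and topological structure information
Cite this article:XU Zhou-bo, LI Ping, LIU Hua-dong, LI Zhen. A protein complex recognition algorithm based on graph embedding and topological structure information[J]. Computer Engineering & Science, 2021, 43(6): 1052-1059.
Authors:XU Zhou-bo  LI Ping  LIU Hua-dong  LI Zhen
Affiliation:(Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China)
Funding:National Natural Science Foundation of China (61762027, U1501252); Guangxi Natural Science Foundation (2017GXNSFAA198172)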
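Abstract:Protein complexes are the basis for studying cell structure and biochemical mechanisms, and how to identify them accurately has become a research hotspot in recent years. Traditional algorithms, which search for protein complexes using structural information, suffer from low sensitivity and F-measure, while existing supervised learning algorithms rely on hand-crafted features that do not reflect the true information of the graph well. To address these shortcomings, the graph2vec-SVM recognition algorithm is proposed. It treats protein complexes as dense subgraphs and takes subgraph modularity into account, uses graph2vec to convert graph information into vectors, and then applies an SVM classifier to identify protein complexes, which improves the sensitivity and F-measure of protein complex identification. The algorithm is compared with four currently popular unsupervised learning algorithms (ClusterOne, CMC, HC-PIN, and COACH) and three supervised learning algorithms (SCI-BN, SCI-SVM, and RM), and performs well on all three metrics: precision, sensitivity, and F-measure.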

Keywords:protein complex  graph2vec  SVM  protein-protein interaction network
Received:2020-02-28
Revised:2020-06-21
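
The abstract treats a protein complex as a dense subgraph and also considers its modularity, but it does not give the formulas used. The sketch below is only an illustration under stated assumptions: `density` is the standard edge density of an induced subgraph, and `in_out_ratio` (a hypothetical name) is one simple modularity-style proxy, the ratio of internal edges to boundary edges, which may differ from the definition used in the paper. The edge list is assumed to contain each undirected interaction once, with no self-loops.

```python
def density(nodes, edges):
    """Edge density of the subgraph induced by `nodes`: 2m / (n * (n - 1))."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Count edges whose two endpoints both lie inside the candidate subgraph.
    m = sum(1 for u, v in edges if u in nodes and v in nodes)
    return 2.0 * m / (n * (n - 1))


def in_out_ratio(nodes, edges):
    """Internal edges divided by boundary edges (assumed modularity proxy)."""
    nodes = set(nodes)
    internal = sum(1 for u, v in edges if u in nodes and v in nodes)
    # A boundary edge has exactly one endpoint inside the subgraph.
    boundary = sum(1 for u, v in edges if (u in nodes) != (v in nodes))
    return float("inf") if boundary == 0 else internal / boundary


# Toy interaction network: a 4-clique {A, B, C, D} with a short tail attached.
edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]
candidate = {"A", "B", "C", "D"}

print(density(candidate, edges))       # fully connected candidate
print(in_out_ratio(candidate, edges))  # six internal edges, one boundary edge
```

A candidate that scores high on both measures is tightly connected inside and weakly connected to the rest of the network, which is the intuition behind treating complexes as dense, modular subgraphs.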

A protein complex recognition algorithm based on graph embedding and topological structure information
XU Zhou-bo, LI Ping, LIU Hua-dong, LI Zhen. A protein complex recognition algorithm based on graph embedding and topological structure information[J]. Computer Engineering & Science, 2021, 43(6): 1052-1059.
Authors:XU Zhou-bo  LI Ping  LIU Hua-dong  LI Zhen
Affiliation:(Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China)
Abstract:Protein complexes are the basis of cell structure and biochemical mechanisms, and recognizing them accurately has become a popular research direction in recent years. Traditional algorithms that search for protein complexes based on structural information have low sensitivity and F-measure, and the hand-crafted features that existing supervised learning algorithms use to identify protein complexes do not adequately reflect the real information of the graph. To address these problems, a graph2vec-SVM recognition algorithm is proposed. The algorithm regards a protein complex as a dense subgraph and takes the modularity of the subgraph into account. graph2vec is used to transform the graph information into vectors, and an SVM classifier is then used to recognize protein complexes, which improves the sensitivity and F-measure of protein complex recognition. Compared with four popular unsupervised learning algorithms (ClusterONE, CMC, HC-PIN, and COACH) and three supervised learning algorithms (SCI-BN, SCI-SVM, and RM), the algorithm shows good performance in terms of precision, sensitivity, and F-measure.
Keywords:protein complex  graph2vec  support vector machine  protein-protein interaction network
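
The three evaluation metrics named in the abstract can be illustrated with a small sketch. The abstract does not say how a predicted complex is matched to a known one, so the matching rule here is an assumption: a predicted and a reference complex count as matched when their overlap score |A∩B|²/(|A|·|B|) reaches a threshold (0.2 is used as an illustrative default, not a value from the abstract). The function names `overlap_score` and `evaluate` are hypothetical. The F-measure is the usual harmonic mean of the other two metrics.

```python
def overlap_score(a, b):
    """Overlap between two protein sets: |A & B|^2 / (|A| * |B|)."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    common = len(a & b)
    return common * common / (len(a) * len(b))


def evaluate(predicted, reference, threshold=0.2):
    """Return (precision, sensitivity, f_measure) under the assumed matching rule."""
    # Predicted complexes that match at least one reference complex.
    matched_pred = sum(
        1 for p in predicted
        if any(overlap_score(p, r) >= threshold for r in reference)
    )
    # Reference complexes recovered by at least one predicted complex.
    matched_ref = sum(
        1 for r in reference
        if any(overlap_score(r, p) >= threshold for p in predicted)
    )
    precision = matched_pred / len(predicted) if predicted else 0.0
    sensitivity = matched_ref / len(reference) if reference else 0.0
    total = precision + sensitivity
    f_measure = 2 * precision * sensitivity / total if total else 0.0
    return precision, sensitivity, f_measure


# Toy data: one of two predictions matches one of three known complexes.
predicted = [{"A", "B", "C"}, {"X", "Y"}]
reference = [{"A", "B", "C", "D"}, {"M", "N"}, {"P", "Q"}]

precision, sensitivity, f_measure = evaluate(predicted, reference)
print(precision, sensitivity, f_measure)
```

Because the F-measure is a harmonic mean, it stays low whenever either precision or sensitivity is low, which is why the abstract reports all three metrics together.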
This article is indexed in Wanfang Data and other databases.