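Algorithm for identifying weighted protein complexes based on fuzzy ant colony clustering
Cite this article: Mao Yimin, Liu Yinping, Hu Jian. Algorithm for identifying weighted protein complexes based on fuzzy ant colony clustering [J]. Application Research of Computers, 2020, 37(5): 1340-1348.
Authors: Mao Yimin  Liu Yinping  Hu Jian
Affiliation: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000; Department of Information Engineering, College of Applied Science, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000
Funding: National Natural Science Foundation of China; Natural Science Foundation of Jiangxi Province; Science and Technology Project of the Jiangxi Provincial Department of Education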
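Abstract: Ant colony clustering combined with fuzzy C-means (FCM) shows low accuracy, low recall, and poor running time when identifying complexes in protein-protein interaction networks. To address these problems, this paper proposes FAC-PC (algorithm for identifying weighted protein complexes based on fuzzy ant colony clustering). First, the edge clustering coefficient is combined with the Pearson correlation coefficient of gene co-expression to build a weighted network. Second, an EPS (essential protein selection) metric is proposed to select essential proteins; the neighbors of each essential protein are traversed, and a protein fitness measure, PFC (protein fitness calculation), is designed to obtain essential group proteins. These essential group proteins replace the seed nodes in ant colony clustering, which overcomes the loss of accuracy and the slow convergence caused by the many pick-up and put-down operations and the repeated merge-and-filter operations of the ant colony algorithm. Next, an SI (similarity improvement) metric is designed to optimize the pick-up and put-down probabilities used to cluster the nodes, which yields the number of clusters. Finally, the essential proteins and the number of clusters obtained from ant colony clustering are used to initialize FCM; a membership update strategy is designed to optimize the membership updates, an FCM iteration objective function that accounts for both intra-cluster and inter-cluster distance is proposed, and the improved FCM completes the identification of complexes. FAC-PC is applied to DIP data, and the experimental results show that it achieves relatively high accuracy and recall and identifies protein complexes fairly accurately.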
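The network-weighting step described first in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions: the abstract does not give the formula that fuses the two measures, so the equal-weight average below (with the Pearson coefficient rescaled to [0, 1]) is an assumption, as are the function names `pearson`, `edge_clustering_coefficient`, and `build_weighted_network`. The edge clustering coefficient uses one common definition, the number of shared neighbors divided by min(deg(u) - 1, deg(v) - 1), which may differ from the variant used in the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, treated as 0
    return cov / (sx * sy)


def edge_clustering_coefficient(adj, u, v):
    """Shared neighbors of u and v over min(deg(u) - 1, deg(v) - 1)."""
    common = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return common / denom if denom > 0 else 0.0


def build_weighted_network(edges, expression):
    """Return {(u, v): weight} for every interaction in `edges`.

    ASSUMPTION: the weight is the plain average of the edge clustering
    coefficient and the Pearson coefficient rescaled from [-1, 1] to [0, 1].
    The paper's actual fusion formula is not stated in the abstract.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        ecc = edge_clustering_coefficient(adj, u, v)
        pcc = pearson(expression[u], expression[v])
        weights[(u, v)] = 0.5 * ecc + 0.5 * (pcc + 1.0) / 2.0
    return weights


# Toy data (hypothetical): a triangle A-B-C with a pendant node D.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
expression = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [3, 2, 1], "D": [1, 1, 2]}
weights = build_weighted_network(edges, expression)
```

In this toy network the edge A-B sits inside a triangle and its endpoints are perfectly co-expressed, so it receives a weight close to 1, while the pendant edge C-D receives a weight close to 0.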

Keywords: protein-protein interaction network  ant colony clustering algorithm  fuzzy C-means  fitness  protein complex
Received: 2018/10/13
Revised: 2020/3/11

Algorithm for identifying weighted protein complexes based on fuzzy ant colony clustering
Yimin Mao, Yinping Liu and Jian Hu. Algorithm for identifying weighted protein complexes based on fuzzy ant colony clustering [J]. Application Research of Computers, 2020, 37(5): 1340-1348.
Authors: Yimin Mao, Yinping Liu and Jian Hu
Affiliation: School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi
Abstract: The protein complex identification algorithm based on ant colony and fuzzy C-means (FCM) clustering suffers from low accuracy, low recall, and poor running efficiency. To address these problems, this paper proposes a novel protein complex identification algorithm named FAC-PC. First, the algorithm combines the Pearson correlation coefficient with the edge clustering coefficient to construct a weighted protein network. Second, to overcome the heavy merge-and-filter work and the repeated pick-up and put-down operations of the ant colony clustering algorithm, it designs the EPS metric to select essential proteins and the PFC metric, evaluated over the neighbors of the essential proteins, to obtain essential group proteins. It then uses the essential group proteins in place of the seed nodes during ant colony clustering, which improves both accuracy and time performance. Furthermore, it proposes the SI metric to optimize the pick-up and put-down probabilities of the ants and thereby obtain the number of clusters. Finally, it uses the essential proteins and the number of clusters obtained from the improved ant colony algorithm to initialize FCM, designs a membership update strategy to optimize the membership updates, and proposes a new FCM objective function that balances intra-cluster and inter-cluster variation; the improved FCM then identifies the protein complexes. FAC-PC is used to identify protein complexes on DIP data. The experimental results show that FAC-PC performs better on accuracy and recall and identifies protein complexes more reasonably.
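For orientation, the final stage builds on fuzzy C-means. The sketch below shows one iteration of the textbook FCM algorithm only; it is not the paper's improved membership update or its objective function that balances intra-cluster and inter-cluster variation, neither of which is specified in the abstract. The names `fcm_step` and `euclidean`, the fuzzifier `m = 2`, and the toy points are illustrative assumptions. In FAC-PC the initial centers and the number of clusters would come from the ant colony stage rather than being hand-picked as they are here.

```python
from math import sqrt


def euclidean(p, q):
    """Euclidean distance between two points given as tuples."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fcm_step(points, centers, m=2.0, eps=1e-12):
    """One textbook FCM iteration: update memberships, then centers.

    Membership: u[j][i] = 1 / sum_k (d_ji / d_jk) ** (2 / (m - 1))
    Center:     c_i = sum_j u[j][i]**m * x_j / sum_j u[j][i]**m
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        # Clamp distances so a point lying on a center does not divide by zero.
        dists = [max(euclidean(p, c), eps) for c in centers]
        row = [1.0 / sum((di / dk) ** exponent for dk in dists) for di in dists]
        memberships.append(row)
    dim = len(points[0])
    new_centers = []
    for i in range(len(centers)):
        w = [row[i] ** m for row in memberships]
        total = sum(w)
        new_centers.append(
            tuple(sum(wj * p[d] for wj, p in zip(w, points)) / total
                  for d in range(dim))
        )
    return memberships, new_centers


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers = [(0.0, 0.0), (10.0, 10.0)]  # stand-in for the ant colony output
for _ in range(10):
    memberships, centers = fcm_step(points, centers)
```

Each row of `memberships` sums to 1, so a protein can belong to several clusters with different degrees, which is what makes FCM a natural fit for overlapping complexes.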
Keywords:protein-protein interaction network  ant colony clustering algorithm  fuzzy C-means  fitness  protein complex
This article is indexed in Wanfang Data and other databases.