Research on HP Model Optimization Method Based on Reinforcement Learning
Citation: WU Hongjie, YANG Ru, FU Qiming, CHEN Jianping, LU Weizhong. Research on HP Model Optimization Method Based on Reinforcement Learning[J]. Computer Engineering and Applications, 2019, 55(12): 132-139.
Authors: WU Hongjie, YANG Ru, FU Qiming, CHEN Jianping, LU Weizhong
Affiliation: 1. School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China; 2. Jiangsu Provincial Key Lab for Information Processing Technologies, Soochow University, Suzhou, Jiangsu 215006, China
Funding: National Natural Science Foundation of China (No. 61772357, No. 61672371, No. 61502329, No. 61876217); Jiangsu Province "333 Project" research program; Six Talent Peaks Project of Jiangsu Province (No. DZXX-010); Suzhou science and technology projects (No. SNG201610, No. SYG201704); Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology
Abstract: Protein structure prediction has long been an important problem in bioinformatics. Predicting the two-dimensional structure of a protein under the Hydrophobic Polarity (HP) model is a typical NP-hard problem. Existing optimization methods for the HP model include the greedy algorithm, particle swarm optimization, the genetic algorithm, the ant colony algorithm, and Monte Carlo simulation, but these methods do not converge robustly and easily fall into local optima. This paper therefore proposes an HP model optimization method based on reinforcement learning. Exploiting the sequential Markov optimal decision process and the maximization of the global cumulative return, the method constructs a reward function based on the energy function over the full state space and introduces a rigid overlap detection rule, fully exploiting the global evolutionary relationships in biological sequences to produce effective and stable predictions. Three classical sequences from the literature and five Uniref50 sequences are taken as experimental subjects, and robustness, convergence, and running time are compared against the greedy algorithm and particle swarm optimization. The greedy algorithm converges on only 62.5% of the sequences, whereas the proposed method converges stably on all of them after 50,000 training iterations. Both the proposed method and particle swarm optimization find the lowest-energy structures, but the proposed method's running time is 63.9% lower than that of particle swarm optimization.

Keywords: reinforcement learning; Hydrophobic Polarity (HP) model; structure prediction

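The abstract describes conformations on a two-dimensional lattice, an energy function, and a rigid overlap detection rule. As a rough illustration of the standard 2D HP model setting (not the paper's implementation; all names below are assumptions), the sketch encodes a conformation as a self-avoiding walk on a square lattice and scores it with the usual H-H contact energy:

```python
# Minimal sketch of the 2D square-lattice HP model the abstract refers to.
# A sequence is a string over {'H', 'P'}; a conformation is a string of
# moves. Names are illustrative assumptions, not identifiers from the paper.

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold(moves):
    """Map a move string to lattice coordinates; return None if the walk
    revisits a site (the analogue of a rigid overlap detection rule)."""
    x, y = 0, 0
    coords = [(0, 0)]
    occupied = {(0, 0)}
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        if (x, y) in occupied:        # two residues on one lattice site
            return None
        occupied.add((x, y))
        coords.append((x, y))
    return coords

def energy(sequence, coords):
    """Standard HP energy: -1 for every pair of H residues that are
    lattice neighbours but not adjacent along the chain."""
    e = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):          # skip chain bonds
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = coords[i], coords[j]
                if abs(xi - xj) + abs(yi - yj) == 1:   # lattice contact
                    e -= 1
    return e

seq = "HPHPPHHPHPPHPHHPPHPH"          # a widely used 20-residue benchmark
conf = fold("R" * (len(seq) - 1))     # fully extended chain, never overlaps
print(energy(seq, conf))              # -> 0; compact folds score lower
```

Lower (more negative) energies correspond to more compact hydrophobic cores, which is what the optimization methods surveyed in the abstract all try to find.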
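The abstract pairs this energy function with reinforcement learning: a Markov decision process whose cumulative return is maximized, with a reward built from the energy and overlaps rejected outright. A hedged sketch of how that could look as plain tabular Q-learning follows; it reuses MOVES/fold/energy/seq from the previous sketch, and the state encoding, overlap penalty, and hyperparameters are assumptions, not the paper's.

```python
# Tabular Q-learning over partial folds: a state is the move prefix, an
# action is one of the four lattice directions, and the terminal reward
# is the negated HP energy, so maximizing the return minimizes energy.

import random
from collections import defaultdict

ACTIONS = list(MOVES)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1      # assumed hyperparameters

Q = defaultdict(float)                  # Q[(state, action)] -> value

def run_episode(sequence):
    prefix = ""                         # moves placed so far
    for _ in range(len(sequence) - 1):
        state = prefix
        if random.random() < EPS:       # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(state, act)])
        coords = fold(prefix + a)
        if coords is None:              # overlap: penalize and abort
            Q[(state, a)] += ALPHA * (-1.0 - Q[(state, a)])
            return None
        prefix += a
        done = len(prefix) == len(sequence) - 1
        reward = -energy(sequence, coords) if done else 0.0
        nxt = 0.0 if done else max(Q[(prefix, act)] for act in ACTIONS)
        Q[(state, a)] += ALPHA * (reward + GAMMA * nxt - Q[(state, a)])
    return prefix

for _ in range(50_000):                 # the abstract reports stable
    run_episode(seq)                    # convergence by 50,000 iterations
```

After training, following the greedy action in each state reads off a low-energy self-avoiding conformation; the paper's actual state representation and reward shaping may differ from this sketch.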