首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 296 毫秒
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
Motivation: Bringing a new drug to the market is expensive and time-consuming. To cut the costs and time, computer-aided drug design (CADD) approaches have been increasingly included in the drug discovery pipeline. However, despite traditional docking tools show a good conformational space sampling ability, they are still unable to produce accurate binding affinity predictions. This work presents a novel scoring function for molecular docking seamlessly integrated into DockingApp, a user-friendly graphical interface for AutoDock Vina. The proposed function is based on a random forest model and a selection of specific features to overcome the existing limits of Vina’s original scoring mechanism. A novel version of DockingApp, named DockingApp RF, has been developed to host the proposed scoring function and to automatize the rescoring procedure of the output of AutoDock Vina, even to nonexpert users. Results: By coupling intermolecular interaction, solvent accessible surface area features and Vina’s energy terms, DockingApp RF’s new scoring function is able to improve the binding affinity prediction of AutoDock Vina. Furthermore, comparison tests carried out on the CASF-2013 and CASF-2016 datasets demonstrate that DockingApp RF’s performance is comparable to other state-of-the-art machine-learning- and deep-learning-based scoring functions. The new scoring function thus represents a significant advancement in terms of the reliability and effectiveness of docking compared to AutoDock Vina’s scoring function. At the same time, the characteristics that made DockingApp appealing to a wide range of users are retained in this new version and have been complemented with additional features.  相似文献   

14.
The Golgi Apparatus (GA) is a major collection and dispatch station for numerous proteins destined for secretion, plasma membranes and lysosomes. The dysfunction of GA proteins can result in neurodegenerative diseases. Therefore, accurate identification of protein subGolgi localizations may assist in drug development and understanding the mechanisms of the GA involved in various cellular processes. In this paper, a new computational method is proposed for identifying cis-Golgi proteins from trans-Golgi proteins. Based on the concept of Common Spatial Patterns (CSP), a novel feature extraction technique is developed to extract evolutionary information from protein sequences. To deal with the imbalanced benchmark dataset, the Synthetic Minority Over-sampling Technique (SMOTE) is adopted. A feature selection method called Random Forest-Recursive Feature Elimination (RF-RFE) is employed to search the optimal features from the CSP based features and g-gap dipeptide composition. Based on the optimal features, a Random Forest (RF) module is used to distinguish cis-Golgi proteins from trans-Golgi proteins. Through the jackknife cross-validation, the proposed method achieves a promising performance with a sensitivity of 0.889, a specificity of 0.880, an accuracy of 0.885, and a Matthew’s Correlation Coefficient (MCC) of 0.765, which remarkably outperforms previous methods. Moreover, when tested on a common independent dataset, our method also achieves a significantly improved performance. These results highlight the promising performance of the proposed method to identify Golgi-resident protein types. Furthermore, the CSP based feature extraction method may provide guidelines for protein function predictions.  相似文献   

15.
16.
17.
采用分子模拟和多元线性回归方法,研究了有机物临界体积的定量结构-性质关系(QSPR).基于121个临界体积数据中的108个样本点,得到一个临界体积模型(残余标准差11.80 cm3/mol,拟合度0.994 2).该模型含4个分子描述码,对训练组和测试组的平均估算误差分别为8.36cm3/mol(相对误差2.52%)和9.09 cm3/mol(相对误差3.05%).研究表明,分子体积、分子支链化程度以及分子表面的静电分布等定量结构参数可以有效地估算有机物的临界体积.  相似文献   

18.
S-palmitoylation is a reversible covalent post-translational modification of cysteine thiol side chain by palmitic acid. S-palmitoylation plays a critical role in a variety of biological processes and is engaged in several human diseases. Therefore, identifying specific sites of this modification is crucial for understanding their functional consequences in physiology and pathology. We present a random forest (RF) classifier-based consensus strategy (RFCM-PALM) for predicting the palmitoylated cysteine sites on synaptic proteins from male/female mouse data. To design the prediction model, we have introduced a heuristic strategy for selection of the optimum set of physicochemical features from the AAIndex dataset using (a) K-Best (KB) features, (b) genetic algorithm (GA), and (c) a union (UN) of KB and GA based features. Furthermore, decisions from best-trained models of the KB, GA, and UN-based classifiers are combined by designing a three-star quality consensus strategy to further refine and enhance the scores of the individual models. The experiment is carried out on three categorized synaptic protein datasets of a male mouse, female mouse, and combined (male + female), whereas in each group, weighted data is used as training, and knock-out is used as the hold-out set for performance evaluation and comparison. RFCM-PALM shows ~80% area under curve (AUC) score in all three categories of datasets and achieve 10% average accuracy (male—15%, female—15%, and combined—7%) improvements on the hold-out set compared to the state-of-the-art approaches. To summarize, our method with efficient feature selection and novel consensus strategy shows significant performance gains in the prediction of S-palmitoylation sites in mouse datasets.  相似文献   

19.
AKT, is a serine/threonine protein kinase comprising three isoforms—namely: AKT1, AKT2 and AKT3, whose inhibitors have been recognized as promising therapeutic targets for various human disorders, especially cancer. In this work, we report a systematic evaluation of multi-target Quantitative Structure-Activity Relationship (mt-QSAR) models to probe AKT’ inhibitory activity, based on different feature selection algorithms and machine learning tools. The best predictive linear and non-linear mt-QSAR models were found by the genetic algorithm-based linear discriminant analysis (GA-LDA) and gradient boosting (Xgboost) techniques, respectively, using a dataset containing 5523 inhibitors of the AKT isoforms assayed under various experimental conditions. The linear model highlighted the key structural attributes responsible for higher inhibitory activity whereas the non-linear model displayed an overall accuracy higher than 90%. Both these predictive models, generated through internal and external validation methods, were then used for screening the Asinex kinase inhibitor library to identify the most potential virtual hits as pan-AKT inhibitors. The virtual hits identified were then filtered by stepwise analyses based on reverse pharmacophore-mapping based prediction. Finally, results of molecular dynamics simulations were used to estimate the theoretical binding affinity of the selected virtual hits towards the three isoforms of enzyme AKT. Our computational findings thus provide important guidelines to facilitate the discovery of novel AKT inhibitors.  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号