Similar literature
 20 similar documents found; search took 31 ms
1.
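To build a computational model for predicting compound degradation, parameters that clearly differ between degradable and non-degradable compounds were identified. A data set of 389 organic molecules was selected, of which 312 formed the training set and the remaining 77 the validation set; 195 molecular parameters were calculated for each molecule. Models were built by stepwise discriminant analysis and by principal component analysis, and their predictive ability was checked on the external validation set. Results: with stepwise discriminant analysis, the correct classification rates for degradable and non-degradable compounds were 90.6% and 69.5% on the training set and 83.9% and 63.6% on the validation set. With principal component analysis, the rates for degradable and non-degradable compounds were 80.4% and 31.8% on the test set and 67.9% and 50.0% on the validation set. The mathematical model built by stepwise discriminant analysis can therefore serve as a model for predicting compound degradation. The above study can provide a reference for predicting the degradation of organic compounds.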
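The per-class correct rates quoted in this abstract can be computed as in the minimal sketch below. The function name and the labels `"deg"` and `"non"` are illustrative assumptions; the abstract does not describe its own tooling.

```python
def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of that label's samples that were predicted correctly}."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals, correct = {}, {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (truth == pred)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical labels: "deg" = degradable, "non" = non-degradable.
y_true = ["deg", "deg", "deg", "deg", "non", "non"]
y_pred = ["deg", "deg", "deg", "non", "non", "deg"]
rates = per_class_accuracy(y_true, y_pred)
```

Reporting the two classes separately, as the abstract does, exposes a model that does well on one class only, which a single overall accuracy figure would hide.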

2.
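Yang Huahui, Meng Chen, Wang Cheng, Yao Yunzhi. Control and Decision (《控制与决策》), 2019, 34(6): 1219-1226
To address the inability of the K-means algorithm to suppress noisy features and to find irregularly shaped clusters when clustering high-dimensional data, an improved K-means clustering algorithm based on feature selection and removal at target points is proposed. The algorithm uses the Minkowski metric as the distance for assigning target points, and introduces a weight adjustment parameter a and resets the weight coefficient α to select and remove features, which effectively reduces the noise contributed by non-clustering features. Validation experiments cluster real UCI data sets and synthetic data sets to verify that the improved algorithm suppresses noisy features. It is compared experimentally with the WK-means and iMWK-means algorithms to analyze the suitability of feature selection during clustering and to find the optimal distance coefficient beta and weight coefficient α.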
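A feature-weighted Minkowski distance of the general kind this abstract refers to might look like the sketch below. The abstract does not give the paper's formula, so the form used here (each feature difference scaled by its weight, then raised to the exponent `beta`) and the function name are assumptions for illustration only.

```python
def weighted_minkowski(x, center, weights, beta):
    """Minkowski distance of order beta with one weight per feature (assumed form)."""
    if not (len(x) == len(center) == len(weights)):
        raise ValueError("x, center and weights must have the same length")
    if beta <= 0:
        raise ValueError("beta must be positive")
    total = sum((w * abs(a - b)) ** beta for a, b, w in zip(x, center, weights))
    return total ** (1.0 / beta)


# With equal weights and beta = 2 this is the ordinary Euclidean distance.
d_all = weighted_minkowski((0.0, 0.0), (3.0, 4.0), (1.0, 1.0), 2)
# A zero weight removes the second feature from the distance entirely.
d_first_only = weighted_minkowski((0.0, 0.0), (3.0, 4.0), (1.0, 0.0), 2)
```

Setting a feature's weight to zero is one simple way to picture "feature removal": that feature no longer influences which center a point is assigned to.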

3.
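Using comparative molecular field analysis (CoMFA) with 5,6-dihydro-(9H)-pyrazolo[3,4-c]-1,2,4-triazolo[4,3-a]pyridine inhibitors as the study objects, a three-dimensional quantitative structure-activity relationship (3D-QSAR) model was built for a set of compounds that inhibit eosinophil phosphodiesterase, to explore the relationship between activity data and three-dimensional structural parameters. The model has a cross-validated correlation coefficient q² = 0.565, a non-cross-validated correlation coefficient r² = 0.867, a standard error SE = 0.362 and F = 49.782; the steric and electrostatic fields contribute 72.7% and 27.3%, respectively. The model predicts well and indicates that increasing substituent bulk and lowering substituent electronegativity can raise the activity of this class of compounds.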
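The two statistics reported in this abstract are conventionally defined as sketched below. The abstract does not state its formulas, so the leave-one-out convention shown here is an assumption: y_i is the observed activity of compound i, ŷ_i the fitted value, ŷ_(i) the value predicted with compound i left out of the fit, and ȳ the mean observed activity.

```latex
q^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_{(i)}\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2},
\qquad
r^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}
```

Under these definitions q² is usually lower than r², as it is here (0.565 versus 0.867), because each prediction in q² comes from a model that never saw the compound being predicted.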

4.
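Initial-center-optimized K-means algorithm based on the spatial distribution density of samples   Total citations: 2, self-citations: 1, citations by others: 1
To address the sensitivity of the traditional K-means clustering algorithm to its initial cluster centers and the lack of objectivity in existing initial-center optimization algorithms, an initial-center-optimized K-means algorithm based on the spatial distribution density of samples is proposed. The algorithm uses the spatial distribution of the data set's samples to define the density of a data object, and defines a data object's neighborhood from the spatial information of the whole data set. On this basis it selects data objects that lie in dense regions of the data set and are far apart from each other as initial cluster centers, and then runs K-means. Experiments on UCI machine learning repository data sets and on randomly generated synthetic data sets with noise points show that the algorithm clusters well, runs quickly, and is highly robust to noisy data. It outperforms the traditional K-means algorithm and existing K-means initial-center optimization algorithms.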
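The idea of choosing initial centers that are both in dense regions and far apart can be sketched as follows. This is not the paper's method: the density definition (neighbor count within a caller-supplied `radius`), the "at least mean density" cutoff and the greedy farthest-point rule are all assumptions made for illustration.

```python
import math


def pick_initial_centers(points, k, radius):
    """Pick k initial centers from dense, mutually distant points (illustrative only)."""
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Assumed density: number of other points within `radius` of each point.
    density = [sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
               for i in range(n)]
    mean_density = sum(density) / n
    # Only points at least as dense as average are candidates, which screens out noise.
    dense = [i for i in range(n) if density[i] >= mean_density]
    chosen = [max(dense, key=lambda i: density[i])]
    while len(chosen) < k:
        pool = [i for i in dense if i not in chosen]
        if not pool:  # fall back to all remaining points if dense ones run out
            pool = [i for i in range(n) if i not in chosen]
        # Greedy rule: take the candidate farthest from every center chosen so far.
        chosen.append(max(pool, key=lambda i: min(dist[i][c] for c in chosen)))
    return [points[i] for i in chosen]


cluster_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
cluster_b = [(100, 100), (100, 101), (101, 100), (101, 101)]
noise = [(50, -50)]
centers = pick_initial_centers(cluster_a + cluster_b + noise, k=2, radius=2)
```

In this toy data the isolated noise point has no neighbors, so it is never a candidate, and the two centers land in the two dense groups.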

5.
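To improve the stability and reliability of traditional K-means clustering, an adaptive K-means clustering algorithm is proposed. Its basic idea is to analyze the minimum spanning tree of the sample set and cut every long edge that exceeds a given threshold, so that a reasonable number of clusters and reasonable initial cluster centers are computed automatically in advance from the structure of the sample set. Theoretical analysis and computational experiments show that the algorithm not only guarantees a unique clustering result but also, when the clusters of the sample set are roughly convex and the between-cluster distances are clearly larger than the within-cluster distances, obtains good clustering results directly without manual parameter selection. On the same data set, traditional K-means may give unreasonable clustering results even when the correct number of clusters is chosen, so the adaptive K-means algorithm performs better.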
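The step of cutting long minimum-spanning-tree edges to obtain the cluster count and initial centers can be sketched as below. How the paper chooses its threshold is not stated in the abstract, so `threshold` is simply a caller-supplied assumption here, and the function names are hypothetical.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (length, i, j) edges."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection of each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def initial_centers_from_mst(points, threshold):
    """Drop MST edges longer than threshold; return the mean of each remaining component."""
    root = list(range(len(points)))

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]  # path halving
            i = root[i]
        return i

    for length, i, j in mst_edges(points):
        if length <= threshold:      # keep short edges, cut long ones
            root[find(i)] = find(j)
    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups.values()]


pts = [(0, 0), (0, 2), (2, 0), (2, 2), (10, 10), (10, 12), (12, 10), (12, 12)]
centers = initial_centers_from_mst(pts, threshold=3)
```

The number of centers returned is the number of clusters, so both quantities come from the data rather than from a random draw, which is what makes the subsequent K-means run repeatable.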

6.
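To address the instability of clustering results caused by random selection of initial centers and manual specification of the number of clusters in the bisecting K-means algorithm, a Canopy bisecting K-means algorithm based on density and a center index, SDC_Bisecting K-Means, is proposed. First, the density of the data in the sample and its neighborhood radius are computed. Then the data point with the lowest density is selected and clustered following the idea of the Canopy algorithm, and the resulting number of clusters and their centers are used as input parameters for bisecting K-means. Finally, an exponential function and a center index are introduced on top of bisecting K-means to cluster the original sample. Comparative simulation experiments on UCI data sets and self-built data sets show that SDC_Bisecting K-Means not only gives more accurate clustering results but also runs faster and is more stable.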

7.
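A new supervised binary-splitting hierarchical clustering method based on PCA and K-means clustering, PCASHC, is proposed. K-means is used to split clusters in two successively. The sample points farthest apart along the first principal component of PCA are chosen as the initial K-means cluster centers, which removes the nondeterminism caused by random selection of initial centers. The class variance of a cluster's samples is used as its impurity to control the level of splitting, which avoids overfitting and allows a suitable number of clusters to be learned. Classification error was evaluated by 10-fold cross-validation on four standard UCI data sets, and comparison with seven other classifiers shows that PCASHC achieves high classification accuracy.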

8.
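The multi-means clustering algorithm with K specified clusters sets up multiple subclasses on top of K-means to mitigate the weakness of K-means on non-convex data sets, and formalizes multi-means clustering as an optimization problem, which can yield better clustering. However, the algorithm is sensitive to the initial prototypes, and selecting prototypes at random makes the clustering results unstable. To address these problems, a stable K-multiple-means clustering algorithm is proposed, and its complexity and convergence are briefly discussed. The algorithm first constructs a graph from the nearest-neighbor relations among the data samples, splits the data into groups according to the connected components of the graph, and takes the mean of each group as an initial prototype. It then solves the optimization problem by alternating iteration to obtain the final clustering. Experiments on synthetic and real data sets show that the algorithm clusters more stably and with better results.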

9.
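Cytochrome P450 2C9 is an important metabolic enzyme in the human liver; it takes part in the metabolism of many drugs and accounts for about 15%~20% of total CYP450 protein. Drawing on deep learning, a classification model for CYP450 2C9 inhibition based on a deep belief network (DBN) is proposed. The experiments use 13 000 compounds as the data set and represent molecular structure with PubChem and MACCS molecular fingerprints. The DBN's semi-supervised learning is used to learn more essential feature representations from the preprocessed features, avoiding manual feature extraction, and to classify CYP450 2C9 inhibition. The results show that under the same conditions DBN has a clear advantage over SVM and ANN, with an average classification accuracy of 80.6%, a sensitivity (SE) of 86.9% and a specificity (SP) of 66.2%, which is of positive significance for drug screening and new drug development.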

10.
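K-means clustering is applied to separate observations that originate from different targets, and real-time tracking of multiple targets is achieved through inter-class data fusion. The basic idea of K-means clustering of observation data, the clustering procedure and the algorithm implementation are studied, and the Kalman filter equations for maneuvering target tracking are discussed, together with the theoretical basis and corresponding initial values of the easily computed parameter matrices in an air traffic control system. K-means clustering is found to separate different targets well, and tracking fusion after clustering is more accurate. Simulation results show that the filtered tracks obtained after K-means processing are of good quality.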

11.
12.
13.
14.
15.
16.
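Angiotensin-converting enzyme inhibitors (ACEIs) are of great importance in treating hypertension. From a candidate small-molecule data set built from a structurally complex compound database, molecular docking was used to select samples for building classification models. Support vector machine, K-nearest neighbors, decision tree, random forest and Bayesian methods were each used to build classification models of potential angiotensin-converting enzyme inhibitors versus non-inhibitors. Comparison of the results shows that the support vector machine has a higher prediction rate than the other methods, with an overall prediction rate of 82.4% and a correlation coefficient of 0.653. The study shows that the support vector machine method works well for virtual screening of angiotensin-converting enzyme inhibitors.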
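For a two-class model such as inhibitor versus non-inhibitor, an overall prediction rate and a correlation coefficient can both be derived from the confusion-matrix counts, as sketched below. The abstract does not say which correlation coefficient it reports; the Matthews correlation coefficient used here is an assumption, chosen because it is a common choice for binary classifiers, and the counts in the example are invented.

```python
import math


def overall_accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    return (tp + tn) / total


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Invented counts, for illustration only (not data from the study).
acc = overall_accuracy(tp=40, tn=40, fp=10, fn=10)
mcc = matthews_corrcoef(tp=40, tn=40, fp=10, fn=10)
```

The coefficient ranges from -1 to 1 and stays informative when the two classes have very different sizes, which is why it is often reported alongside plain accuracy.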

17.
A three-dimensional (3D) pharmacophore modelling approach was applied to a diverse data set of known cyclin-dependent kinase 9 (CDK9) inhibitors. Diversity sampling and principal components analysis (PCA) were employed to ensure the rational selection of representative training sets. Twelve statistically robust pharmacophore models were generated using the HypoGen algorithm. The resulting models showed high homology and indicated great convergence in ascertaining pharmacophoric features essential for CDK9 inhibitory activity. One of the best models (Hypo 6) was assessed further by external predictive capability, randomization test, as well as its performance in virtual screening. The capability of the resulting models to reliably predict the inhibitory activity of external data sets and discriminate active structures from general databases would assist the identification and optimization of novel CDK9 inhibitors.

18.
Integrase (IN) is a key viral enzyme for the replication of the type-1 human immunodeficiency virus (HIV-1), and as such constitutes a relevant therapeutic target for the development of anti-HIV agents. However, the lack of crystallographic data of HIV IN complexed with the corresponding viral DNA has historically hindered the application of modern structure-based drug design techniques to the discovery of new potent IN inhibitors (INIs). Consequently, the development and validation of reliable HIV IN structural models that may be useful for the screening of large databases of chemical compounds is of particular interest. In this study, four HIV-1 IN homology models were evaluated with respect to their capability to predict the inhibition potency of a training set comprising 36 previously reported INIs with IC50 values in the low nanomolar to high micromolar range. Also, 9 inactive structurally related compounds were included in this training set. In addition, a crystallographic structure of the IN-DNA complex corresponding to the prototype foamy virus (PFV) was evaluated as a structural model for the screening of inhibitors. The applicability of high throughput screening techniques, such as blind and ligand-guided exhaustive rigid docking, was assessed. The receptor models were also refined by molecular dynamics and clustering techniques to assess protein sidechain flexibility and solvent effect on inhibitor binding. Among the studied models, we conclude that the one derived from the X-ray structure of the PFV integrase exhibited the best performance in ranking the potencies of the compounds in the training set, with the predictive power being further improved by explicitly modeling five water molecules within the catalytic site of IN. Also, accounting for protein sidechain flexibility enhanced the prediction of inhibition potencies among the studied compounds. Finally, an interaction fingerprint pattern was established for the fast identification of potent IN inhibitors. In conclusion, we report an exhaustively validated receptor model of IN that is useful for the efficient screening of large chemical compound databases in the search for potent HIV-1 IN inhibitors.

19.
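A study of virtual screening methods for acetylcholinesterase inhibitors   Total citations: 3, self-citations: 1, citations by others: 3
In virtual screening, the screening strategy and method are the basis for obtaining results, but they must be checked against experimental data. Identifying reasonable constraints for virtual screening provides a methodological basis for large-scale virtual screening. An activity assay for acetylcholinesterase inhibitors was established and used to measure the biological activity of more than 4000 compounds, of which 34 showed relatively high activity. In parallel, six different virtual screening methods were designed and each applied to two virtual screening models for acetylcholinesterase inhibitors. The compounds predicted to be active, and those predicted active by both models, were compared with the experimentally active compounds. The important factors in virtual screening of acetylcholinesterase inhibitors and ways to further enrich active compounds are analyzed and discussed, providing a reliable basis for large-scale virtual screening of acetylcholinesterase inhibitors and a reference for other virtual screening based on protease structures.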

20.
The determination of inhibitory effects that lead compounds have on cytochrome P450 (CYP) enzymes is an important part of today's drug discovery process. Assays can be performed early in the discovery process to predict adverse drug-drug interactions caused by CYP inhibition and to minimize the costs associated with terminating candidates in late stage development or, worse, removing a drug from the market after launch. For early discovery work, testing substantial numbers of compounds is desirable, thus automated "mix and read" assays are beneficial. Here, we demonstrate the automation of the CYP profiling process using a simple, yet robust robotic platform. Compound titration, as well as transfer of compounds and assay components, were performed by the same automated pipetting system. IC50 values of small molecule drugs were determined using recombinant CYP enzymes, CYP3A4, -2C9, and -2D6 and luminogenic substrates specific to each. Compounds were profiled against all three enzymes on the same 384-well assay plate.
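One simple way to turn a compound titration into an IC50 estimate is to interpolate on a log-concentration scale between the two doses that bracket 50% inhibition, as in the sketch below. The abstract does not describe how its IC50 values were computed, and fitting a sigmoidal dose-response curve is the more usual practice; the interpolation, the function name and the data are assumptions for illustration.

```python
import math


def ic50_by_interpolation(concentrations, inhibition_pct):
    """Estimate IC50 by log-linear interpolation; returns None if 50% is never bracketed.

    Concentrations must be positive and in the same unit as the desired result.
    """
    pairs = sorted(zip(concentrations, inhibition_pct))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(pairs, pairs[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Invented titration data: concentration in micromolar, inhibition in percent.
estimate = ic50_by_interpolation([0.1, 1.0, 10.0, 100.0], [10.0, 30.0, 70.0, 95.0])
```

Returning `None` when the curve never crosses 50% avoids reporting an extrapolated value for compounds that are too weak or too potent for the tested range.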
