Similar literature
1.
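To build a computational model for predicting the degradation of compounds, and to identify the parameters that clearly differ between degradable and non-degradable compounds, 389 organic molecules were selected as the dataset: 312 formed the training set and the remaining 77 the validation set. For each molecule, 195 molecular parameters were calculated. Models were built with stepwise discriminant analysis and with principal component analysis, and their predictive ability was checked on the external validation set. Results: with stepwise discriminant analysis, the accuracies for degradable and non-degradable compounds were 90.6% and 69.5% on the training set and 83.9% and 63.6% on the validation set. With principal component analysis, the accuracies for degradable and non-degradable compounds were 80.4% and 31.8% on the test set, and 67.9% and 50.0% on the validation set. The mathematical model built by stepwise discriminant analysis can therefore serve as a model for predicting compound degradation. The above research can provide a reference for predicting the degradation of organic compounds.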

2.
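To address the poor extraction of spatial features and the low prediction accuracy of video prediction based on traditional deep learning, a video prediction model that combines involution and convolution operators (CICO) is proposed. The model improves prediction on video sequences in three ways. First, convolution kernels of different sizes strengthen the extraction of multi-granularity spatial features: larger kernels capture features over a wider spatial range, while smaller kernels capture the motion details of video targets more precisely, giving a multi-angle representation of the target. Second, the convolution operators with larger kernels are replaced by involution operators, which are more computationally efficient and have fewer parameters; involution avoids a large number of unnecessary parameters through efficient inter-channel interaction, lowering computation and storage costs while improving predictive ability. Finally, convolutions with 1×1 kernels are introduced for linear mapping, which strengthens the joint expression of different features, improves the utilization of model parameters, and makes prediction more robust. The model was tested comprehensively on several datasets. The results show that, compared with the current best SimVP (Simpler yet better Video Prediction) model, the proposed model improves significantly on multiple metrics. On the moving handwriting dataset, the mean squared error and mean absolute error fall by 25.2% and 17.4%, respectively; on the Beijing traffic dataset, the mean squared error falls by 1.2%; on the human action dataset, the structural similarity index and peak signal-to-noise ratio rise by 0.66% and 0.4...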

3.
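Research on an SVM-based method for infrared spectral analysis of baijiu (Chinese liquor)   Total citations: 1, self-citations: 0, citations by others: 1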
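To automate the sensory evaluation of baijiu, infrared spectra were collected from 297 samples of different aroma types, 86 samples of different grades, and 60 samples of different vintages, 443 spectra in total. Baseline drift in these spectra was corrected by cubic polynomial interpolation fitting, spectral noise was removed by wavelet soft thresholding, and scattering effects were eliminated by standard normalization. For each of the three classification problems (aroma type, grade, and vintage), 75% of the samples were chosen as the training set and the remaining 25% as the test set. Support vector machine (SVM) models were built for aroma type, grade, and vintage, and their classification performance was verified on the test sets. The experiments show that the method is effective: accuracy exceeds 98% for aroma type, exceeds 92% for grade, and reaches 100% for vintage.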

4.
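To raise the accuracy of identifying hazardous chemicals from passive infrared remote-sensing spectra, a support vector machine identification model is proposed. Using a sample set of passive infrared remote-sensing spectra of ammonia measured in the field, the penalty factor C was varied to compare the performance of the Gaussian kernel and the polynomial kernel, and a grid search was used to find the best model parameters, yielding an SVM-based identification model. The model obtained from 40 training samples reached an identification accuracy of 93.6% on a test set of 267 samples, clearly better than a BP neural network identification model with a three-layer structure. The experiments show that the SVM identification model is an effective method for identifying hazardous chemicals from infrared remote-sensing spectra.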

5.
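Fourier transform infrared spectroscopy was used to measure the infrared spectra of 45 goji (wolfberry) samples from different producing areas in Qinghai Province. The raw spectral data were preprocessed with a wavelet transform, which compressed them to 1/8 of their original size while keeping analytical precision essentially equal to that of the raw spectra. The 45 samples were divided into 30 training samples and 15 test samples, and a random forest (RF) model was built to predict the origin of goji, validated by internal cross-validation and on external data. The random forest algorithm was implemented in R, and the model parameters were optimized. In the resulting discriminant model, the accuracy was 100% on the training samples and 100% on the test samples. The results show that the model identifies the origin of goji samples correctly and quickly, and that infrared spectroscopy combined with random forest can serve as a new, modern method for classifying and identifying the producing regions of Chinese medicinal materials.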
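The 1/8 compression mentioned in this abstract can be illustrated with a rough sketch. The abstract does not name the wavelet or the decomposition depth, so the Haar wavelet, the three decomposition levels, and the sample `spectrum` below are all assumptions chosen only because three halvings give exactly 1/8 of the points.

```python
import math

def haar_approx(signal):
    """One Haar decomposition level, keeping only the approximation coefficients."""
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])  # pad odd-length input by repeating the last point
    return [(values[i] + values[i + 1]) / math.sqrt(2)
            for i in range(0, len(values), 2)]

def compress(signal, levels=3):
    """Apply `levels` Haar levels; each level halves the number of points."""
    out = list(signal)
    for _ in range(levels):
        out = haar_approx(out)
    return out

# Made-up stand-in for an infrared spectrum (64 absorbance values).
spectrum = [float(i % 16) for i in range(64)]
compressed = compress(spectrum)
print(len(spectrum), "->", len(compressed))
```

Discarding the detail coefficients is what shrinks the data; whether the retained approximation preserves enough information for classification is an empirical question that the abstract answers only for its own dataset.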

6.
Classification of maize leaf disease images with an improved residual network   Total citations: 1, self-citations: 0, citations by others: 1
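To address the low accuracy and slow speed of traditional methods for recognizing maize leaf disease images, a maize leaf image recognition algorithm based on an improved deep residual network is proposed. The improvements are: replacing the 7×7 convolution kernel in the first convolutional layer of the conventional ResNet-50 model with three 3×3 convolution kernels; using the LeakyReLU activation function in place of ReLU; and changing the order of the batch normalization layer, activation function, and convolutional layer within the residual block. In data preprocessing, the training and test sets were split in a 4:1 ratio, and the training set was expanded by data augmentation. The improved ResNet-50 model uses transfer learning to obtain weight parameters pre-trained on ImageNet. The experiments show that the improved network reaches 98.3% accuracy in classifying maize leaf disease images, a large gain in accuracy over other network models with further improved robustness, and can provide a reference for recognizing maize leaf diseases.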
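A small arithmetic sketch may help explain why three 3×3 kernels can stand in for one 7×7 kernel. It assumes stride-1 convolutions and, for the weight comparison, a hypothetical equal channel width of 64 on every layer; the abstract gives neither the strides nor the channel widths of the modified first layer, and the LeakyReLU slope of 0.01 is likewise an assumed default.

```python
def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions applied one after another."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf

def conv_weights(kernel, c_in, c_out):
    """Number of weights in one 2D convolution layer (bias ignored)."""
    return kernel * kernel * c_in * c_out

def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU: identity for x >= 0, a small linear slope for x < 0."""
    return x if x >= 0 else negative_slope * x

width = 64  # hypothetical channel width, used only for the comparison
print("receptive field, one 7x7:  ", stacked_receptive_field([7]))
print("receptive field, three 3x3:", stacked_receptive_field([3, 3, 3]))
print("weights, one 7x7:  ", conv_weights(7, width, width))
print("weights, three 3x3:", 3 * conv_weights(3, width, width))
```

Under these assumptions both options see the same 7×7 neighbourhood, and the stacked version uses fewer weights while inserting two extra nonlinearities. With different channel widths between the stacked layers the weight comparison can change, so the numbers here describe the sketch and not the network in the paper.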

7.
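Protein function often shows itself in the interactions of biological macromolecules, so identifying protein interaction sites is important for studying protein function. Proteins interact mainly through surface residues, and when an interaction forms a complex, only some of the surface residues take part. Based on sequence profile information, the sequence profiles of residues adjacent in the sequence were extracted as input feature vectors. For residue windows of size 3 and 7 (win3, win7), support vector machine (SVM) classifiers were used to predict protein interaction sites, and the results were compared and analyzed. The final experimental results: the average accuracy was 69.31% for win3 and 69.68% for win7.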
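The window construction described here (win3, win7) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 20-column profile, the zero padding at the sequence ends, and the toy `profile` values are hypothetical.

```python
def window_features(profile, win):
    """Concatenate the profile rows of `win` consecutive residues centred on each position.

    Positions that fall outside the sequence are filled with zeros (an assumption).
    """
    half = win // 2
    pad = [0.0] * len(profile[0])
    features = []
    for i in range(len(profile)):
        row = []
        for j in range(i - half, i + half + 1):
            row.extend(profile[j] if 0 <= j < len(profile) else pad)
        features.append(row)
    return features

# Made-up profile: 5 residues, 20 scores per residue.
profile = [[float(r * 10 + c) for c in range(20)] for r in range(5)]
win3 = window_features(profile, 3)
win7 = window_features(profile, 7)
print(len(win3[0]), len(win7[0]))
```

Each residue becomes one feature vector whose length grows linearly with the window size, so win7 gives the classifier more sequence context at the cost of a higher-dimensional input.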

8.
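Research on kernel parameter selection for support vector machines based on the PSO algorithm   Total citations: 2, self-citations: 0, citations by others: 2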
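Choosing the parameters of the kernel function is a very important problem for support vector machines, because it directly affects the generalization ability of the model. This paper proposes using the particle swarm optimization algorithm to search for the optimal SVM kernel parameters and tests the method on the Checker dataset. The experiments show that kernel parameters selected in this way improve both classification accuracy and prediction accuracy, so the method has some practical value.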
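A minimal sketch of a particle swarm search over two hyperparameters is given below. It is a generic textbook PSO and not the paper's implementation: the inertia and acceleration constants, the search bounds, and in particular `toy_error`, a smooth stand-in for the cross-validated SVM error that a real run would compute, are all assumptions.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's personal best
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def toy_error(params):
    """Hypothetical stand-in for validation error as a function of (log C, log gamma)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2

best, err = pso_minimize(toy_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)])
print(best, err)
```

In a real setting the objective would train and validate an SVM for each candidate parameter pair, which makes every evaluation expensive; searching in log space, as assumed here, is a common way to cover several orders of magnitude with few particles.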

9.
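Structural features of protein-protein interfaces are important for studying protein function. A new method based on statistical histograms is proposed for extracting interface features, and the extracted features are combined with a probabilistic neural network to classify and predict the structural type of an interface. The prediction results show that the histogram features discriminate well between interface structures. By adjusting the number of bins and the way nodes are selected, the interface structure can be described at different granularities to suit studies with different aims, which may be suggestive for some structure-related problems in bioinformatics. Classifying interface structures with a probabilistic neural network avoids time-consuming structure alignment and database searches; training is fast, extensibility is strong, and accuracy is high, with an apparent accuracy of 90.67% on an independent test set of 911 protein complexes. MATLAB classifier software based on this algorithm is available by contacting the authors by e-mail.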

10.
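The biological activity of 125 sulfonamide inhibitors of carbonic anhydrase II was predicted. A series of 2D and 3D structural descriptors of the compounds was calculated with the ADRIANA.Code software, and 12 descriptors were selected for modeling. The dataset was divided into two different pairs of training and test sets, one by random partitioning and one by a Kohonen self-organizing neural network. For each pair, models were built with multiple linear regression (MLR) and with a support vector machine (SVM), giving four models in total. For the two SVM models, the correlation coefficient was above 0.92 on the training set and above 0.90 for test-set predictions. All models can be used further for virtual screening of carbonic anhydrase II inhibitors.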

11.
As many structures of protein–DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One reason is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein–DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other uses both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3%, and an MCC of 0.324. The other SVM model, which uses both DNA and protein sequences, achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5%, and an MCC of 0.329. In both cross-validation and independent testing, the second SVM model, which used both DNA and protein sequence data, performed better than the first model, which used DNA sequence data only. To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone.
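For readers unfamiliar with the three measures reported in this abstract, the sketch below computes accuracy, F-measure, and MCC from a binary confusion matrix using their standard definitions. The counts in the example call are made up and are not taken from the paper.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) for a binary confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc

# Hypothetical counts: 40 true positives, 10 false positives,
# 35 true negatives, 15 false negatives.
acc, f1, mcc = binary_metrics(tp=40, fp=10, tn=35, fn=15)
print(f"accuracy={acc:.3f} F-measure={f1:.3f} MCC={mcc:.3f}")
```

MCC ranges from -1 to 1 and stays near 0 for a classifier that guesses, which is why values such as 0.340 can accompany accuracies in the high 60s on a roughly balanced problem.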

12.
Accurate protein secondary structure prediction plays an important role in direct tertiary structure modeling, and can also significantly improve sequence analysis and sequence-structure threading for structure and function determination. Hence improving the accuracy of secondary structure prediction is essential for future developments throughout the field of protein research. In this article, we propose a mixed-modal support vector machine (SVM) method for predicting protein secondary structure. Using the evolutionary information contained in the physicochemical properties of each amino acid and a position-specific scoring matrix generated by a PSI-BLAST multiple sequence alignment as input for a mixed-modal SVM, secondary structure can be predicted at significantly increased accuracy. Using a Knowledge Discovery Theory based on the Inner Cognitive Mechanism (KDTICM) method, we have proposed a compound pyramid model, which is composed of three layers of intelligent interface that integrate a mixed-modal SVM (MMS) module, a modified Knowledge Discovery in Databases (KDD1) process, a mixed-modal back propagation neural network (MMBP) module and so on. Testing against data sets of non-redundant protein sequences returned values for the Q3 accuracy measure that ranged from 84.0% to 85.6%, while values for the SOV99 segment overlap measure ranged from 79.8% to 80.6%. When compared using a blind test dataset from the CASP8 meeting against currently available secondary structure prediction methods, our new approach shows superior accuracy. Availability: http://www.kdd.ustb.edu.cn/protein_Web/.
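The Q3 measure quoted above is the percentage of residues whose predicted three-state label matches the observed one. A minimal sketch follows, assuming the usual H (helix), E (strand), C (coil) alphabet and two made-up label strings; the SOV99 segment overlap measure is more involved and is not shown.

```python
def q3(predicted, observed):
    """Percentage of residues with a correct three-state (H/E/C) prediction."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Hypothetical six-residue example with one wrong label.
score = q3("HHHECC", "HHEECC")
print(f"Q3 = {score:.1f}%")
```

Because Q3 counts residues independently, it can look good even when predicted segments are fragmented, which is one reason the abstract reports a segment-based score alongside it.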

13.
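To raise the prediction accuracy for O-linked glycosylation sites in proteins, a method that combines independent component analysis with a support vector machine is proposed. The experimental samples (protein sequences) were encoded with sparse coding, with a window length of w = 21. For the training samples and the samples to be predicted, independent component analysis (ICA) first extracted 120 independent components (features); these components were then used as the input of a support vector machine (SVM), which performed the prediction (classification) in feature space. The experiments show that ICA+SVM performs better than PCA+SVM and than SVM alone, with a prediction accuracy of 88%. Further experiments on samples of the same protein sequences at different window lengths show that the longer the window, the higher the prediction accuracy.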
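The sparse encoding with a window of w = 21 can be sketched as a one-hot encoding of each residue in the window. The abstract does not give the alphabet, so the 21-symbol alphabet below (20 amino acids plus a hypothetical `-` padding symbol for positions beyond the sequence ends) and the example sequence are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "-"  # "-" marks window positions outside the sequence (assumed)

def sparse_encode(sequence, center, w=21):
    """One-hot encode the w residues centred on `center` into one flat vector."""
    half = w // 2
    vector = []
    for pos in range(center - half, center + half + 1):
        symbol = sequence[pos] if 0 <= pos < len(sequence) else "-"
        one_hot = [0] * len(ALPHABET)
        one_hot[ALPHABET.index(symbol)] = 1  # raises ValueError for unknown symbols
        vector.extend(one_hot)
    return vector

# Made-up sequence; position 3 is a serine (S), a candidate O-glycosylation site.
encoded = sparse_encode("MKTSAYIAKQR", center=3)
print(len(encoded), sum(encoded))
```

With these assumptions each sample has 21 × 21 = 441 dimensions, almost all zero, which is the kind of input that a dimensionality-reduction step such as ICA or PCA is then meant to condense.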

14.
In the post-genome period, protein domain structures are being published rapidly, but they have not been studied comprehensively. To understand cell function, recent research uses protein–DNA interactions to interpret protein domain structures. Several machine-learning methods have been applied to this problem; however, they do not efficiently translate the tertiary structure characteristics of proteins into appropriate features for predicting DNA-binding proteins. In this work, a novel machine-learning approach based on hidden Markov models identifies the characteristics of DNA-binding proteins from their amino acid sequences and tertiary structures. After the features are distilled from DNA-binding proteins, a support vector machine based classifier predicts general DNA-binding proteins with an accuracy of 88.45% in five-fold cross-validation. Furthermore, we construct a response element specific classifier for predicting response element specific DNA-binding proteins, which achieves a precision of 96.57% with a recall of 88.83% on average. To verify the prediction of DNA-binding proteins, we used DNA-binding proteins from MCF-7 that are likely to bind estrogen response elements (ERE), and the results show that our methods can be applied in practice.

15.
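Research and application of an integrated grey support vector machine prediction model   Total citations: 2, self-citations: 1, citations by others: 1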
Lin Yaojin, Zhou Zhongmei, Wu Shunxiang. 《计算机应用》 (Journal of Computer Applications), 2009, 29(12): 3287-3289
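The grey prediction model GM(1,1) is analyzed, and a prediction model that integrates grey models with a support vector machine is proposed. Three factors that affect the accuracy of GM(1,1) are each improved: the calculation of the background value, the choice of the initial value, and the smoothness of the data sequence. This gives a background-value GM model, an initial-value GM model, and a smoothness GM model. Exploiting the characteristics of support vector machines, the three sets of values that the three grey models produce from the one-dimensional original data sequence are used as the SVM input and the original sequence as the SVM output, and training yields the best support vector regression model. Simulation results demonstrate the effectiveness of the model.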
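As background for this abstract, the sketch below implements the plain textbook GM(1,1) model: accumulate the series, form mean-based background values, fit the development coefficient `a` and grey input `b` by least squares, and difference the fitted accumulated curve. It does not reproduce the paper's three modified GM variants or the SVM stage, and the example `series` is made up.

```python
import math

def gm11_fit(x0):
    """Estimate (a, b) of the textbook GM(1,1) model x0[k] + a * z1[k] = b."""
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)                      # accumulated (1-AGO) series
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]  # background values
    y = x0[1:]
    n = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (n * szy - sz * sy) / (n * szz - sz * sz)  # least squares for y = slope*z + b
    b = (sy - slope * sz) / n
    return -slope, b

def gm11_predict(x0, steps):
    """Fitted values for x0 followed by `steps` forecasts (requires a != 0)."""
    a, b = gm11_fit(x0)

    def x1_hat(k):
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    return [x0[0]] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, len(x0) + steps)]

# Made-up series growing by 10% per step.
series = [100.0, 110.0, 121.0, 133.1, 146.41]
fitted = gm11_predict(series, steps=1)
print([round(v, 2) for v in fitted])
```

The mean-based background value and the choice of the first observation as the initial condition are the two places where this textbook form is approximate, which matches two of the three factors the abstract says it modifies.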

16.
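Membrane proteins have important biological functions. Predicting from sequence information whether a protein is a β-barrel transmembrane protein is an important first step for structure prediction and functional analysis, and also a challenging problem in protein prediction. For this two-class problem, amino acid position and physicochemical features were extracted from 208 β-barrel transmembrane protein sequences, and prediction was carried out with a support vector machine (SVM). The results show a binary classification accuracy of 88.36% and a correlation coefficient of 0.7723.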

17.
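Support vector machine regression assumes that the prediction noise follows a symmetric probability distribution, whereas real short-term traffic flow data series are non-stationary. When SVM regression is used for short-term traffic flow prediction, it is therefore hard to guarantee a symmetric distribution of the prediction noise, which degrades prediction accuracy. To address this, after proving that the prediction noise of SVM regression on a stationary time series follows a symmetric probability distribution, simulated predictions were made on stationarized and on non-stationarized short-term traffic flow observation series, and the results were compared. The analysis shows that the stationarized short-term traffic flow prediction method lowers the root mean square error by about 21.6%, the absolute error by about 21.3%, and the relative error by about 17.3%. The simulation results verify the effectiveness of the proposed method.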
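The abstract does not say how the traffic flow series was made stationary. As one common possibility, the sketch below uses first-order differencing (an assumption) and also defines the three error measures mentioned, using their usual definitions; the `flow` values are made up.

```python
import math

def difference(series):
    """First-order differencing: a common way to remove a trend."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference(first_value, diffs):
    """Invert `difference` so predictions return to the original scale."""
    out = [first_value]
    for d in diffs:
        out.append(out[-1] + d)
    return out

def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mean_relative_error(actual, predicted):
    """Mean of |a - p| / |a|; requires every actual value to be nonzero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

# Made-up vehicle counts per interval.
flow = [120, 132, 128, 141, 150]
diffs = difference(flow)
restored = undifference(flow[0], diffs)
print(diffs, restored)
```

A regression model would be trained on `diffs` and its predictions passed through `undifference` before the errors are computed against the original series, so that stationarized and non-stationarized runs are compared on the same scale.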

18.
The prediction of secondary structure is an important topic in bioinformatics, even though the methods have matured and algorithm development is far less active than a decade ago. Accurate prediction is very useful to biologists in its own right, and it is also an essential component of tertiary structure prediction, which in contrast is far from solved and remains a highly active area of research. In addition, sequence comparison methods have more recently incorporated local structure tracks. The extra information used by the new methods has led to considerable improvements in fold recognition and alignment accuracy. In this paper, a novel method for protein secondary structure prediction is presented. Using the evolutionary information contained in the physicochemical properties of amino acids, the position-specific scoring matrix generated by PSI-BLAST, and HMMER3 profiles as input to a hybrid back propagation system, secondary structure can be predicted at significantly increased accuracy. Based on the knowledge discovery theory based on inner cognitive mechanism (KDTICM), we have constructed a compound pyramid model (CPM) approach, which is composed of four layers of intelligent interface and integrates several components, such as a hybrid back propagation method (HBP), modified knowledge discovery in databases (KDD*), a hybrid SVM method (HSVM) and so on. Experiments on three standard datasets (RS126, CB513 and CASP8) show that CPM produces higher Q3 and SOV scores than existing widely used schemes such as PSIPRED, PHD, and Predator, as well as previously developed prediction methods. On the RS126 and CB513 datasets, its Q3 and SOV99 scores are considerably higher than the best reported scores. It was also tested on target proteins of the critical assessment of protein structure prediction experiment (CASP8) and achieves better overall prediction accuracy than traditional methods, including the popular PSIPRED method. Available: .

19.
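Predicting protein secondary structure is a hard, unsolved problem in bioinformatics today. Because the accuracy of secondary structure prediction plays a very important role in protein structure research, a protein secondary structure prediction algorithm based on a hybrid SVM method is proposed, built on KDTICM theory. The algorithm uses the physicochemical properties of the protein and the position-specific scoring matrix generated by PSI-SEARCH as the input of a two-layer SVM, which greatly improves the accuracy of secondary structure prediction. Comparative experiments show that the prediction accuracy and generality of the new algorithm are clearly better than those of other typical current prediction methods.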

20.
With the rapidly increasing pace of genome sequencing projects and the resulting flood of predicted amino acid sequences of uncharacterized proteins, protein sequence analysis, and in particular protein structure prediction, is quickly gaining in importance. Prediction algorithms can be used for preliminary annotation of newly sequenced proteins and, at least in some cases, provide insights into their function and specific mode of action. Such annotations for several microbial genomes were performed by several groups and placed in the public domain for evaluation. An example presented in this work comes from a related project of structural and functional predictions for proteins involved in the process of controlled cell death (apoptosis). The BID protein belongs to an important class of regulators of apoptosis identified by short sequence motifs. Here, several fold prediction methods are used to build a series of three-dimensional models. Structure analysis of the models with reference to the available biological data allows selection of the most appropriate model. It is found that the most likely structural model of BID is built on the structure of Bcl-X(L). The model is discussed in terms of experimental data on specific proteolytic cleavage of BID and its effect on BID interactions with other proteins and membranes.

