Similar literature
 20 similar documents found (search time: 0 ms)
1.
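In traditional protein secondary structure prediction, the three structure classes occur in unequal numbers in amino acid sequences. This easily leads to imbalanced training, so that prediction accuracy differs considerably among the three classes. To mitigate this defect, the traditional method is improved, inspired by the bagging principle, so as to narrow the gap between the numbers of the three structures during training. Experiments on the CB396 dataset show that the method significantly improves the prediction accuracy for strands and reaches an overall prediction accuracy of 77.3%, so it performs protein secondary structure prediction well.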
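
The class-balancing idea can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not specify. It assumes that each training bag is drawn by sampling the same number of residues from each of the three classes (H, E, C), with replacement. The function name `balanced_bag` and the toy labels are hypothetical.

```python
import random
from collections import defaultdict

def balanced_bag(samples, labels, per_class, rng):
    """Draw one training bag with `per_class` items from every class.

    Sampling is with replacement, as in bagging, so a rare class can
    contribute as many items as a common one.
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    bag = []
    for y in sorted(by_class):
        bag.extend((rng.choice(by_class[y]), y) for _ in range(per_class))
    rng.shuffle(bag)
    return bag

# Toy data (assumption): coil (C) dominates, strand (E) is rare.
labels = ["C"] * 60 + ["H"] * 30 + ["E"] * 10
samples = list(range(len(labels)))  # stand-ins for feature windows
bag = balanced_bag(samples, labels, per_class=20, rng=random.Random(0))
counts = {y: sum(1 for _, lab in bag if lab == y) for y in "HEC"}
```

Each bag then holds equal numbers of the three classes, whatever their proportions in the original data.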

2.
Protein secondary structure prediction is one of the most important problems in bioinformatics. After more than 30 years of study there have been some breakthroughs. In particular, the introduction of ensemble and hybrid prediction models has improved prediction accuracy, but inferring tertiary structure from secondary structure remains a distant goal. As an extension of the KDTICM theory [Bingru, Yang (2004). Knowledge discovery based on theory of inner cognition mechanism and application. Beijing: Electronic Industry Press], this paper proposes KAAPRO, a method for protein secondary structure prediction based on the Maradbcm algorithm, which is derived from the KDD* model and combined with CBA. A gradually enhanced, multi-layer systematic prediction model, the compound pyramid model, is also proposed, with KAAPRO as its kernel. Domain knowledge is used throughout the model, and the physical–chemical attributes are chosen by causal cellular automata. In the experiment, the test proteins used by Muggleton et al. (Muggleton, S. H., King, R., Sternberg, M. (1992). Protein secondary structure prediction using logic-based machine learning. Protein Engineering, 5(7), 647–657) are predicted. KAAPRO predicts well the structures of amino acids whose structural traits are obscure, and the result of the overall model is also satisfactory.

3.
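Yang Bingru, Zhou Zhun, Hou Wei. 《计算机应用研究》 (Application Research of Computers), 2009, 26(12): 4617-4620
Protein secondary structure prediction is one of the most important tasks in bioinformatics. After more than thirty years of research some progress has been made. In particular, the recent introduction of ensemble and hybrid prediction models has improved prediction accuracy to some extent. However, the goal of deriving tertiary structure from secondary structure is still far off. To improve the accuracy of protein secondary structure prediction effectively, this work builds on extension research of the KDTICM theory and on the KDD* model, uses KAAPRO, an association-analysis method for protein secondary structure prediction based on the KDD* model, and proposes a CBA (classification based on association) with a complex distance measure based on support and confidence ...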

4.
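A protein secondary structure prediction model based on a neural network ensemble   Total citations: 2, self-citations: 3, citations by others: 2
To improve the accuracy of protein secondary structure prediction, this paper explores a prediction model based on an ensemble of cascaded BP networks. First, because secondary structure is determined by the primary sequence and the outputs of a neural network are correlated, cascaded BP networks are used as the sub-network classifiers of the ensemble, with pruning and early stopping applied during training to prevent overfitting. Second, to increase the diversity of the networks, the samples are resampled with the bagging method and random noise is added. Five separately trained sub-networks with a certain degree of diversity are combined by the plurality "voting rule". Ten-fold cross-validation on 90 proteins from RS126, 15,377 amino acids in total, shows in simulation that the network ensemble classifies secondary structure well.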
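
The plurality "voting rule" that combines the sub-networks can be sketched as follows. The sketch assumes that each sub-network outputs one label per residue (H, E or C). Breaking ties in favour of the earliest classifier is an assumption, since the abstract does not say how ties are handled. `plurality_vote` and the toy predictions are hypothetical.

```python
from collections import Counter

def plurality_vote(predictions):
    """Combine per-residue labels from several classifiers.

    `predictions` is a list of equal-length label strings, one per
    classifier. At each position the most frequent label wins; ties go
    to the tied label that appears first in classifier order.
    """
    combined = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # first label, in classifier order, that reaches the top count
        combined.append(next(lab for lab in column if counts[lab] == best))
    return "".join(combined)

# Five hypothetical sub-network outputs for a 6-residue fragment.
outputs = ["HHHECC", "HHEECC", "HHHECC", "CHHEEC", "HHHCCC"]
consensus = plurality_vote(outputs)
```

Plurality voting needs only the most frequent label, not an absolute majority, which matters once there are three classes and five voters.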

5.
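The structure and function of proteins have long been a research focus in the life sciences. Although protein secondary structure prediction is widely applied, its accuracy has been limited by the algorithms used. This paper replaces the traditional amino acid encoding with a composite encoding and, taking into account the influence of amino acid hydrophobicity on protein structure, proposes a new support vector machine algorithm. Seven-fold cross-validation shows that the algorithm improves the accuracy of protein secondary structure prediction and saves computing resources.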

6.
A current trend in intelligent systems research is to specialize a general intelligent prediction system into an individualized intelligent prediction system applied in a specific field. Protein structure prediction is a challenging problem of international interest. In this paper, we propose a new intelligent prediction system model, designed as a multi-layer compound pyramid model, for predicting protein secondary structure. The model comprises four independent intelligent interfaces and several knowledge discovery methods. Domain knowledge permeates the model, and the effective attributes are chosen by Causal Cellular Automata. Furthermore, a high-purity structure database is constructed for training. On the RS126 dataset, the overall per-residue three-state accuracy, Q3, reached 83.99%, while on the CB513 dataset, Q3 reached 85.58%. On the CASP8 sequences, the results are superior to those produced by other methods, such as Psipred, Jpred, APSSP2 and BehairPred. These results confirm that our method has strong generalization ability and that it provides a model for the construction of other intelligent systems.
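
For readers unfamiliar with the metric, Q3 is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. A minimal sketch of the computation follows. The function name `q3` and the two toy sequences are illustrative assumptions, not data from the paper.

```python
def q3(observed, predicted):
    """Percentage of residues whose three-state label is predicted correctly."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)

# Toy example: 13 residues, 2 of them mispredicted.
observed = "CCHHHHHEEEECC"
predicted = "CCHHHHCEEEEEC"
score = q3(observed, predicted)  # 11 of 13 correct
```

Q3 counts residues independently, so it does not reward getting whole segments right. Segment-based measures such as SOV address that and are usually reported alongside it.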

7.
8.
The paper discusses numerical results of predicting protein secondary structure using Bayesian classification procedures based on nonstationary Markovian chains. A new approach is used, based on the classification of pairs of states for pairs of neighboring amino acids. It improves the prediction accuracy as compared with that of the classification of the state of one amino acid. Translated from Kibernetika i Sistemnyi Analiz, No. 2, pp. 59–64, March–April 2007.

9.
The prediction of secondary structure is an important topic in bioinformatics, even though the methods have matured and algorithm development is far less active than a decade ago. Accurate prediction is very useful to biologists in its own right, and it is also an essential component of tertiary structure prediction, which in contrast is far from solved and continues to be a highly active area of research. In addition, sequence comparison methods have more recently incorporated local structure tracks. The extra information used by the new methods has led to considerable improvements in fold recognition and alignment accuracy. In this paper, a novel method for protein secondary structure prediction is presented. Using the evolutionary information contained in amino acids' physicochemical properties, position-specific scoring matrices generated by PSI-BLAST, and HMMER3 profiles as input to a hybrid back propagation system, secondary structure can be predicted with significantly increased accuracy. Based on the knowledge discovery theory based on inner cognitive mechanism (KDTICM), we have constructed a compound pyramid model (CPM) approach, which is composed of four layers of intelligent interfaces and integrates several methods, such as the hybrid back propagation method (HBP), modified knowledge discovery in databases (KDD*), the hybrid SVM method (HSVM) and so on. Experiments on three standard datasets (RS126, CB513 and CASP8) show that CPM produces higher Q3 and SOV scores than existing widely used schemes such as PSIPRED, PHD and Predator, as well as previously developed prediction methods. On the RS126 and CB513 datasets, it achieves Q3 and SOV99 scores considerably higher than the best previously reported scores. It is also tested on target proteins of the critical assessment of protein structure prediction experiment (CASP8) and achieves better overall prediction accuracy than the traditional methods, including the popular PSIPRED method. Available: .

10.
Accurately predicting fabrication cost in a timely manner can enhance corporate competitiveness. This study employs the Evolutionary Support Vector Machine Inference Model (ESIM) to predict the cost of manufacturing thin-film transistor liquid–crystal display (TFT-LCD) equipment. The ESIM is a hybrid model integrating a support vector machine (SVM) with a fast messy genetic algorithm (fmGA). The SVM is primarily concerned with learning and curve fitting, while the fmGA focuses on optimization to minimize errors. Recently completed equipment development projects are used to assess prediction performance. The ESIM is developed to find the fittest C and γ parameters, with minimized prediction error, for cost estimation during conceptual stages. This study describes an actionable knowledge-discovery process using real-world data for high-tech equipment manufacturing industries. Analytical results demonstrate that the ESIM can predict the costs of manufacturing TFT-LCD fabrication equipment with sufficient accuracy.

11.
In this paper, a neural network method based on 'pair-coupled amino acid composition', in which the sequence coupling effects are explicitly included through a series of conditional probability elements, is applied to predict the content of protein secondary structure elements. The prediction was examined by a self-consistency test and on an independent dataset. Both indicated that the neural network method gives good results when predicting the contents of alpha-helix, beta-sheet, parallel beta-sheet strand, antiparallel beta-sheet strand, beta-bridge, 3(10)-helix, pi-helix, H-bonded turn, bend, and random coil.

12.
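Predicting protein secondary structure with a hybrid ANN/HMM model   Total citations: 1, self-citations: 1, citations by others: 0
To address the low accuracy of the 3-state hidden Markov model (HMM) in predicting protein secondary structure, a 15-state HMM is proposed, and an improved algorithm is combined with a BP neural network to predict secondary structure. The study uses 492 protein sequences selected from the CB513 dataset, randomly divided into 7 equal groups. The hybrid model is used for prediction and its accuracy is assessed by 7-fold cross-validation: the Q3 accuracy reaches 77.21% and the SOV value is 72.52%. The results show that the hybrid model both takes full account of the interactions between neighboring amino acid residues and, to some extent, captures long-range correlations in secondary structure, which leads to good prediction accuracy.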

13.
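Vehicle type recognition using SVM   Total citations: 5, self-citations: 0, citations by others: 5
To improve the recognition rate for vehicle images, a method for recognizing car models based on support vector machine (SVM) theory is proposed. An SVM can solve both linear and nonlinear classification problems, determines the separating surface from a small number of support vectors, and is insensitive to the number and dimensionality of samples. Image features determined from the color histogram and the inertia ratio are invariant to translation, rotation and scale. They can be used to determine the optimal separating surface of the SVM and thereby recognize the vehicle type.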

14.
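After analyzing the importance of frequency-domain phase information and texture information in characterizing image features, an SVM image segmentation method that combines phase congruency and texture features is proposed. The method combines phase congruency statistical features, texture features and gray-level features into a training feature vector and segments the image with a support vector machine classifier. Compared with traditional methods, the extracted statistical feature vector effectively reflects image edge detail and texture information. Experimental results show that the method is more effective than the traditional SVM image segmentation method, especially when the target region has low edge contrast and rich texture.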

15.
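To handle blurred edges, where edge points transition gradually into non-edge points, an edge detection algorithm based on the fuzzy support vector machine is proposed. The algorithm forms a 6-dimensional feature vector from the gray-level gradients in 4 directions within a 3×3 image window, the gradient magnitude and the gradient direction. It uses a radial basis kernel function to map the sample feature vectors into a high-dimensional space, where the optimal separating hyperplane is constructed. The membership degree of each sample is determined from its normalized gradient magnitude, and edge detection is then performed with the fuzzy support vector machine. Experimental results demonstrate the feasibility of the method.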

16.
Depth from focus using a pyramid architecture   Total citations: 1, self-citations: 0, citations by others: 1
A method is presented for depth recovery through the analysis of scene sharpness across changing focus position. Modeling a defocused image as the application of a low pass filter on a properly focused image of the same scene, we can compare the high spatial frequency content of regions in each image and determine the correct focus position. Recovering depth in this manner is inherently a local operation, and can be done efficiently using a pipelined image processor. Laplacian and Gaussian pyramids are used to calculate sharpness maps which are collected and compared to find the focus position that maximizes high spatial frequencies for each region.
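
The selection step can be illustrated with a deliberately simplified sketch. Instead of the Laplacian and Gaussian pyramids described in the abstract, it assumes a single-scale 4-neighbour Laplacian as the sharpness measure and picks, for one pixel, the image of the focal stack with the largest response. The function names and the 3×3 toy images are hypothetical.

```python
def laplacian_energy(img, r, c):
    """Absolute 4-neighbour Laplacian at interior pixel (r, c)."""
    return abs(4 * img[r][c] - img[r - 1][c] - img[r + 1][c]
               - img[r][c - 1] - img[r][c + 1])

def best_focus_index(stack, r, c):
    """Index of the image in `stack` that is sharpest at pixel (r, c)."""
    scores = [laplacian_energy(img, r, c) for img in stack]
    return max(range(len(stack)), key=scores.__getitem__)

# Toy focal stack (assumption): one bright point at three focus settings.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]     # fully defocused
blurred = [[0, 1, 0], [1, 5, 1], [0, 1, 0]]  # partly defocused
sharp = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]    # in focus
stack = [flat, blurred, sharp]
index = best_focus_index(stack, 1, 1)
```

Turning the winning index into a metric depth requires knowing which focus distance each image in the stack was taken at. The sketch leaves that calibration out.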

17.
Classification is an essential task in data mining, machine learning and pattern recognition areas. Conventional classification models focus on distinctive samples from different categories. There are fine-grained differences between data instances within a particular category. These differences form the preference information that is essential for human learning, and, in our view, could also be helpful for classification models. In this paper, we propose a preference-enhanced support vector machine (PSVM), that incorporates preference-pair data as a specific type of supplementary information into SVM. Additionally, we propose a two-layer heuristic sampling method to obtain effective preference-pairs, and an extended sequential minimal optimization (SMO) algorithm to fit PSVM. To evaluate our model, we use the task of knowledge base acceleration-cumulative citation recommendation (KBA-CCR) on the TREC-KBA-2012 dataset and seven other datasets from UCI, StatLib and mldata.org. The experimental results show that our proposed PSVM exhibits high performance with official evaluation metrics.

18.
An image segmentation algorithm based on multi-resolution processing is presented. The algorithm applies a local clustering at each level of a linked pyramid data structure, allowing seed nodes to be defined. These seed nodes are the root nodes of regions at the base of the pyramid, and appear in the multi-resolution data structure at a level appropriate to the region size. By applying a merging process followed by a classification step, accurate segmentations are obtained for both natural and synthetic images without the need for a priori knowledge. Results show that the algorithm gives accurate segmentations even at low signal-to-noise ratios.

19.
Pattern Recognition Letters, 2001, 22(3-4): 373-379
Vector quantization (VQ) is a well-known data compression technique. In the codebook design phase as well as the encoding phase, given a block represented as a vector, searching the closest codeword in the codebook is a time-consuming task. Based on the mean pyramid structure and the range search approach, an improved search algorithm for VQ is presented in this paper. Conceptually, the proposed algorithm has the bandpass filter effect. Each time, using the derived formula, the search range becomes narrower due to the elimination of some portion of the previous search range. This reduces search times and improves the previous result by Lee and Chen (A fast search algorithm for vector quantization using mean pyramids of codewords. IEEE Trans. Commun. 43(2/3/4), (1995) 1697–1702). Some experimental results demonstrate the computational advantage of the proposed algorithm.
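
The general idea of narrowing the search with codeword means can be sketched under simplifying assumptions. The sketch below is not the paper's algorithm and uses no mean pyramid. It uses only the top-level bound ||x - c||^2 >= k (mean(x) - mean(c))^2 for k-dimensional vectors, which follows from the Cauchy-Schwarz inequality, to stop once no remaining codeword can beat the best match. The function names and the toy codebook are hypothetical.

```python
def mean(v):
    return sum(v) / len(v)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_codeword(x, codebook):
    """Return (index, squared distance, number of full distance computations).

    Codewords are visited in order of increasing mean difference, so the
    lower bound k * (mean difference)^2 only grows and the loop can stop
    at the first codeword whose bound reaches the best distance so far.
    """
    k = len(x)
    mx = mean(x)
    means = [mean(c) for c in codebook]  # precomputable once per codebook
    order = sorted(range(len(codebook)), key=lambda i: abs(means[i] - mx))
    best_i, best_d, evaluated = -1, float("inf"), 0
    for i in order:
        if k * (means[i] - mx) ** 2 >= best_d:
            break  # all remaining codewords have an even larger bound
        d = sq_dist(x, codebook[i])
        evaluated += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, evaluated

codebook = [[0, 0, 0, 0], [10, 10, 10, 10], [4, 5, 6, 5], [20, 22, 21, 19]]
index, dist, evaluated = nearest_codeword([5, 5, 5, 6], codebook)
```

In this toy case only one of the four codewords needs a full distance computation. The result is still the exact nearest neighbour, because the bound never rejects a codeword that could win.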

20.
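To address the arbitrariness in designing the structure of radial basis function neural networks (RBF NN) and to further optimize RBF network performance, a structure optimization method for RBF networks based on the support vector machine (SVM) is proposed. The trained SVM determines the number of hidden-layer nodes, the hidden-layer weights and the thresholds of the RBF network. The SVM is also used to apply a feature transformation to the input vectors, which further reduces their dimensionality. A gearbox fault diagnosis experiment shows that the optimized RBF network has a more compact and stable structure and yields more accurate diagnoses.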
