Similar articles
 Found 20 similar articles; search took 140 ms
1.
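Song Jie (宋杰). 《计算机应用研究》 (Application Research of Computers), 2009, 26(11): 4051-4053
A new kernel method, the kernel nearest-neighbor algorithm, is used to predict protein-protein interactions. The algorithm is novel, concise, and easy to implement. Experiments show that it predicts better than the traditional nearest-neighbor algorithm and other existing prediction methods, so it can serve as an effective tool for protein-protein interaction prediction.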
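
As a rough illustration of the kernel nearest-neighbor idea in this abstract, the sketch below measures distance in the kernel-induced feature space via d²(x, y) = K(x, x) − 2K(x, y) + K(y, y) and returns the label of the closest training point. The abstract does not say which kernel or features were used, so the polynomial kernel, the toy feature vectors, and the function names are all assumptions made for illustration.

```python
def poly_kernel(x, y, degree=2):
    """Polynomial kernel (x . y + 1)^degree; an assumed, illustrative choice."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + 1.0) ** degree


def kernel_distance_sq(x, y, kernel=poly_kernel):
    """Squared distance between x and y in the kernel-induced feature space."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def kernel_nn_predict(train, query, kernel=poly_kernel):
    """Label of the training point nearest to `query` in feature space.

    `train` is a list of (feature_vector, label) pairs (hypothetical data).
    """
    nearest = min(train, key=lambda item: kernel_distance_sq(item[0], query, kernel))
    return nearest[1]


# Toy two-dimensional features; real inputs would be encoded protein pairs.
train = [((0.0, 0.0), "non-interacting"), ((1.0, 1.0), "interacting")]
print(kernel_nn_predict(train, (0.9, 0.8)))
```

Swapping `poly_kernel` for another positive-definite kernel changes only the distance; the nearest-neighbor rule itself stays the same.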

2.
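Predicting the subcellular localization of apoptosis proteins is an important way to study their biological function and an important area of bioinformatics research. The focus of this research is to improve the accuracy and practicality of prediction models. This study proposes an ensemble classification algorithm that uses the fuzzy K-nearest neighbor classifier as its base classifier. Dipeptide compositions at different intervals within the protein sequence form the basic feature set, and binary particle swarm optimization is used as the feature selection method to extract effective sequence features. The selected features serve as the input vector of each base classifier in the ensemble. In Jackknife tests on two commonly used datasets, the algorithm achieves a prediction accuracy of 91.5% on the C1317 dataset and 88.0% on the ZW225 dataset. Compared with previously reported apoptosis protein subcellular localization predictors evaluated on the same datasets, the method achieves good accuracy.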
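
The "dipeptide composition at different intervals" feature mentioned in this abstract can be sketched as follows. This is a minimal illustration under assumptions: the function name, the `gap` parameter, and the normalization by the number of counted pairs are not taken from the paper, and the feature selection and fuzzy K-nearest neighbor ensemble steps are not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def gapped_dipeptide_composition(seq, gap=0):
    """400-dimensional frequency vector of residue pairs separated by `gap` residues.

    gap=0 gives the ordinary (adjacent) dipeptide composition.
    Pairs containing non-standard residue letters are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]


features = gapped_dipeptide_composition("ACAC", gap=0)
print(len(features), features[PAIRS.index("AC")])
```

Concatenating the vectors for several gap values would give a larger candidate feature set from which a selection method could pick a subset.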

3.
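Time series prediction (TSP) is an important problem in machine learning. This paper proposes an ensemble incremental learning method based on kernel density estimation (KDE) for time series prediction. The algorithm first generates a pool of base learners following the principles of ensemble learning. It then obtains a kernel density estimate from the pool's outputs for the sample to be predicted and uses that estimate to prune the pool. The resulting pruned ensemble predicts the output for the sample. Finally, the algorithm learns incrementally from the set of nearest neighbors selected for the sample on a dynamic selection set. Experiments on the IAP, ICS, and MCD datasets show that the proposed algorithm improves to some extent on currently popular algorithms.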

4.
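A GP prediction algorithm with improved computation of the embedding dimension and time delay   Cited by: 1 (self-citations: 0, citations by others: 1)
The computation of two important characteristic quantities of chaotic systems, the embedding dimension and the time delay, is improved, and the phase space is reconstructed from these two computed parameters. A linear fit of the trajectory is then made in phase space, and the nearest-neighbor point on the trajectory is chosen to make a single prediction. The proposed algorithm combines linear trajectory fitting with the nearest-neighbor method in phase space and addresses the excessive subjectivity of existing time series analysis and prediction algorithms. Validation on a telephone traffic time series and the sunspot time series shows that, compared with other algorithms, its analysis results are stable and accurate, its prediction accuracy is high, and its running time is relatively short.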
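
The phase-space reconstruction and nearest-neighbor step described in this abstract can be sketched as below. This is a simplified illustration, not the paper's algorithm: the embedding dimension `dim` and delay `tau` are simply passed in (the improved way of computing them is not given in the abstract), the linear trajectory fit is omitted, and the function names are hypothetical.

```python
def delay_embed(series, dim, tau):
    """Delay vectors [x[t], x[t+tau], ..., x[t+(dim-1)*tau]] for every valid t."""
    span = (dim - 1) * tau
    return [[series[t + k * tau] for k in range(dim)]
            for t in range(len(series) - span)]


def nn_forecast(series, dim, tau):
    """Predict the next value of `series`.

    Finds the earlier delay vector closest to the most recent one and
    returns the observation that followed it.
    """
    span = (dim - 1) * tau
    vectors = delay_embed(series, dim, tau)
    target = vectors[-1]
    candidates = vectors[:-1]  # every candidate has a known successor

    def dist_sq(v):
        return sum((a - b) ** 2 for a, b in zip(v, target))

    best = min(range(len(candidates)), key=lambda i: dist_sq(candidates[i]))
    # Vector `best` ends at index best + span, so its successor is one step later.
    return series[best + span + 1]


print(nn_forecast([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2], dim=2, tau=1))
```

On noisy data one would typically average over several neighbors or fit a local linear model, which is closer to the fitting step the abstract mentions.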

5.
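In bioinformatics, classifying proteins from a given amino acid sequence and detecting subtle sequence similarity or remote homology are important for accurately predicting protein function and structure. A new remote homology detection method based on the semi-supervised support vector machine is proposed. It defines sequence probability profiles, makes full use of unlabeled data from large databases, constructs the support vector machine kernel function in parallel, and combines a nearest-neighbor classifier to achieve full coverage of any data. Experiments show that the method substantially improves the performance and efficiency of protein sequence classifiers. Parallel techniques keep the total computation time within a bounded range, which promotes wider use of semi-supervised support vector machine classifiers.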

6.
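Predicting protein thermostability from primary structure information is important for computational screening of thermostable proteins. This paper uses the k-nearest neighbor algorithm to predict protein thermostability from sequence and evaluates it with three methods: a self-consistency test, cross-validation, and an independent sample test. With only the composition of the 20 amino acids as feature variables, the recognition accuracies reach 100%, 87.7%, and 89.6%, respectively. After 8 new variables are introduced, the accuracies are 100%, 89.6%, and 90.2%, respectively, and the recognition accuracy for small protein molecules improves by 2.4%. The effect of protein size on recognition performance is also discussed.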

7.
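The kernel adaptive filter (KAF) is one of the key research areas in online time series prediction. This paper analyzes and summarizes recent progress on kernel adaptive filters and future research directions. KAF-based online time series prediction methods handle prediction and tracking problems well. The paper first outlines the basic models of three classes of kernel adaptive filters: the kernel least-mean-square algorithm, the kernel recursive least-squares algorithm, and the kernel affine projection algorithm (KAPA). On this basis, starting from the content and mechanism of online prediction with kernel adaptive filters, it surveys KAF-based online time series prediction methods. Finally, it presents potential research directions and development trends in the field and looks ahead to future challenges.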
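
Of the three filter families named in this abstract, the kernel least-mean-square algorithm is the simplest to sketch: each new sample becomes a kernel center whose coefficient is the step size times the prediction error at that sample. The code below is a minimal textbook-style sketch, not taken from the surveyed paper; the Gaussian kernel, the parameter values, the class name, and the sine-wave demo are assumptions, and the growing dictionary is left unpruned.

```python
import math


class KernelLMS:
    """Minimal kernel least-mean-square filter with a Gaussian kernel."""

    def __init__(self, step=0.5, gamma=1.0):
        self.step = step      # learning rate (assumed value)
        self.gamma = gamma    # Gaussian kernel width parameter (assumed value)
        self.centers = []     # stored input vectors
        self.coeffs = []      # one coefficient per stored input

    def _kernel(self, x, y):
        return math.exp(-self.gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

    def predict(self, x):
        return sum(c * self._kernel(u, x) for u, c in zip(self.centers, self.coeffs))

    def update(self, x, target):
        """Predict, then add x as a new center weighted by step * error."""
        error = target - self.predict(x)
        self.centers.append(tuple(x))
        self.coeffs.append(self.step * error)
        return error


# One-step-ahead online prediction of a sine wave from its two previous values.
model = KernelLMS(step=0.5, gamma=1.0)
series = [math.sin(0.3 * t) for t in range(200)]
errors = []
for t in range(2, len(series)):
    errors.append(abs(model.update((series[t - 2], series[t - 1]), series[t])))
print(sum(errors[:20]) / 20, sum(errors[-20:]) / 20)
```

Because every sample is stored, the cost of each prediction grows with the number of samples seen; practical variants add a sparsification rule to bound the dictionary.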

8.
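In a pattern recognition system, different training samples contribute differently to building the classification model. Previous methods for predicting protein association structure randomly select part of the sample set as training samples for the classifier, which lowers the classifier's prediction accuracy. To reduce the adverse effect of training samples on prediction accuracy, this paper proposes a protein association structure prediction method based on sample selection and a BP neural network. The method encodes attributes related to protein association structure and uses a sample selection technique to pick a set of high-quality samples from the encoded sample set for building the prediction model, so that protein association structure is predicted effectively. Following the proposed encoding, 200 proteins obtained from the protein database PDB are encoded, training samples are then selected with the nearest-neighbor algorithm, and a BP neural network is used to build the corresponding prediction model. Experimental results show that training sample selection effectively improves the prediction accuracy for protein association structure.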

9.
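Effective analysis of protein families is an important challenge in bioinformatics, and clustering has become one of the main approaches to it. Defining similarity between protein sequences by traditional sequence alignment assumes adjacency conservation between homologous segments, which conflicts with genetic recombination. To identify protein families better, a protein sequence family mining algorithm, ProFaM, is proposed. ProFaM first mines patterns that characterize protein sequences using a prefix-projection strategy, then constructs a similarity measure from the patterns and their weights, and clusters protein sequence families with the shared nearest neighbor method. This addresses shortcomings of earlier methods in protein pattern mining and similarity design. Experiments on the protein family database Pfam confirm that ProFaM gives good results in protein family analysis.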

10.
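Based on the amino acid composition of proteins, three geometric distances (the Euclidean distance, the Minkowski distance, and a generalized distance) are used with the nearest-neighbor algorithm to predict protein subcellular localization. The results show that the method is novel, simple, and effective.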
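
A minimal sketch of nearest-neighbor prediction on amino acid composition is given below. It covers the Euclidean and Minkowski distances only (Euclidean is the Minkowski distance with p = 2); the abstract does not define its "generalized distance", so that one is left out. The function names, training sequences, and location labels are hypothetical toy values.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """20-dimensional vector of amino acid frequencies in `seq`."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(AMINO_ACIDS)


def minkowski(u, v, p=2):
    """Minkowski distance of order p; p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def nearest_label(train, seq, p=2):
    """Label of the training sequence whose composition is closest to that of `seq`.

    `train` is a list of (sequence, label) pairs.
    """
    query = composition(seq)
    nearest = min(train, key=lambda item: minkowski(composition(item[0]), query, p))
    return nearest[1]


# Toy sequences and labels, for illustration only.
train = [("KKKKRR", "nucleus"), ("LLLLVV", "membrane")]
print(nearest_label(train, "KKRL", p=2))
```

Changing `p` switches the distance without touching the rest of the classifier, which makes it easy to compare distance choices on the same data.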

11.
Analyses of the primary sequence of hemoglobin-binding protein HgbA from Actinobacillus pleuropneumoniae by comparative modelling and by a Hidden Markov Model identified its topological similarities to bacterial outer membrane receptors BtuB, FepA, FhuA, and FecA of Escherichia coli. The HgbA model has a globular N-terminal cork domain contained within a 22-stranded beta barrel domain, its folds being similar to the structures of outer membrane receptors that have been solved by X-ray crystallography. The barrel domain of the HgbA model superimposes onto the barrel domains of the four outer membrane receptors with rmsd values less than 1.0 Å. This feature is consistent with a phylogenetic tree which indicated clustering of polypeptide sequences for three barrel domains. Furthermore, the HgbA model shares the highest structural similarity to BtuB, with the modelled HgbA barrel having approximately the same elliptical cross-section and height as that of BtuB. Extracellular loop regions of HgbA are predicted to be more extended than those of the E. coli outer membrane receptors, potentially facilitating a protein-protein interface with hemoglobin. Fold recognition modelling of the HgbA loop regions showed that 10 out of 11 predicted loops are highly homologous to known structures of protein loops that contribute to heme/iron or protein-protein interactions. Strikingly, HgbA loop 2 has structural homology to a loop in bovine endothelial nitric oxide synthase that is proximal to a heme-binding site; and HgbA loop 7 contains a histidine residue conserved in a motif that is involved in heme/hemoglobin interactions. These findings implicate HgbA loops 2 and 7 in recognition and binding of hemoglobin or the heme ligand.

12.
This paper is in the area of membrane proteins. Membrane proteins make up about 75% of possible targets for novel drug discovery. However, membrane proteins are one of the most understudied groups of proteins in biochemical research because of the technical difficulties of obtaining structural information about transmembrane (TM) regions or domains. Structural determination of TM regions is an important priority in the pharmaceutical industry, as it paves the way for structure-based drug design. This research presents a novel evolutionary support vector machine (SVM) based alpha-helix transmembrane region prediction algorithm to identify the membrane helices in amino acid sequences. The SVM-genetic algorithm (GA) methodology is based on the optimisation of sliding window size, evolutionary encoding selection and SVM parameter optimisation. In this research, average hydrophobicity and propensity based on skew statistics are used to encode the one-letter representation of amino acid sequence datasets. The computer simulation results demonstrate that the proposed SVM-GA methodology performs better than most conventional techniques, producing an accuracy of 86.71% for cross-validation and 86.43% for jack-knife for randomly selected proteins containing single and multiple transmembrane regions. Furthermore, for the amino acid sequence 3LVG, the proposed SVM-GA produces better alpha-helix region identification than PRED-TMR2, MEMSATSVM/MEMSAT3 and PSIPRED V3.0.

13.
The gastric pathogen Helicobacter pylori causes a spectrum of gastro-duodenal diseases, which may be mediated in part by the outer membrane vesicles (OMVs) constitutively shed by the pathogen. We aimed to determine the proteome of H. pylori OMV to help evaluate the mechanisms whereby these structures confer their known immuno-modulatory and cytotoxic activities to host cells, as such disease-associated activities are also conferred by the bacterium from which the vesicles are derived. We also evaluated the effect of the OMV on gastric/colonic epithelial cells, duodenal explants and neutrophils. A proteomic analysis of the OMV proteins separated by SDS-PAGE from two strains of H. pylori (J99 and NCTC 11637) was undertaken and 162 OMV-associated proteins were identified in J99 and 91 in NCTC 11637 by LC-MS/MS. The vesicles are rich in membrane proteins, porins, adhesins and several molecules known to modulate chemokine secretion, cell proliferation and other host cellular processes. Further, the OMVs are also vehicles for the carriage of the cytotoxin-associated gene A cytotoxin in addition to the previously documented toxin, vacuolating cytotoxin. Taken together, it is evident from the proteome of H. pylori OMV that these structures are equipped with the molecules required to interact with host cells in a manner not dissimilar from the intact pathogen.

14.
We describe a simple approach for finding identical amino acid clusters on the outer surface of α-helical coiled-coil proteins by examining the sequence of amino acids that compose the protein. Finding such similarities is an important immunological problem, since these may correspond to cross-reactive epitopes, i.e., sites at which antibodies produced against one protein also bind to another conformationally similar protein. Because of the regularities inherent in a coiled-coil structure, the position of each amino acid on the structure is predicted. Based on this prediction, our algorithm finds similarities on the outer surface of the proteins. The matches found by our algorithm serve as an important screening process, intended to indicate which experiments to conduct to determine sites that correspond to cross-reactive epitopes. The location of several cross-reactive epitopes between M proteins and myosins had been verified experimentally. Although our approach makes many simplifying assumptions, these epitopes always correspond to clusters of identical amino acids, which our algorithm predicted to be contiguous on the outer surface. Our algorithm runs in O(n+m+r) time and O(n+m) space, where n and m are the lengths of the protein sequences, and r is the number of matching amino acids that appear in the same structural position of the α-helix in both sequences. Received June 7, 1997; revised March 23, 1998.

15.
Neisseria meningitidis, one of the principal causes of bacterial meningitis and septicemia, continues to present a challenge for vaccine developers. While significant progress has been made in the development and implementation of conjugate vaccines, which are based on the capsular polysaccharide of the organism, this approach has failed to produce a vaccine against organisms expressing a serogroup B capsule. The completion of the first meningococcal genome sequences in 2000 provided new ways of meeting this challenge. One approach has been to learn more about meningococcal biology and pathogenesis through exploring its proteome. This article reviews the results of ten recent studies of the meningococcal proteome and compares the different methodologies employed. Not surprisingly, given the renewed impetus to develop a comprehensive vaccine and the continuing clinical development of outer membrane vesicle vaccines, many of these studies focus on the proteome of the outer membrane fraction. As in other areas of proteome research, the direct comparison of data from different studies is hampered by the lack of standardization of separation technologies and data formats. Nevertheless, proteomic analysis, especially when combined with detailed knowledge of meningococcal population structures, represents a powerful tool in the development of vaccines against this important pathogen.

16.
This paper proposes a novel algorithm that characterizes the robust capture basin and the discriminating kernel for constrained nonlinear systems with uncertainties based on viability theory. For nonlinear systems with constrained inputs and bounded uncertainties, the viability kernel is the largest set of states possessing a possibility to be viable in a set, and the capture basin is the largest set of states possessing a possibility to reach a target in a finite time while keeping viable in a set before reaching the target. However, in viability theory, both control and uncertainty in a parameterized system are considered as parameters: the discriminating kernel and the proposed robust capture basin link viability theory with robust control by taking both control and uncertainties into account. For constrained uncertain nonlinear systems, the discriminating kernel is the largest set of states that is robust invariant in a set with proper control, and the robust capture basin is the largest set of states reaching their target in finite time with proper control despite uncertainties and keeping viable in a set before reaching the target. Furthermore, we map all the states to optimal regulatory control such that the systems are regulated by a regulation map. To compute the robust capture basin and the discriminating kernel, we use interval methods to provide guaranteed solutions. The proposed algorithms compute an outer approximation of the minimum reachable target and inner approximations of the robust capture basin and the discriminating kernel in a guaranteed way.

17.
Spirochetes are a unique group of bacteria that include several motile and highly invasive pathogens that cause a multitude of acute and chronic disease processes. Nine genomes of spirochetes have been completed, which provide significant insights into pathogenic mechanisms of disease and reflect an often complex lifestyle associated with a wide range of environmental and host factors encountered during disease transmission and infection. Characterization of the outer membrane of spirochetes is of particular interest since it interacts directly with the host and environs during disease and likely contains candidate vaccinogens and diagnostics. In concert with appropriate fractionation techniques, the tools of proteomics have rapidly evolved to characterize the proteome of spirochetes. Of greater significance, studies have confirmed the differential expression of many proteins, including those of the outer membrane, in response to environmental signals encountered during disease transmission and infection. Characterization of the proteome in response to such signals provides novel insights into the pathogenic mechanisms of spirochetes.

18.
Irregular spike sequences of the cerebral cortex in vivo have been observed in numerous previous studies. These spike sequences generally differ from an entirely random sequence, and exhibit temporal correlations. There are at least two possible sources producing the temporal correlations: (1) temporal correlations of the incoming synaptic inputs; (2) a neuronal integration mechanism. The temporal correlation of the neuronal output is produced by (1) or (2) or a mixture of these. In this article, we propose an algorithm that distinguishes the sources of the temporal correlations of spike sequences. The statistical characteristics of the spike sequences play a key role in this algorithm, which helps to classify the spike sequences by discriminating between their sources of temporal correlations. This work was presented in part at the 11th International Symposium on Artificial Life and Robotics, Oita, Japan, January 23–25, 2006.

19.
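It is well known that studying the types of unknown membrane proteins provides useful clues for basic research and drug discovery. In the post-genomic era, with the rapid growth in the number of protein sequences, determining membrane protein types experimentally is too expensive and time-consuming, so a computational method that can automatically identify likely membrane proteins has become important. Against this background, the DC (dipeptide composition) method has been used to represent protein sequences, with good prediction results. However, the features produced by this representation are high-dimensional and highly redundant, which makes the prediction system very complex. To address this problem, this paper uses the nonlinear dimensionality reduction algorithm KPCA (kernel principal component analysis) to simplify the system by extracting important low-dimensional features from the high-dimensional DC feature space, and uses a K-NN (K-nearest neighbor) classifier to predict membrane protein types from the reduced low-dimensional features. Experimental results show that using KPCA to predict membrane protein types is very effective.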

20.
A theoretical development of a novel approach for target tracking based on multiple patterns extracted from measurement sequences is presented in this paper. The introduction of patterns leads to a new paradigm for developing high-performance algorithms. An interacting multi-pattern probabilistic data association (IMP-PDA) algorithm is developed, taking advantage of a clever formulation of the interacting multiple model approach. The IMP-PDA algorithm employs distance, directional and maneuver information for data association, which significantly enhances the capability of discriminating correct measurements from false measurements.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号