Similar literature
20 similar documents found
1.
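A classifier ensemble method for T-cell epitope prediction   Cited by: 1 (self-citations: 1, citations by others: 0)
T-cell epitope prediction is important for reducing the number of overlapping peptides that must be synthesized experimentally, for understanding the specificity of T-cell-mediated immunity, and for developing subunit peptide vaccines and gene vaccines. To address the limited interpretability of existing machine-learning models for T-cell epitope prediction and to further improve prediction accuracy, this work first preprocesses peptides to build a decision table that stores equal-length peptide segments, and then proposes a rough-set-based classifier ensemble algorithm. The algorithm combines the strengths of a complete attribute reduction algorithm based on information entropy with those of other attribute reduction algorithms, and it incorporates anchor knowledge from the T-cell epitope prediction domain into the attribute value reduction process. Finally, the algorithm is used to predict peptides that bind the MHC class II molecule HLA-DR4 (B1*0401). For the first time, production rules are extracted that achieve high prediction accuracy and help experts understand the binding mechanism between MHC molecules and antigenic peptides, laying a foundation for subsequent molecular modeling work.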

2.
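Zeng An, Pan Dan, Zheng Qilun, Peng Hong. 《计算机科学》 (Computer Science), 2007, 34(6): 226-230
T-cell epitope prediction is important for reducing the number of overlapping peptides synthesized experimentally, for studying the immune mechanisms by which pathogens interact with the host, and for a deeper understanding of the specificity of T-cell-mediated immunity. To make T-cell epitope prediction models more interpretable, this paper preprocesses peptides to build a decision table storing equal-length peptide segments and then designs a rough-set-based T-cell epitope prediction method. The method consists of a complete attribute reduction algorithm based on information entropy and an improved sequential attribute value reduction algorithm based on anchor knowledge. Experimental data on peptides binding the MHC class II molecule encoded by HLA-DR4 (B1*0401) show that, with prediction accuracy roughly on par with conventional neural network methods, the method can extract production rules that help experts understand the binding mechanism between MHC molecules and antigenic peptides.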
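As a rough illustration of entropy-based attribute reduction on a peptide decision table, the sketch below computes the conditional entropy of the decision given a subset of attributes; an attribute subset that keeps this value as low as the full attribute set is a candidate reduct. This is a minimal sketch under stated assumptions, not the algorithm of the paper: the function name `conditional_entropy`, the column names `p1`, `p4`, `binds`, and the toy rows are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, attributes, decision):
    """H(decision | attributes) over a decision table given as a list of dicts."""
    # Group rows by their values on the chosen attributes and
    # count the decision classes inside each group.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in attributes)
        groups[key][row[decision]] += 1

    total = len(rows)
    entropy = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for count in counts.values():
            # Weight each group's class entropy by the group's share of rows.
            entropy -= (size / total) * (count / size) * log2(count / size)
    return entropy


# Hypothetical toy table: residue at peptide positions 1 and 4, plus a binder flag.
rows = [
    {"p1": "L", "p4": "A", "binds": 1},
    {"p1": "L", "p4": "G", "binds": 1},
    {"p1": "K", "p4": "A", "binds": 0},
    {"p1": "K", "p4": "G", "binds": 0},
]
h_p1 = conditional_entropy(rows, ["p1"], "binds")
h_p4 = conditional_entropy(rows, ["p4"], "binds")
```

In this toy table position 1 alone determines the decision, so it leaves no residual uncertainty, while position 4 alone is uninformative.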

3.
Strategies for selecting informative data points for training prediction algorithms are important, particularly when data points are difficult and costly to obtain. A Query by Committee (QBC) training strategy uses the disagreement within a committee of different algorithms to suggest the new data points that most rationally complement the existing data, that is, the most informative data points. In order to evaluate this QBC approach on a real-world problem, we compared strategies for selecting new data points. We trained neural network algorithms to obtain methods to predict the binding affinity of peptides binding to the MHC class I molecule HLA-A2. We show that the QBC strategy leads to a higher performance than a baseline strategy where new data points are selected at random from a pool of available data. Most peptides bind HLA-A2 with a low affinity, and, as expected, a strategy of selecting peptides that are predicted to have high binding affinities also leads to more accurate predictors than the baseline strategy. The QBC value is shown to correlate with the measured binding affinity. This demonstrates that the different predictors can easily learn if a peptide will fail to bind, but often conflict in predicting if a peptide binds. Using a carefully constructed computational setup, we demonstrate that selecting peptides with a high QBC value performs better than selecting peptides with a low QBC value, independently of binding affinity. When predictors are trained on a very limited set of data they cannot be expected to disagree in a meaningful way, and we find a data limit below which the QBC strategy fails. Finally, it should be noted that data selection strategies similar to those used here might be of use in other settings in which generation of more data is a costly process.
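The QBC selection step can be sketched as ranking pool candidates by how much the committee members disagree. The sketch below is illustrative only and makes several assumptions: disagreement is measured as the population standard deviation of the members' predictions (the abstract does not say how the QBC value is computed), and the three committee members are hypothetical toy rules rather than trained neural networks.

```python
import statistics


def rank_by_disagreement(committee, pool):
    """Return (disagreement, candidate) pairs, most contested candidate first."""
    scored = []
    for candidate in pool:
        predictions = [predict(candidate) for predict in committee]
        # Assumption: disagreement = spread of the committee's predictions.
        scored.append((statistics.pstdev(predictions), candidate))
    # sorted() is stable, so ties keep their original pool order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical toy committee: each member maps a 9-mer to a score in [0, 1].
committee = [
    lambda p: 1.0 if p[1] == "L" else 0.0,
    lambda p: 1.0 if p[-1] == "V" else 0.0,
    lambda p: 1.0 if p[1] == "L" and p[-1] == "V" else 0.0,
]
pool = ["ALAKAAAAV", "AAAKAAAAA", "ALAKAAAAA"]
ranked = rank_by_disagreement(committee, pool)
next_to_measure = ranked[0][1]
```

Here the members agree on the first two peptides and split on the third, so the third would be the next one sent for measurement.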

4.
With its implications for vaccine discovery, the accurate prediction of T cell epitopes is one of the key aspirations of computational vaccinology. We have developed a robust multivariate statistical method, based on partial least squares, for the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the principal checkpoint on the antigen presentation pathway. As a service to the immunobiology community, we have made a Perl implementation of the method available via a World Wide Web server. We call this server MHCPred. Access to the server is freely available from the URL: http://www.jenner.ac.uk/MHCPred. We have exemplified our method with a model for peptides binding to the common human MHC molecule HLA-B*3501.

5.
6.
Peptide-major histocompatibility complex (MHC) binding is an important prerequisite event and has immediate consequences for the immune response. Peptides that bind to MHC molecules can activate T-cell immunity, and they are useful for understanding the immune mechanism and developing vaccines for diseases. Recently, researchers have become interested in predicting binding affinity instead of classifying peptides as binders or non-binders. In this paper, we use the sparse Bayesian regression algorithm proposed by Tipping [M.E. Tipping, Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. (2001)] to derive position-specific scoring matrices from allele-related peptides, and develop models for the prediction of MHC-II binding affinity. We explore the impact of peptide length and flanking residue length on binding affinity, and incorporate these factors into our models to enhance prediction performance. When applied to datasets from the AntiJen and IEDB databases, our method performs better than several popular quantitative methods.
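To make the role of a position-specific scoring matrix concrete, the sketch below scores every window of a peptide against a matrix and keeps the best-scoring window as the putative binding core. It is a minimal sketch under assumptions: the matrix values are invented, the width of 3 is chosen only to keep the example short, and taking the maximum over windows is a common simplification that the abstract does not confirm; the length and flanking-residue terms mentioned above are not modeled.

```python
def score_core(pssm, core):
    """Sum the per-position weights of a candidate core; unknown residues score 0."""
    return sum(pssm[i].get(aa, 0.0) for i, aa in enumerate(core))


def best_core(pssm, peptide):
    """Return (score, core) for the highest-scoring window of len(pssm) residues."""
    width = len(pssm)
    if len(peptide) < width:
        raise ValueError("peptide is shorter than the scoring matrix")
    windows = [peptide[i:i + width] for i in range(len(peptide) - width + 1)]
    return max(((score_core(pssm, w), w) for w in windows), key=lambda t: t[0])


# Hypothetical 3-position matrix: one dict of residue weights per core position.
toy_pssm = [{"Y": 1.0}, {"A": 0.5}, {"L": 2.0}]
score, core = best_core(toy_pssm, "GYALK")
```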

7.
Peptides that induce and recall T-cell responses are called T-cell epitopes. T-cell epitopes may be useful in a subunit vaccine against malaria. Computer models that simulate peptide binding to MHC are useful for selecting candidate T-cell epitopes since they minimize the number of experiments required for their identification. We applied a combination of computational and immunological strategies to select candidate T-cell epitopes. A total of 86 experimental binding assays were performed in three rounds of identification of HLA-A11 binding peptides from the six preerythrocytic malaria antigens. Thirty-six peptides were experimentally confirmed as binders. We show that the cyclical refinement of the ANN models results in a significant improvement of the efficiency of identifying potential T-cell epitopes.

8.
HLA class I molecules present peptides on the cell surface to CD8(+) T cells. The repertoire of peptides that associate with class I molecules represents the cellular proteome. Therefore, cells expressing different proteomes could generate different class I-associated peptide repertoires. A large number of peptides have been sequenced from HLA class I alleles, mostly from lymphoid cells. On the other hand, T cell immunotherapy is a goal in the fight against cancer, but the identification of T cell epitopes is a laborious task. Proteomic techniques allow the definition of putative T cell epitopes by the identification of HLA natural ligands in tumor cells. In this study, we have compared the HLA class I-associated peptide repertoire from the hepatocellular carcinoma (HCC) cell line SK-Hep-1 with that previously described from lymphoid cells. The analysis of the peptide pool confirmed that, as expected, the peptides from SK-Hep-1 derive from proteins localized in the same compartments as in lymphoid cells. Within this pool, we have identified 12 HLA class I peptides derived from HCC-related proteins. This confirms that tumor cell lines could be a good source of tumor-associated antigens to be used, together with MS, to define putative epitopes for cytotoxic T cells from cancer patients.

9.
Peptide-MHC binding is an important prerequisite event and has immediate consequences for the immune response. Peptides that bind to MHC molecules can activate T-cell immunity, and they are useful for understanding the immune mechanism and developing vaccines for diseases. Accurate prediction of the binding between peptides and MHC-II molecules has long been a challenge in bioinformatics. Recently, instead of classifying peptides as binders or non-binders, researchers have become more interested in predicting peptide binding affinities directly. In this paper, we investigate the use of the relevance vector machine to quantitatively predict the binding affinities between MHC-II molecules and peptides. In our approach, a new encoding scheme is used to generate the input vectors, and the relevance vector machine is then used to develop prediction models on the basis of binding cores, which are recognized in an iterative self-consistent way. When applied to three MHC-II molecules, DRB1*0101, DRB1*0401 and DRB1*1501, our method produces consistently better performance than several popular quantitative methods in terms of cross-validated squared error, cross-validated correlation coefficient, and area under the ROC curve. All the evidence indicates that our method is an effective tool for MHC-II binding affinity prediction.

10.
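Predicting peptides that bind MHC class II molecules is very important for immunological research and vaccine design, but factors such as the variable length of the binding peptides make prediction extremely difficult. This paper proposes an MHC class II binding peptide prediction algorithm based on a generalized selective neural network ensemble, which is a two-layer ensemble model. The first layer uses a differential evolution algorithm to generate an initial pool of neural network ensembles; the second layer selects a subset of that pool to form the final neural network ensemble. Experimental results show that the generalized selective neural network ensemble generalizes better than conventional selective neural networks.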

11.
Human pituitary tumor-transforming gene (PTTG) plays an essential role in the development and progression of pediatric acute lymphoblastic leukemia (pALL). PTTG has two SH3-binding peptide motifs that can be recognized by a variety of SH3-containing proteins in pALL through peptide-mediated interactions. In this study, the gene expression profile of pALL was examined in detail by integrating computational modeling and experimental assays, aiming to identify potential partner proteins of human PTTG. The binding potency of domain candidates to peptide motifs was ranked using knowledge-based scoring and fluorescence titration. A number of SH3 domains found in a variety of pALL proteins were identified as potent binders with moderate or high affinity for PTTG. The PTTG peptide motifs show different affinity profiles for various candidate proteins, indicating that PTTG selectivity is optimized across the pALL gene expression space. The PTTG peptides were then mutated rationally to target the SH3 domains of the identified partner proteins by competing with the native peptide motifs.

12.
The insulin-like growth factor-1 receptor (IGF-1R) plays a key role in the proliferation, growth, differentiation, and development of several human malignancies, including breast and pancreatic adenocarcinoma. IGF-1R targeted immunotherapeutic approaches are particularly attractive, as they may potentially elicit even stronger antitumor responses than traditional targeted approaches. Cancer peptide vaccines can produce immunologic responses against cancer cells by triggering helper T cells (Th) or cytotoxic T cells (CTL) in association with Major Histocompatibility Complex (MHC) class I or II molecules on the cell surface of antigen presenting cells. In our previous study, we established a technique based on molecular docking in order to find the best MHC class I and II binder peptides using GOLD. In the present work, molecular docking analyses on a library consisting of 30 peptides mimicking discontinuous epitopes from the IGF-1R extracellular domain identified peptides 249 and 86 as the best binders to both MHC class I and II molecules. The receptors most often targeted by peptide 249 are HLA-DR4, HLA-DR3 and HLA-DR2, and those most often targeted by peptide 86 are HLA-DR4, HLA-DP2 and HLA-DR3. These findings, based on bioinformatics analyses, can inform further experimental analyses in cancer therapy and vaccine design.

13.
14.
Human islet amyloid polypeptide (hIAPP) is a natively unfolded polypeptide hormone of glucose metabolism, which is co-secreted with insulin by the β-cells of the pancreas. In patients with type 2 diabetes, IAPP forms amyloid fibrils because of diabetes-associated β-cell dysfunction, and increasing fibrillation, in turn, leads to failure of the secretory function of β-cells. This provides a target for the discovery of small organic molecules against protein aggregation diseases. However, the binding mechanism of these molecules with monomers, oligomers and fibrils to inhibit fibrillation is still an open question. In this work, ligand-based and structure-based in silico approaches were used to identify novel fibrillation inhibitors and/or fibril binding compounds. The best pharmacophore model was used as a 3D search query for virtual screening of a compound database to identify novel molecules having the potential to be therapeutic agents against protein aggregation diseases. Docking and molecular dynamics simulation studies were used to explore the interaction pattern and mechanism of the identified novel small molecules with the predicted hIAPP structure, its aggregation-prone conformation and its fibril-forming segments. We show that catechins with a galloyl group and molecules having two to three planar apolar rings bind to hIAPP structures and fibril-forming segments with greater affinity. The differences in binding affinities of different compounds against several fibril-forming segments of the peptide suggest that a mixture of active compounds may be required for treatment of aggregation diseases.

15.
Research on peptide classification problems has focused mainly on the study of different encodings and the application of several classification algorithms to achieve improved prediction accuracies. The main drawback of the literature is the lack of an extensive comparison among the available encoding methods on a wide range of classification problems. This paper addresses the fundamental issue of which peptide encoding promises the best results for machine learning classifiers. Two novel encoding methods based on physicochemical properties of the amino acids are proposed, and an extensive comparison with several standard encoding methods is performed on three different classification problems (HIV-protease, recognition of T-cell epitopes and prediction of peptides that bind human leukocyte antigens). The experimental results demonstrate the effectiveness of the new encodings and show that the frequently used orthonormal encoding is inferior to other methods.
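For reference, the orthonormal encoding mentioned above is usually understood as a one-hot scheme in which each residue becomes a 20-dimensional unit vector, so that any two different amino acids are equally distant. The sketch below is a minimal illustration; the alphabet ordering and the function name `orthonormal_encode` are assumptions, and the paper's two physicochemical encodings are not reproduced here.

```python
# The 20 standard amino acids in one-letter code (ordering is an arbitrary choice).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def orthonormal_encode(peptide):
    """Concatenate one 20-dimensional unit vector per residue."""
    vector = []
    for aa in peptide:
        unit = [0] * len(AMINO_ACIDS)
        # str.index raises ValueError for residues outside the alphabet.
        unit[AMINO_ACIDS.index(aa)] = 1
        vector.extend(unit)
    return vector


encoded = orthonormal_encode("AC")
```

A peptide of length n therefore becomes a sparse vector of 20n entries with exactly n ones, which carries no information about how similar two residues are chemically.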

16.
Determination of potential drug toxicity and side effects in the early stages of drug development is important in reducing the cost and time of drug discovery. In this work, we explore a computer method for predicting potential toxicity and side-effect protein targets of a small molecule. A ligand-protein inverse docking approach is used for computer-automated search of a protein cavity database to identify protein targets. This database is developed from protein 3D structures in the Protein Data Bank (PDB). Docking is conducted by a procedure involving multiple-conformer shape-matching alignment of a molecule to a cavity, followed by molecular-mechanics torsion optimization and energy minimization on both the molecule and the protein residues at the binding region. Potential protein targets are selected by evaluation of molecular mechanics energy and, where applicable, further analysis of binding competitiveness against other ligands that bind to the same receptor site in at least one PDB entry. Our results on several drugs show that 83% of the experimentally known toxicity and side-effect targets for these drugs are predicted. The computer search successfully predicted 38 and missed five experimentally confirmed or implicated protein targets with available structure and in which binding involves no covalent bond. There are an additional 30 predicted targets yet to be validated experimentally. Application of this computer approach can potentially facilitate the prediction of toxicity and side effects of a drug or drug lead.

17.
Peptide binding to the Major Histocompatibility Complex (MHC) is a prerequisite for any T cell-mediated immune response. Predicting which peptides can bind to a specific MHC molecule is indispensable for minimizing the number of peptides that need to be synthesized, for the development of vaccines and cancer immunotherapy, and for understanding the specificity of T-cell mediated immunity. At present, although predictions based on machine learning methods perform well, they cannot acquire understandable knowledge, and their prediction performance can be further improved. The Rule Sets ENsemble (RSEN) algorithm, which takes advantage of diverse attribute and attribute value reduction algorithms based on rough set (RS) theory, is therefore proposed as an initial attempt to acquire understandable rules while also enhancing prediction performance. Finally, the RSEN is applied to predict the peptides that bind to HLA-DR4 (B1*0401). Experimental results show: (1) propositional rules for predicting the peptides that bind to HLA-DR4 (B1*0401) are obtained; (2) compared with individual RS-based algorithms, the RSEN has a significant decrease (13%–38%) in prediction error rate; (3) compared with Back-Propagation Neural Networks (BPNN), the prediction error rate of the RSEN decreases by 4%–16%. The acquired rules have been applied to help experts carry out molecular modeling.
An Zeng received the Ph.D. degree in computer applications technology from South China University of Technology in 2005. She is now a lecturer at the Faculty of Computer of Guangdong University of Technology. Her research interests are data mining, bioinformatics, neural networks, artificial intelligence, and computational immunology. In these areas she has published over 20 technical papers in various prestigious journals or conference proceedings. She is a member of the IEEE. Contact her at the Faculty of Computer, Guangdong Univ. of Technology, University Town, PanYu District, Guangzhou, 510006, P.R. China.
Dan Pan received the Ph.D. degree in circuits and systems from South China University of Technology in 2001. He is at present a senior engineer at Guangdong Mobile Communication Co. Ltd. His research interests are data mining, machine learning, bioinformatics, and data warehousing, and applications of business modeling and software engineering to computer-aided business operations systems, especially in the telecom industry. In these areas he has published over 30 technical papers in refereed journals or conference proceedings. As a member of the International Association of Science and Technology for Development (IASTED) technical committee on artificial intelligence and expert systems, he has served a number of conferences and publications. He is a member of the IEEE. Contact him at Guangdong Mobile Communication Co. Ltd., 208 Yuexiu South Rd., Guangzhou, 510100, P.R. China.
Jian-bin He received the M.E. in computer science from South China University of Technology in 2002. He is now a data mining consultant at the Teradata division of NCR (China), supporting telecom carriers in data mining in data warehouses for market research. His research interests include statistical learning, semi-supervised learning, spectral clustering, multi-relational data mining and their application to social science. Contact him at NCR (China) Co. Ltd., Unit 2306, Tower B, Center Plaza, 161 Linhexi Road, Guangzhou, 510620, P.R. China.

18.
We have designed a small, focused combinatorial library of hexapeptide inhibitors of the NS3 serine protease of the hepatitis C virus (HCV) by structure-based molecular design complemented by combinatorial optimisation of the individual residues. Rational residue substitutions were guided by the structure and properties of the binding pockets of the enzyme's active site. The inhibitors were derived from peptides known to inhibit the NS3 serine protease by using unusual amino acids and alpha-ketocysteine or difluoroaminobutyric acid, which are known to bind to the S1 pocket of the catalytic site. Inhibition constants (Ki) of the designed library of inhibitors were predicted from a QSAR model that correlated experimental Ki of known peptidic inhibitors of NS3 with the enthalpies of enzyme-inhibitor interaction computed via molecular mechanics and the solvent effect contribution to the binding affinity derived from the continuum model of solvation. The library of optimised inhibitors contains promising drug candidates: water-soluble anionic hexapeptides with predicted Ki* in the picomolar range.

19.
Biochemical research often involves examining structural relationships in molecules since scientists strongly believe in the causal relationship between structure and function. Traditionally, researchers have identified these patterns, or motifs, manually using domain expertise. However, with the massive influx of new biochemical data and the ability to gather data for very large molecules, there is great need for techniques that automatically and efficiently identify commonly occurring structural patterns in molecules. Previous automated substructure discovery approaches have each introduced variations of similar underlying techniques and have embedded domain knowledge. While doing so improves performance for the particular domain, this complicates extensibility to other domains. Also, they do not address scalability or noise, which is critical for macromolecules such as proteins. In this paper, we present MotifMiner, a general framework for efficiently identifying common motifs in most scientific molecular datasets. The approach combines structure-based frequent-pattern discovery with search space reduction and coordinate noise handling. We describe both the framework and several algorithms as well as demonstrate the flexibility of our system by analyzing protein and drug biochemical datasets.

20.
We describe a simple approach for finding identical amino acid clusters on the outer surface of α-helical coiled-coil proteins by examining the sequence of amino acids that compose the protein. Finding such similarities is an important immunological problem, since these may correspond to cross-reactive epitopes, i.e., sites at which antibodies produced against one protein also bind to another conformationally similar protein. Because of the regularities inherent in a coiled-coil structure the position of each amino acid on the structure is predicted. Based on this prediction, our algorithm finds similarities on the outer surface of the proteins. The matches found by our algorithm serve as an important screening process, intended to indicate which experiments to conduct to determine sites that correspond to cross-reactive epitopes. The location of several cross-reactive epitopes between M proteins and myosins had been verified experimentally. Although our approach makes many simplifying assumptions, these epitopes always correspond to clusters of identical amino acids, which our algorithm predicted to be contiguous on the outer surface. Our algorithm runs in O(n+m+r) time and O(n+m) space, where n and m are the lengths of the protein sequences, and r is the number of matching amino acids that appear in the same structural position of the α-helix in both sequences. Received June 7, 1997; revised March 23, 1998.
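The O(n+m+r) bound is what one would expect from a bucketing approach, which can be sketched as follows: index the first sequence by (amino acid, structural position), then scan the second sequence and emit one pair per match. This is only a sketch of the matching step under assumptions, not the authors' algorithm: the structural position is approximated here as the index modulo 7 (the heptad repeat of a coiled coil), the `offset` parameters for the heptad register are hypothetical, and the later grouping of matches into surface clusters is not shown.

```python
from collections import defaultdict


def same_position_matches(seq_a, seq_b, offset_a=0, offset_b=0):
    """Pairs (i, j) where seq_a[i] == seq_b[j] and both share a heptad position."""
    # O(n): bucket the first sequence by (residue, heptad position).
    buckets = defaultdict(list)
    for i, aa in enumerate(seq_a):
        buckets[(aa, (i + offset_a) % 7)].append(i)

    # O(m + r): one lookup per residue of seq_b plus one step per reported match.
    matches = []
    for j, aa in enumerate(seq_b):
        for i in buckets.get((aa, (j + offset_b) % 7), ()):
            matches.append((i, j))
    return matches


# Hypothetical toy sequences, both assumed to start at heptad position 0.
in_register = same_position_matches("LKELEDK", "LKE")
shifted = same_position_matches("LKELEDK", "LKE", offset_b=3)
```

Shifting the register of the second sequence changes which residues are considered structurally aligned, which is why the assumed heptad assignment matters as much as the sequence itself.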
