Similar articles
 20 similar articles found; search took 15 ms
1.
Protein-protein interactions (PPIs) play essential roles in most cellular processes. Knowledge of PPIs is becoming increasingly important, which has prompted the development of technologies capable of discovering PPIs on a large scale. Although many high-throughput biological technologies have been proposed to detect PPIs, they have unavoidable shortcomings, including cost, time intensity, and inherently high false positive and false negative rates. For these reasons, in silico methods are attracting much attention due to their good performance in predicting PPIs. In this paper, we propose a novel computational method known as RVM-AB that combines the Relevance Vector Machine (RVM) model and Average Blocks (AB) to predict PPIs from protein sequences. The main improvements result from representing protein sequences using the AB feature representation on a Position Specific Scoring Matrix (PSSM), reducing the influence of noise using Principal Component Analysis (PCA), and using an RVM-based classifier. We performed five-fold cross-validation experiments on yeast and Helicobacter pylori datasets and achieved very high accuracies of 92.98% and 95.58%, respectively, which are significantly better than those of previous work. In addition, we obtained good prediction accuracies of 88.31%, 89.46%, 91.08%, 91.55%, and 94.81% on five other independent datasets (C. elegans, M. musculus, H. sapiens, H. pylori, and E. coli) for cross-species prediction. To further evaluate the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the yeast dataset. The experimental results demonstrate that our RVM-AB method is clearly better than the SVM-based method. The promising experimental results show the efficiency and simplicity of the proposed method, which can serve as an automatic decision support tool.
To facilitate extensive studies for future proteomics research, we developed a freely available web server called RVMAB-PPI in Hypertext Preprocessor (PHP) for predicting PPIs. The web server, including source code and datasets, is available at http://219.219.62.123:8888/ppi_ab/.
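The Average Blocks representation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the PSSM is given as a list of rows (one row per residue, one column per amino acid type) and that the sequence is split into 20 contiguous blocks; the abstract does not state the block count, and the function name `average_blocks` and its `n_blocks` parameter are hypothetical. The PCA and RVM steps are not shown.

```python
def average_blocks(pssm, n_blocks=20):
    """Average the PSSM columns within each contiguous block of residues.

    pssm     -- list of rows, one per residue; each row holds one score
                per amino acid type (typically 20 columns)
    n_blocks -- number of blocks along the sequence (assumed, not taken
                from the abstract)

    Returns a flat feature vector of length n_blocks * number_of_columns,
    which does not depend on the sequence length.
    """
    length = len(pssm)
    if length < n_blocks:
        raise ValueError("sequence is shorter than the number of blocks")
    n_cols = len(pssm[0])
    features = []
    for b in range(n_blocks):
        # Integer boundaries so that the blocks tile the whole sequence.
        start = b * length // n_blocks
        end = (b + 1) * length // n_blocks
        rows = pssm[start:end]
        for c in range(n_cols):
            features.append(sum(row[c] for row in rows) / len(rows))
    return features


# Toy PSSM: 40 residues x 20 columns, where every score in row i equals i.
toy_pssm = [[float(i)] * 20 for i in range(40)]
toy_features = average_blocks(toy_pssm)
```

With 20 blocks and 20 columns, every protein maps to a 400-dimensional vector, which is what lets sequences of different lengths feed the same classifier.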

2.
3.
De novo drug design is a computational approach that generates novel molecular structures from atomic building blocks with no a priori relationships. Conventional methods include structure-based and ligand-based design, which depend on the properties of the active site of a biological target or its known active binders, respectively. Artificial intelligence, including machine learning, is an emerging field that has positively impacted the drug discovery process. Deep reinforcement learning is a subdivision of machine learning that combines artificial neural networks with reinforcement-learning architectures. This method has successfully been employed to develop novel de novo drug design approaches using a variety of artificial networks including recurrent neural networks, convolutional neural networks, generative adversarial networks, and autoencoders. This review article summarizes advances in de novo drug design, from conventional growth algorithms to advanced machine-learning methodologies, and highlights hot topics for further development.

4.
5.
Recent advances in deep learning methods have influenced many aspects of scientific research, including the study of the protein system. The prediction of proteins’ 3D structural components is now heavily dependent on machine learning techniques that interpret how protein sequences and their homology govern the inter-residue contacts and structural organization. In particular, methods employing deep neural networks have had a significant impact on the recent CASP13 and CASP14 competitions. Here, we explore the recent applications of deep learning methods in the protein structure prediction area. We also look at the potential opportunities for deep learning methods to identify as-yet-undiscovered protein structures and functions and to help guide drug–target interactions. Although significant problems still need to be addressed, we expect these techniques to play crucial roles in protein structural bioinformatics as well as in drug discovery in the near future.

6.
7.
Streptococcus pyogenes, or group A Streptococcus (GAS), a gram-positive bacterium, is implicated in a wide range of clinical manifestations and life-threatening diseases. One of the key virulence factors of GAS is streptopain, a C10 family cysteine peptidase. Since its discovery, various homologs of streptopain have been reported from other bacterial species. With the increased affordability of sequencing, a significant increase in the number of potential C10 family-like sequences in the public databases is anticipated, posing a challenge in classifying such sequences. Sequence-similarity-based tools are the methods of choice to identify such streptopain-like sequences. However, these methods depend on some level of sequence similarity between the existing C10 family and the target sequences. Therefore, in this work, we propose a novel predictor, C10Pred, for the prediction of C10 peptidases using sequence-derived optimal features. C10Pred is a support vector machine (SVM) based model which is efficient in predicting C10 enzymes with an overall accuracy of 92.7% and Matthews’ correlation coefficient (MCC) value of 0.855 when tested on an independent dataset. We anticipate that C10Pred will serve as a handy tool to classify novel streptopain-like proteins belonging to the C10 family and offer essential information.
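For readers unfamiliar with the two metrics quoted for C10Pred, the sketch below shows how overall accuracy and the Matthews correlation coefficient are computed from a binary confusion matrix. The counts are hypothetical illustration values, not data from the study, and the function names are assumptions.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) through 0 (no better than
    chance) to +1 (perfect prediction). Returns 0.0 when any marginal
    sum is zero, a common convention for the undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion matrix (not taken from the C10Pred paper).
example_acc = accuracy(tp=45, tn=45, fp=5, fn=5)
example_mcc = matthews_cc(tp=45, tn=45, fp=5, fn=5)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ greatly in size.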

8.
Theophylline, a typical representative of active pharmaceutical ingredients, was selected to study the characteristics of experimental and theoretical solubility measured at 25 °C in a broad range of solvents, including neat solvents, binary mixtures and ternary natural deep eutectic solvents (NADES) prepared with choline chloride, polyols and water. There was a strong synergistic effect of organic solvents mixed with water, and among the experimentally studied binary systems, the one containing DMSO with water in unimolar proportions was found to be the most effective in theophylline dissolution. Likewise, for NADES, the addition of water (0.2 molar fraction) resulted in increased solubility compared to pure eutectics, with the highest solubilisation potential offered by the composition of choline chloride with glycerol. The ensemble of Statistica Automated Neural Networks (SANNs) developed using intermolecular interactions in pure systems was found to be a very accurate model for solubility computations. This machine learning protocol was also applied in an extensive screening for potential solvents with higher solubility of theophylline. Such solvents were identified in all three subgroups: neat solvents, binary mixtures and ternary NADES systems. Some methodological considerations of SANN applications for future modelling were also provided. Although the developed protocol is focused exclusively on theophylline solubility, it is of general importance and can be used for the development of predictive models suitable for solvent screening of other compounds in a variety of systems. Formulation of such a model offers rational guidance for the selection of proper candidates as solubilisers in the designed solvent screening.

9.
10.
Identifying secretory proteins from blood, saliva or other body fluids has become an effective method of diagnosing diseases. Existing secretory protein prediction methods are mainly based on conventional machine learning algorithms and are highly dependent on the feature set from the protein. In this article, we propose a deep learning model based on the capsule network and transformer architecture, SecProCT, to predict secretory proteins using only amino acid sequences. The proposed model was validated using cross-validation and achieved 0.921 and 0.892 accuracy for predicting blood-secretory proteins and saliva-secretory proteins, respectively. The proposed model was also validated on an independent test set and achieved 0.917 and 0.905 accuracy for predicting blood-secretory proteins and saliva-secretory proteins, respectively, which are better than those of conventional machine learning methods and other deep learning methods for biological sequence analysis. The main contributions of this article are as follows: (1) a deep learning model based on a capsule network and transformer architecture is proposed for predicting secretory proteins. The results of this model are better than those of existing conventional machine learning methods and deep learning methods for biological sequence analysis; (2) only amino acid sequences are used in the proposed model, which overcomes the high dependence of existing methods on the annotated protein features; (3) the proposed model can accurately predict most experimentally verified secretory proteins and cancer protein biomarkers in blood and saliva.

11.
12.
13.
Genetic variations have a multitude of effects on proteins. A substantial number of variations affect protein–solvent interactions, either aggregation or solubility. Aggregation is often related to structural alterations, whereas solubilizable proteins in the solid phase can be made soluble again by dilution. Solubility is a central protein property and when reduced can lead to diseases. We developed a prediction method, PON-Sol2, to identify amino acid substitutions that increase, decrease, or have no effect on protein solubility. The method is a machine learning tool utilizing a gradient boosting algorithm and was trained on a large dataset of variants with different outcomes after the selection of features among a large number of tested properties. The method is fast and has high performance. The normalized correct prediction rate for three states is 0.656, and the normalized GC2 score is 0.312 in 10-fold cross-validation. The corresponding numbers in the blind test were 0.545 and 0.157. The performance was superior in comparison to previous methods. The PON-Sol2 predictor is freely available. It can be used to predict the solubility effects of variants for any organism, even in large-scale projects.

14.
15.
Signal recognition particle (SRP) is an RNA and protein complex that exists in all domains of life. It consists of one protein and one noncoding RNA in some bacteria. It is more complex in eukaryotes and consists of six proteins and one noncoding RNA in mammals. In the eukaryotic cytoplasm, SRP co-translationally targets proteins to the endoplasmic reticulum and prevents misfolding and aggregation of the secretory proteins in the cytoplasm. It was demonstrated recently that SRP also possesses a previously unknown function, the protection of mRNAs of secretory proteins from degradation. In this review, we analyze the progress in studies of SRPs from different organisms, SRP biogenesis, its structure, and function in protein targeting and mRNA protection.

16.
Identifying new disease indications for existing drugs can help facilitate drug development and reduce development cost. Previous drug–disease association prediction methods focused on data about drugs and diseases from multiple sources. However, they did not deeply integrate the neighbor topological information of drug and disease nodes from various meta-path perspectives. We propose a prediction method called NAPred to encode and integrate meta-path-level neighbor topologies, multiple kinds of drug attributes, and drug-related and disease-related similarities and associations. The multiple kinds of similarities between drugs reflect the degrees of similarity between two drugs from different perspectives. Therefore, we constructed three drug–disease heterogeneous networks, one for each of these drug similarities. A learning framework based on fully connected neural networks and a convolutional neural network with an attention mechanism is proposed to learn information from the neighbor nodes of a pair of drug and disease nodes. Multiple neighbor sets composed of different kinds of nodes were formed based on meta-paths with different semantics and different scales. We established attention mechanisms at the neighbor-scale level and at the neighbor topology level to learn enhanced neighbor feature representations and enhanced neighbor topological representations. A convolutional-autoencoder-based module is proposed to encode the attributes of the drug–disease pair in the three heterogeneous networks. Extensive experimental results indicated that NAPred outperformed several state-of-the-art methods for drug–disease association prediction, and the improved recall rates demonstrated that NAPred was able to retrieve more actual drug–disease associations from the top-ranked candidates. Case studies on five drugs further demonstrated the ability of NAPred to identify potential drug-related disease candidates.

17.
A new approach to monitor disulfide-bond reduction in the vicinity of aromatic cluster(s) has been derived by using the near-UV range (λ=266–293 nm) of electronic circular dichroism (ECD) spectra. By combining the results from NMR and ECD spectroscopy, the 3D fold characteristics and associated reduction rate constants (k) of E19_SS, which is a highly thermostable, disulfide-bond-reinforced, 39-amino-acid-long exenatide mimetic, and its N-terminally truncated derivatives have been determined under different experimental conditions. Single disulfide bond reduction of the E19_SS model (with an 18-fold excess of tris(2-carboxyethyl)phosphine, pH 7, 37 °C) takes hours, which is 20–30 times longer than expected, and thus would not reach completion under commonly used reduction protocols. It is found that structural, steric, and electrostatic factors influence the reduction rate, resulting in orders of magnitude differences in reduction half-lives (900>t1/2>1 min) even for structurally similar, well-folded derivatives of a small model protein.
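The relationship between a reduction rate constant k and a half-life t1/2 can be sketched as follows. This assumes pseudo-first-order kinetics, which is a common simplification when the reductant is in large excess but is not stated in the abstract; the function names are hypothetical. Only the 900 min and 1 min bounds come from the text.

```python
import math


def half_life(k):
    """Half-life for first-order decay: t1/2 = ln(2) / k."""
    return math.log(2) / k


def rate_constant(t_half):
    """Inverse relation: k = ln(2) / t1/2."""
    return math.log(2) / t_half


def fraction_remaining(k, t):
    """Fraction of intact disulfide left after time t: exp(-k * t)."""
    return math.exp(-k * t)


# Rate constants (in 1/min) implied by the two half-life bounds quoted
# in the abstract, under the pseudo-first-order assumption.
k_slow = rate_constant(900.0)
k_fast = rate_constant(1.0)
ratio = k_fast / k_slow
```

Under this assumption the two bounds correspond to rate constants that differ by a factor of about 900, consistent with the "orders of magnitude" spread described above.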

18.
19.
Dephosphorylation of target proteins at serine/threonine residues is one of the most crucial mechanisms regulating their activity and, consequently, cellular functions. The role of phosphatases in synaptic plasticity, especially in long-term depression or depotentiation, has been reported. We studied serine/threonine phosphatase activity during the protein synthesis blocker (PSB)-induced impairment of long-term potentiation (LTP). The established protein phosphatase 2B (PP2B, calcineurin) inhibitor cyclosporin A prevented the decline of the LTP early phase (E-LTP) produced by pretreatment of hippocampal slices with cycloheximide or anisomycin. For the first time, we directly measured serine/threonine phosphatase activity during E-LTP and demonstrated its significant increase in PSB-treated slices. The nitric oxide (NO) donor SNAP also heightened phosphatase activity in the same manner as PSB, and simultaneous application of anisomycin + SNAP had no synergistic effect. Direct measurement of NO production in hippocampal slices with the NO-specific fluorescent probe DAF-FM revealed that PSBs strongly increase the NO concentration in all studied brain areas: CA1, CA3, and dentate gyrus (DG). Cyclosporin A fully abolished the PSB-induced NO production in the hippocampus, suggesting a close relationship between nNOS and PP2B activity. Surprisingly, cyclosporin A alone impaired short-term plasticity in CA1 by decreasing paired-pulse facilitation, which suggests bi-directionality of the influences of PP2B in the hippocampus. In conclusion, we propose a minimal model of signaling events that occur during LTP induction in normal conditions and in PSB-treated slices.

20.