Similar literature
 20 similar articles found (search time: 15 ms)
1.
Microarray technology plays a significant role in cancer classification, where a large number of genes and samples are simultaneously analysed. For the efficient analysis of the microarray data, there is a great demand for the development of intelligent techniques. In this article, the authors propose a novel hybrid technique employing Fisher criterion, ReliefF, and extreme learning machine (ELM) based on the principle of chaotic emperor penguin optimisation algorithm (CEPO). EPO is a recently developed metaheuristic method. In the proposed method, Fisher score and ReliefF are first used independently as filters for relevant gene selection. Then a novel population‐based metaheuristic, namely CEPO, is proposed to pre‐train the ELM by selecting the optimal input weights and hidden biases. The authors conducted experiments on seven well‐known data sets. To evaluate its effectiveness, the proposed method is compared with original EPO, genetic algorithm, and particle swarm optimisation‐based ELM along with other state‐of‐the‐art techniques. The experimental results show that the proposed framework achieves better accuracy than the state‐of‐the‐art schemes. The efficacy of the proposed method is demonstrated in terms of accuracy, sensitivity, specificity, and F‐measure.
Inspec keywords: genetic algorithms, pattern classification, biology computing, cancer, learning (artificial intelligence), search problems, particle swarm optimisation
Other keywords: optimal input weights, data sets, original EPO, genetic algorithm, particle swarm optimisation‐based ELM, microarray cancer classification, microarray technology, microarray data, intelligent techniques, Fisher criterion, ReliefF, chaotic emperor penguin optimisation algorithm, CEPO, recently developed metaheuristic method, Fisher score, relevant gene selection, population‐based, chaotic penguin optimised extreme learning machine, F
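
The Fisher-score filter mentioned in this abstract can be illustrated with a small sketch. It uses the common textbook definition (between-class scatter divided by within-class scatter, per gene); the abstract does not state which variant the authors use, so the formula, the function names `fisher_scores` and `top_genes`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def fisher_scores(samples, labels):
    """Return one Fisher score per gene (column) of `samples`.

    Assumed definition: sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c.
    """
    by_class = defaultdict(list)
    for row, label in zip(samples, labels):
        by_class[label].append(row)

    scores = []
    for j in range(len(samples[0])):
        column = [row[j] for row in samples]
        overall_mean = sum(column) / len(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for rows in by_class.values():
            values = [row[j] for row in rows]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        if within > 0:
            scores.append(between / within)
        elif between > 0:
            scores.append(float("inf"))  # perfectly separated, zero spread
        else:
            scores.append(0.0)           # constant gene carries no signal
    return scores


def top_genes(samples, labels, k):
    """Indices of the k genes with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data: 4 samples x 2 genes; gene 0 separates the classes, gene 1 does not.
toy = [[1.0, 5.0], [2.0, 1.0], [9.0, 4.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
print(top_genes(toy, toy_labels, 1))
```

A filter of this kind only ranks genes one at a time; the wrapper stage described in the abstract (CEPO-tuned ELM) is a separate step and is not sketched here.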

2.
3.
Gene Regulatory Networks (GRNs) are reconstructed from the microarray gene expression data through diversified computational approaches. This process results in symmetric and diagonal interaction of gene pairs that cannot be modelled as direct activation, inhibition, and self‐regulatory interactions. The values of gene co‐expressions could help in identifying co‐regulations among them. The proposed approach aims at computing the differences in variances of co‐expressed genes rather than computing differences in values of mean expressions across experimental conditions. It adopts multivariate co‐variances using principal component analysis (PCA) to predict an asymmetric and non‐diagonal gene interaction matrix, to select only those gene pair interactions that exhibit the maximum variances in gene regulatory expressions. The asymmetric gene regulatory interactions help in identifying the controlling regulatory agents, thus lowering the false positive rate by minimizing the connections between previously unlinked network components. The experimental results on real as well as in silico datasets, including time‐series RTX therapy, Arabidopsis thaliana, DREAM‐3, and DREAM‐8 datasets, in comparison with existing state‐of‐the‐art approaches demonstrated the enhanced performance of the proposed approach for predicting positive and negative feedback loops and self‐regulatory interactions. The generated GRNs hold the potential in determining the real nature of gene pair regulatory interactions.
Inspec keywords: molecular biophysics, principal component analysis, genetics, biology computing, reverse engineering
Other keywords: controlling regulatory agents, interacting genes, unlinked network components, self‐regulatory interactions, gene pair regulatory interactions, self‐regulatory network motifs, reverse engineering gene regulatory networks, microarray gene expression data, diversified computational approaches, symmetric interaction, diagonal interaction, gene pairs, gene co‐expressions, co‐expressed genes, mean expressions, gene regulatory expressions, asymmetric gene regulatory interactions

4.
Gene‐expression data are widely used in clinical research. They represent the expression levels of thousands of genes across various experimental conditions simultaneously. Mining conditions specific hub genes from gene‐expression data is a challenging task. Conditions specific hub genes signify the functional behaviour of a bicluster across the subset of conditions and can act as prognostic or diagnostic markers of diseases. In this study, the authors introduce a new approach for identifying conditions specific hub genes from RNA‐Seq data using a biclustering algorithm. In the proposed approach, the efficient ‘runibic’ biclustering algorithm, the concept of a gene co‐expression network and the concept of a protein–protein interaction network are used to obtain better performance. The results show that the proposed approach extracts biologically significant conditions specific hub genes which play an important role in various biological processes and pathways. These conditions specific hub genes can be used as prognostic or diagnostic biomarkers. Conditions specific hub genes will help to reduce the analysis time and increase the accuracy of further research. The authors also summarise the application of the proposed approach to the drug discovery process.
Inspec keywords: proteins, data mining, cellular biophysics, drugs, genetics, diseases, RNA, medical computing, biology computing, molecular biophysics
Other keywords: experimental conditions, mining conditions specific hub genes, identifying conditions specific hub genes, RNA‐Seq data, gene co‐expression network, significant conditions specific hub genes, RNA‐Seq gene‐expression data

5.
Identifying drug–target interactions has been a key step for drug repositioning, drug discovery and drug design. Since it is expensive to determine the interactions experimentally, computational methods are needed for predicting interactions. In this work, the authors first propose a single‐view penalised graph (SPGraph) clustering approach to integrate drug structure and protein sequence data in a structural view. The SPGraph model clusters drugs and targets simultaneously such that the known drug–target interactions are best preserved in the clustering results. They then apply the SPGraph to a chemical view with drug response data and gene expression data in NCI‐60 cell lines. They further generalise the SPGraph to a multi‐view penalised graph (MPGraph) version, which can integrate the structural view and chemical view of the data. In the authors' experiments, they compare their approach with several competing methods, and the results show that the SPGraph improves the prediction accuracy on a small scale, and the MPGraph achieves an improvement of around 10% in prediction accuracy. Finally, they give some new targets for 22 Food and Drug Administration approved drugs for drug repositioning, some of which are supported by other references.
Inspec keywords: graphs, drug delivery systems, drugs, proteins, molecular biophysics, molecular configurations, optimisation, eigenvalues and eigenfunctions, Laplace equations, cancer, cellular biophysics, gene therapy, medical computing
Other keywords: MPGraph, multiview penalised graph clustering, drug‐target interactions, drug repositioning, drug discovery, drug design, computational methods, single‐view penalized graph clustering approach, drug structure, protein sequence data, SPGraph model, optimisation problem, spectral clustering, eigenvalue decomposition, Laplacian model, gene expression data, NCI‐60 cell lines

6.
Non‐small cell lung cancer (NSCLC) is the most common and dangerous type of lung cancer. Adjuvant chemotherapy (ACT) is the main treatment after surgical resection to prevent cancer recurrence. However, ACT can be toxic and unhelpful in some cases. Therefore, it is highly desirable in clinical applications to predict the treatment outcomes of chemotherapy. Conventional methods of predicting cancer treatment rely solely on histopathology and the results are not reliable in some cases. This study aims at building a predictive model to identify who needs ACT treatment and who should avoid it. To this end, the authors propose an innovative method to identify NSCLC‐related prognostic genes from microarray gene‐expression datasets. They also propose a new model using a gene‐expression programming algorithm for ACT classification. The proposed model was evaluated on integrated microarray datasets from four institutes and compared with four representative methods: general regression neural network, decision tree, support vector machine and naive Bayes. Evaluation results demonstrated the effectiveness of the proposed model, with an accuracy of 89.8%, which is higher than that of the other representative models. They obtained four probes (four genes) that give good prediction results. These genes are 204891_s_at (LCK), 208893_s_at (DUSP6), 202454_s_at (ERBB3) and 201076_at (MMD).
Inspec keywords: neural nets, regression analysis, decision trees, surgery, medical computing, cancer, cellular biophysics, lung, genetics, support vector machines, Bayes methods, biochemistry
Other keywords: cancer ACT prediction model, nonsmall cell lung cancer, adjuvant chemotherapy, surgery resection, cancer recurrence, conventional methods, cancer treatment, microarray gene‐expression technology, NSCLC treatment, ACT treatment, NSCLC‐related prognostic genes, microarray gene‐expression datasets, gene‐expression programming algorithm, ACT classification, ACT information, integrated microarray datasets, representative models, survival time, general regression neural network, decision tree, support vector machine, naive Bayes

7.
Inferring gene regulatory networks (GRNs) from microarray expression data is an important but challenging issue in systems biology. In this study, the authors propose a Bayesian information criterion (BIC)‐guided sparse regression approach for GRN reconstruction. This approach can adaptively model GRNs by optimising the l1‐norm regularisation of sparse regression based on a modified version of BIC. The use of the regularisation strategy ensures that the inferred GRNs are as sparse as natural networks, while the modified BIC allows prior knowledge on expression regulation to be incorporated and thus avoids the usual overestimation of expression regulators. In particular, the proposed method provides a clear interpretation of combinatorial regulations of gene expression by optimally extracting regulation coordination for a given target gene. Experimental results on both simulation data and real‐world microarray data demonstrate its competent performance in discovering regulatory relationships in GRN reconstruction.
Inspec keywords: genetics, Bayes methods, genomics, regression analysis, inference mechanisms, bioinformatics
Other keywords: adaptive modelling, gene regulatory network, Bayesian information criterion‐guided sparse regression approach, GRN, microarray expression data, systems biology, GRN reconstruction, optimisation, l1‐norm regularisation
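
To make the idea of BIC-guided selection of the l1 penalty concrete, the sketch below scores a few candidate sparse fits with the standard BIC for Gaussian linear regression, n·ln(RSS/n) + k·ln(n), and keeps the one with the lowest value. This is the textbook criterion, not the modified BIC proposed in the paper (the abstract does not give its form), and the candidate tuples are hypothetical outputs of a lasso path, invented for illustration.

```python
import math


def bic(rss, n_samples, n_nonzero):
    """Standard BIC for a Gaussian linear model.

    rss        residual sum of squares of the fitted model
    n_samples  number of observations (n)
    n_nonzero  number of non-zero regression coefficients (k)
    """
    return n_samples * math.log(rss / n_samples) + n_nonzero * math.log(n_samples)


def select_by_bic(candidates, n_samples):
    """Pick the (penalty, rss, n_nonzero) tuple with the smallest BIC."""
    return min(candidates, key=lambda c: bic(c[1], n_samples, c[2]))


# Hypothetical lasso-path results for one target gene, 100 samples:
# (l1 penalty, residual sum of squares, number of selected regulators)
path = [
    (0.01, 10.0, 20),  # weak penalty: good fit, many regulators
    (0.10, 12.0, 5),   # moderate penalty: slightly worse fit, few regulators
    (1.00, 50.0, 1),   # strong penalty: poor fit, one regulator
]
best = select_by_bic(path, n_samples=100)
print(best[0])
```

The k·ln(n) term is what penalises dense models; a prior-knowledge term such as the one the abstract alludes to would modify this penalty, but its form is not specified there.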

8.
Here, a two‐phase search strategy is proposed to identify the biomarkers in a gene expression data set for prostate cancer diagnosis. A statistical filtering method is initially employed to remove the noisiest data. In the first phase of the search strategy, a multi‐objective optimisation based on the binary particle swarm optimisation algorithm tuned by a chaotic method is proposed to select the optimal subset of genes with the minimum number of genes and the maximum classification accuracy. In the second phase of the search strategy, the cache‐based modification of the sequential forward floating selection algorithm is used to find the most discriminant genes from the optimal subset of genes selected in the first phase. The results of applying the proposed algorithm to the available challenging prostate cancer data set demonstrate that the proposed algorithm can identify the informative genes such that a classification accuracy, sensitivity, and specificity of 100% are achieved with only nine biomarkers.
Inspec keywords: cancer, biological organs, optimisation, feature extraction, search problems, particle swarm optimisation, pattern classification, genetics
Other keywords: biomarkers, gene expression feature selection, prostate cancer diagnosis, heuristic–deterministic search strategy, two‐phase search strategy, gene expression data, statistical filtering method, noisiest data, multiobjective optimisation, particle swarm optimisation algorithm, chaotic method, selection algorithm, discriminant genes, available challenging prostate cancer data, informative genes

9.
Direct relationships between biological molecules connected in a gene co‐expression network tend to reflect real biological activities such as gene regulation, protein–protein interactions (PPIs), and metabolisation. As correlation‐based networks contain numerous indirect connections, those direct relationships are always ‘hidden’ in them. Compared with the global network, network communities carry more biological significance for predicting protein function, detecting protein complexes and studying network evolution. Therefore, identifying direct relationships in communities is a pervasive and important topic in the biological sciences. Unfortunately, this field has not been well studied. A major thrust of this study is to apply a deconvolution algorithm to communities stemming from different gene co‐expression networks, which are constructed by fixing different thresholds for robustness analysis. Using the fifth Dialogue on Reverse Engineering Assessment and Methods challenge (DREAM5) framework, the authors demonstrate that nearly all new communities extracted from a ‘deconvolution filter’ contain more genuine PPIs than before deconvolution.
Inspec keywords: proteins, deconvolution, genetics, bioinformatics, biology computing, molecular biophysics
Other keywords: identifying genuine protein–protein interactions, gene co‐expression network, deconvolution method, direct relationships, biological molecules, biological activities, gene regulation, correlation‐based networks, numerous indirect connections, global network, network communities, biological significance, protein function, protein complexes, studying network evolution, biological sciences, different gene co‐expression networks

10.
Lung cancer is one of the deadliest diseases in the world. Non‐small cell lung cancer (NSCLC) is the most common and dangerous type of lung cancer. Although NSCLC is preventable and, in some cases, curable if diagnosed at an early stage, the vast majority of patients are diagnosed very late. Furthermore, NSCLC usually recurs some time after treatment. Therefore, it is of paramount importance to predict NSCLC recurrence, so that specific and suitable treatments can be sought. Nonetheless, conventional methods of predicting cancer recurrence rely solely on histopathology data and predictions are not reliable in many cases. The microarray gene expression (GE) technology provides a promising and reliable way to predict NSCLC recurrence by analysing the GE of sample cells. This study proposes a new model based on GE programming that uses microarray datasets for NSCLC recurrence prediction. To this end, the authors also propose a hybrid method to rank and select relevant prognostic genes that are related to NSCLC recurrence prediction. The proposed model was evaluated on real NSCLC microarray datasets and compared with other representational models. The results demonstrated the effectiveness of the proposed model.
Inspec keywords: lung, cancer, lab‐on‐a‐chip, genetics, patient diagnosis
Other keywords: NSCLC recurrence prediction, microarray data, GE programming, nonsmall cell lung cancer, cancer recurrence, histopathology data, microarray gene expression, prognostic genes

11.
Based on alternative splicing events (ASEs) databases, the authors herein aim to explore potential prognostic biomarkers for cervical squamous cell carcinoma (CESC). mRNA expression profiles and relevant clinical data of 223 patients with CESC were obtained from The Cancer Genome Atlas (TCGA). Correlated genes, ASEs and percent‐splice‐in (PSI) values were downloaded from SpliceSeq. The PSI values of survival‐associated alternative splicing events (SASEs) were used to construct the basis of a prognostic index (PI). A protein–protein interaction (PPI) network of genes related to SASEs was generated by STRING and analysed with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). Consequently, 41,776 ASEs were discovered in 19,724 genes, 2596 of which were linked with 3669 SASEs. The PPI network of SASEs related genes revealed that TP53 and UBA52 were core genes. The low‐risk group had a longer survival period than its high‐risk counterpart, both groups being defined according to the PI constructed upon the top 20 splicing events or the PI on the overall splicing events. The AUC value of the ROC reached up to 0.88, demonstrating the prognostic potential of PI in CESC. These findings suggest that ASEs are involved in the pathogenesis of CESC and may serve as promising prognostic biomarkers for this female malignancy.
Inspec keywords: gynaecology, molecular biophysics, genomics, proteins, cellular biophysics, genetics, medical computing, cancer, ontologies (artificial intelligence), RNA
Other keywords: protein‐protein interaction network, CESC pathogenesis, gene ontology, Kyoto‐encyclopedia‐of‐genes‐and‐genomes, SASEs related genes, PPI network, survival‐associated alternative splicing events, PSI values, percent‐splice‐in, Cancer Genome Atlas, mRNA expression profiles, prognostic biomarkers, alternative splicing events databases, cervical squamous cell carcinoma, prognostic alternative splicing signature
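
For readers unfamiliar with the AUC figure quoted in this abstract, the sketch below computes the area under the ROC curve for a risk score using the rank-based (Mann–Whitney) formulation: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy scores are assumptions for illustration; the abstract does not describe how the authors computed their value.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Equivalent to the probability that a randomly chosen positive sample
    has a higher score than a randomly chosen negative one.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy prognostic-index values for four patients (invented for illustration).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
print(roc_auc(toy_scores, toy_labels))
```

The pairwise loop is quadratic in the number of samples, which is fine for a cohort of a few hundred patients; a sort-based version would be preferable for much larger data.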

12.
With the rapid accumulation of functional relationships between biological molecules, knowledge‐based networks have been constructed and stored in many databases. These networks provide curated and comprehensive information on functional linkages among genes and proteins, whereas their activities are highly related to specific phenotypes and conditions. To evaluate a knowledge‐based network in a specific condition, the consistency between its structure and conditionally specific gene expression profiling data is an important criterion. In this study, the authors propose a Gaussian graphical model to evaluate the documented regulatory networks by the consistency between network architectures and time course gene expression profiles. They derive a dynamic Bayesian network model to evaluate gene regulatory networks in both simulated and true time course microarray data. The regulatory networks are evaluated by matching network structure with gene expression to achieve consistency measurement. To demonstrate the effectiveness of the authors' method, they identify significant regulatory networks in response to the time course of circadian rhythm. The knowledge‐based networks are screened and ranked by their structural consistencies with dynamic gene expression profiling.
Inspec keywords: Bayes methods, biology computing, circadian rhythms, Gaussian processes, genetics, genomics, graphs, molecular biophysics, proteins
Other keywords: Gaussian graphical model, responsive regulatory networks, time course high‐throughput data, biological molecules, dynamic gene expression profiling, circadian rhythm, consistency measurement, matching network structure, simulated time course microarray data, true time course microarray data, dynamic Bayesian network model, time course gene expression profiles, network architectures, documented regulatory networks, specific gene expression profiling data, phenotypes, proteins, functional linkages, databases, knowledge‐based networks

13.
14.
15.
Lung cancer is one of the leading causes of death in both the USA and Taiwan, and it is thought that cancer may be caused by the gain of function of an oncoprotein or the loss of function of a tumour suppressor protein. Consequently, these proteins are potential targets for drugs. In this study, differentially expressed genes are identified via an expression dataset generated from lung adenocarcinoma tumour and adjacent non‐tumour tissues. This study integrates several complementary resources, namely microarray, protein‐protein interaction and protein complex data. After constructing the lung cancer protein‐protein interaction network (PPIN), the authors performed graph theory analysis of the PPIN. Highly dense modules are identified, which are potential cancer‐associated protein complexes. Up‐ and down‐regulated communities were used as queries to perform functional enrichment analysis. Enriched biological processes and pathways are determined. These sets of up‐ and down‐regulated genes were submitted to the Connectivity Map web resource to identify potential drugs. The authors' findings suggest that eight drugs from DrugBank and three drugs from NCBI can potentially reverse the expression of certain up‐ and down‐regulated genes. In conclusion, this study provides a systematic strategy to discover potential drugs and target genes for lung cancer.
Inspec keywords: cellular biophysics, lung, cancer, drugs, genetics, tumours, lab‐on‐a‐chip, proteins, molecular biophysics, graph theory, query processing, medical computing
Other keywords: down‐regulated gene expression, up‐regulated gene expression, potential target genes, DrugBank, potential drugs, connectivity map Web resource, biological processes, functional enrichment analysis, up‐regulated communities, down‐regulated communities, cancer‐associated protein complexes, k‐communities, highly‐dense modules, PPIN, graph theory analysis, lung cancer protein‐protein interaction network, MIPS, BioGrid, ArrayExpress, microarray, nontumour tissues, human lung adenocarcinoma tumour, bioconductor package, tumour suppressor protein, oncoprotein, nonsmall cell lung cancer, in silico identification

16.
Prediction of cardiovascular disease (CVD) is a critical challenge in the area of clinical data analysis. In this study, an efficient heart disease prediction method is developed based on optimal feature selection. Initially, data pre‐processing is performed using data cleaning, data transformation, missing values imputation, and data normalisation. Then the decision function‐based chaotic salp swarm (DFCSS) algorithm is used to select the optimal features in the feature selection process. The chosen attributes are then given to the improved Elman neural network (IENN) for data classification. Here, the sailfish optimisation (SFO) algorithm is used to compute the optimal weight value of the IENN. The combination of the DFCSS–IENN‐based SFO (IESFO) algorithm effectively predicts heart disease. The proposed (DFCSS–IESFO) approach is implemented in the Python environment using two datasets: the University of California Irvine (UCI) Cleveland heart disease dataset and a CVD dataset. The simulation results showed that the proposed scheme achieved a high classification accuracy of 98.7% for the CVD dataset and 98% for the UCI dataset compared to other classifiers, such as support vector machine, K‐nearest neighbour, Elman neural network, Gaussian Naive Bayes, logistic regression, random forest, and decision tree.
Inspec keywords: cardiovascular system, medical diagnostic computing, feature extraction, regression analysis, data mining, learning (artificial intelligence), Bayes methods, neural nets, support vector machines, diseases, pattern classification, data handling, decision trees, cardiology, data analysis, feature selection
Other keywords: efficient heart disease prediction‐based, optimal feature selection, improved Elman‐SFO, cardiovascular disease, clinical data analysis, data pre‐processing process, data cleaning, data transformation, values imputation, data normalisation, decision function‐based chaotic salp swarm algorithm, optimal features, feature selection process, improved Elman neural network, data classification, sailfish optimisation algorithm, optimal weight value, DFCSS–IENN‐based SFO algorithm, DFCSS–IESFO, California Irvine Cleveland heart disease dataset, CVD dataset, high‐classification accuracy
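
Two of the pre‐processing steps named in this abstract, missing values imputation and data normalisation, can be sketched as follows. The abstract does not say which imputation rule or scaling the authors used, so mean imputation and min–max scaling to [0, 1] are assumptions chosen because they are common defaults; the function names and sample values are likewise invented for illustration.

```python
def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    known = [v for v in column if v is not None]
    if not known:
        raise ValueError("column has no observed values")
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in column]


def min_max(column):
    """Rescale a numeric column linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Toy clinical feature with one missing reading (values are made up).
raw = [120.0, None, 160.0, 140.0]
clean = min_max(impute_mean(raw))
print(clean)
```

In a real pipeline the mean, minimum and maximum would be estimated on the training split only and then reused for the test split, to avoid leaking information.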

17.
Lung cancer is a leading cause of cancer‐related death worldwide. The early diagnosis of cancer has been shown to be greatly helpful for curing the disease effectively. Microarray technology provides a promising approach for exploiting gene profiles for cancer diagnosis. In this study, the authors propose a gene expression programming (GEP)‐based model to predict lung cancer from microarray data. The authors use two gene selection methods to extract the significant lung cancer related genes, and accordingly propose different GEP‐based prediction models. Prediction performance evaluations and comparisons between the authors’ GEP models and three representative machine learning methods (support vector machine, multi‐layer perceptron and radial basis function neural network) were conducted thoroughly on real microarray lung cancer datasets. Reliability was assessed by cross‐data set validation. The experimental results show that the GEP model using fewer feature genes outperformed the other models in terms of accuracy, sensitivity, specificity and area under the receiver operating characteristic curve. It is concluded that the GEP model is a better solution to lung cancer prediction problems.
Inspec keywords: lung, cancer, medical diagnostic computing, patient diagnosis, genetic algorithms, feature selection, learning (artificial intelligence), support vector machines, multilayer perceptrons, radial basis function networks, reliability, sensitivity analysis
Other keywords: lung cancer prediction, cancer‐related death, cancer diagnosis, gene profiles, gene expression programming‐based model, gene selection, GEP‐based prediction models, prediction performance evaluations, representative machine learning methods, support vector machine, multilayer perceptron, radial basis function neural network, real microarray lung cancer datasets, cross‐data set validation, reliability, receiver operating characteristic curve
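
The evaluation measures listed in this abstract (accuracy, sensitivity, specificity) follow directly from the confusion matrix of a binary classifier. The sketch below shows the standard definitions; the function name `binary_metrics` and the toy predictions are assumptions for illustration and are unrelated to the paper's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels (1 = cancer)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # recall on the positive class
        "specificity": tn / (tn + fp),  # recall on the negative class
    }


# Toy example: five samples, two of them misclassified.
metrics = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters for microarray data sets, which are often small and class-imbalanced.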

18.
Lung adenocarcinoma is one of the major causes of mortality. Current methods of diagnosis can be improved through identification of disease specific biomarkers. MicroRNAs are small non‐coding regulators of gene expression, which can be potential biomarkers in various diseases. Thus, the main objective of this study was to gain mechanistic insights into genetic abnormalities occurring in lung adenocarcinoma by implementing an integrative analysis of miRNA and mRNA expression profiles in both smokers and non‐smokers. Differential expression was analysed by comparing publicly available lung adenocarcinoma samples with controls. Furthermore, weighted gene co‐expression network analysis was performed, which revealed mRNAs and miRNAs significantly correlated with lung adenocarcinoma. Moreover, an integrative analysis resulted in the identification of several miRNA–mRNA pairs which were significantly dysregulated in non‐smokers with lung adenocarcinoma. Two pairs (miR‐133b/Protein Kinase C Zeta (PRKCZ) and miR‐557/STEAP3) were also found to be specifically dysregulated in smokers. Pathway analysis further revealed their role in important signalling pathways including the cell cycle. This analysis has not only increased the authors’ understanding of lung adenocarcinoma but also proposed potential biomarkers. However, further wet laboratory studies are required to validate these potential biomarkers, which could be used to diagnose lung adenocarcinoma.
Inspec keywords: cancer, molecular biophysics, patient diagnosis, tumours, RNA, proteins, lung, genetics, medical diagnostic computing, molecular configurations
Other keywords: miRNAs expression profiles, mRNAs expression profiles, smokers, nonsmokers, integrative analysis, lung adenocarcinoma, microRNAs, disease specific biomarkers, noncoding regulators, genetic abnormalities, weighted gene coexpression network analysis

19.
Knowledge of the biological molecular mechanisms underlying cancer is important for the precise diagnosis and treatment of cancer patients. Detecting dysregulated pathways in cancer can provide insights into the mechanism of cancer and help to detect novel drug targets. Based on the widespread mutual exclusivity among mutated genes and the interrelationship between gene mutations and expression changes, this study presents a network‐based method to detect the dysregulated pathways from gene mutation and expression data of glioblastoma. First, the authors construct a gene network based on mutual exclusivity between each pair of genes and the interaction between gene mutations and expression changes. Then they detect all complete subgraphs in the constructed gene network using the CFinder clustering algorithm. Next, any two gene sets whose overlapping score is above a specific threshold are merged. Finally, they obtain two dysregulated pathways containing multiple glioblastoma‐related genes which are closely related to the two subtypes of glioblastoma. The results show that one dysregulated pathway revolving around epidermal growth factor receptor is likely to be associated with the primary subtype of glioblastoma, and the other dysregulated pathway revolving around TP53 is likely to be associated with the secondary subtype of glioblastoma.
Inspec keywords: cancer, tumours, drugs, brain, neurophysiology, genetic algorithms, genetics, skin, proteins, molecular biophysics, genomics, patient diagnosis, molecular configurations
Other keywords: network‐based method, dysregulated pathways detection, glioblastoma cancer, biological molecular mechanisms, precise diagnosis, cancer patient treatment, drug targets, mutual exclusivity, mutated genes, gene mutations, expression changes, expression data, CFinder clustering algorithm, constructed gene network, gene sets, overlapping scores, glioblastoma‐related multiple genes, epidermal growth factor receptor, TP53, secondary subtype
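
The merging step described in this abstract (combining gene sets whose overlapping score exceeds a threshold) can be sketched as below. The abstract does not define the overlapping score, so the overlap coefficient |A ∩ B| / min(|A|, |B|) is an assumption, as are the function names, the threshold value and the placeholder gene labels `g1` … `g6`.

```python
def overlap_score(a, b):
    """Overlap coefficient of two non-empty sets (an assumed definition)."""
    return len(a & b) / min(len(a), len(b))


def merge_overlapping(gene_sets, threshold):
    """Repeatedly merge any two sets whose overlap score exceeds `threshold`.

    Merging restarts after every union, because a merged set may newly
    overlap with sets it did not overlap with before.
    """
    sets = [set(s) for s in gene_sets if s]
    merged = True
    while merged:
        merged = False
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                if overlap_score(sets[i], sets[j]) > threshold:
                    sets[i] |= sets[j]
                    del sets[j]
                    merged = True
                    break
            if merged:
                break
    return sets


# Placeholder cliques: the first two share two of three genes and are merged.
cliques = [{"g1", "g2", "g3"}, {"g1", "g2", "g4"}, {"g5", "g6"}]
for module in merge_overlapping(cliques, threshold=0.5):
    print(sorted(module))
```

Each pass either merges two sets or terminates, so the loop runs at most once per input set.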

20.
Network motifs are recurrent and over‐represented patterns having biological relevance. They are one of the important local properties of biological networks. Network motif discovery finds important applications in many areas such as functional analysis of biological components, the validity of network composition, classification of networks, disease discovery, identification of unique subunits etc. The discovery of network motifs is a computationally challenging task due to the large size of real networks, and the exponential increase of the search space with respect to network size and motif size. This problem also includes the subgraph isomorphism check, which is Nondeterministic Polynomial (NP)‐complete. Several tools and algorithms have been designed in the last few years to address this problem with encouraging results. These tools and algorithms can be classified into various categories based on exact census, mapping, pattern growth, and so on. In this study, critical aspects of network motif discovery, design principles of background algorithms, and their functionality are reviewed along with their strengths and limitations. The performances of state‐of‐the‐art algorithms are discussed in terms of runtime efficiency, scalability, and space requirement. The future scope, research direction, and challenges of the existing algorithms are presented at the end of the study.
Inspec keywords: computational complexity, graph theory, biology, search problems
Other keywords: network size, motif size, network motif discovery, biological networks, network composition, recurrent patterns, over‐represented patterns, local properties, search space, subgraph isomorphism check, NP‐complete problem, exact census, design principles, background algorithms, runtime efficiency, space requirement
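
As a small illustration of what an exact census involves, the sketch below counts occurrences of one well‐known three‐node motif, the feed‐forward loop (a→b, b→c, a→c), by brute‐force enumeration of ordered node triples. It is not any of the tools reviewed in the paper; the function name and the toy edge lists are assumptions, and the cubic running time is exactly the scaling problem the abstract describes.

```python
from itertools import permutations


def count_feed_forward_loops(edges):
    """Count (a, b, c) role assignments with edges a->b, b->c and a->c.

    Occurrences are non-induced: extra edges among the three nodes are
    allowed, so a densely connected triple can match in several ways.
    """
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    count = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set:
            count += 1
    return count


# Toy regulatory graph: x regulates y and z, y regulates z, z regulates w.
toy_edges = [("x", "y"), ("y", "z"), ("x", "z"), ("z", "w")]
print(count_feed_forward_loops(toy_edges))
```

Deciding whether such a count is over‐represented additionally requires comparing it with counts in randomised networks, which is not shown here.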
