Similar literature
1.
Inferring gene networks from longitudinal gene expression microarrays is a crucial step towards the study of gene regulatory mechanisms. A decade ago, expensive microarray technology restricted the number of samples that could undergo gene expression profiling in a single study, which made inference algorithms that assume stationary gene networks the best available solution. Thanks to the decreasing cost of modern microarray technologies, more gene expression profiles can now be assessed in a single study. With more samples available, we can relax the stationarity assumption and develop a method to infer dynamic gene networks, which reflect a more realistic biology in which genes adaptively orchestrate each other. This paper applies the framework of dynamic Bayesian networks to infer adaptive gene interactions by identifying individual transition networks between pairs of consecutive time points. Because of the high computational burden of inferring the interconnection patterns among all genes over time, we designed a parallelizable inference algorithm to make the task feasible. We validated our approach on two clinical studies: yellow fever vaccination and mechanical periodontal therapy. The inferred dynamic networks achieved more than 90% predictive accuracy, a significant improvement over stationary models (p < 0.05). The adaptive models can help explain the induction of innate immunity in greater detail after yellow fever vaccination and interpret the anti-inflammatory effect of mechanical periodontal therapy.

2.
High-throughput gene expression technologies such as microarrays have been utilized in a variety of scientific applications. In this article, we develop multivariate techniques for visualizing gene regulatory networks using independent component analysis (ICA). A desirable feature of the ICA method is that it approximates a biological model for gene expression. The methods are outlined and illustrated with an application to yeast gene expression data.

3.
A knowledge extraction method for tumor gene expression data   Total citations: 7, self-citations: 2, citations by others: 7
李颖新  刘全金  阮晓钢 《电子学报》2004,32(9):1479-1482
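Taking gene expression data from multiple myeloma as an example, this paper uses data mining techniques to propose a knowledge discovery method for gene expression data. The method computes the information gain of each gene and combines it with a neural network to identify a set of feature genes; a decision tree is then used to extract characteristic rules, yielding production rules based on the multiple myeloma samples. This offers biomedical research a reference method for analyzing and studying gene expression data. Experimental results demonstrate the effectiveness of the method.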
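The information-gain step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single threshold split on one gene's expression values and two-class labels; the function names and the midpoint search are hypothetical choices for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(expression, labels, threshold):
    """Entropy reduction from splitting the samples at `threshold` on one gene."""
    low = [y for x, y in zip(expression, labels) if x <= threshold]
    high = [y for x, y in zip(expression, labels) if x > threshold]
    weighted = sum(len(part) / len(labels) * entropy(part)
                   for part in (low, high) if part)
    return entropy(labels) - weighted

def best_information_gain(expression, labels):
    """Try midpoints between sorted distinct values (an assumed search) and keep the best gain."""
    values = sorted(set(expression))
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max((information_gain(expression, labels, c) for c in cuts), default=0.0)
```

Genes could then be ranked by `best_information_gain` and the top ones passed on to a classifier; how the paper combines the ranking with its neural network is not specified in the abstract.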

4.
5.
张媛  贾克斌  ZHANG Aidong 《电子学报》2014,42(12):2337-2344
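Analyzing the functional module structure of protein-protein interaction networks (PPINs) by combining multiple kinds of biological data is one of the pressing open problems in the computational analysis of protein function. This paper proposes a multi-view consensus functional module detection method based on collective non-negative matrix factorization (CoNMF). The method approximates the multi-view data simultaneously and seeks a unified optimal solution that best approximates the original multiple data sources. Functional module relationships are derived from this unified solution, and the method can also find overlapping functional modules. Experimental results show that, by fusing gene ontology, gene expression profile, and PPIN data, the proposed algorithm achieves some improvement in module detection accuracy, and the detected protein functional modules are biologically meaningful.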

6.
The massive scale and variability of microarray gene data create new and challenging problems of signal extraction, gene clustering, and data mining, especially for temporal gene profiles. Many data mining methods for finding interesting gene expression patterns are based on thresholding single discriminants, e.g., the ratio of between-class to within-class variation or correlation to a template. Here a different approach is introduced for extracting information from gene microarrays. The approach is based on multiple objective optimization and we call it Pareto front analysis (PFA). This method establishes a ranking of genes according to estimated probabilities that each gene is Pareto-optimal, i.e., that it lies on the Pareto front of the multiple objective scattergram. Both a model-driven Bayesian Pareto method and a data-driven non-parametric Pareto method, based on rank-order statistics, are presented. The methods are illustrated for two gene microarray experiments.
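To make the notion of a Pareto front concrete, here is a minimal sketch that finds the non-dominated genes in a multi-criterion scattergram. It assumes every criterion is to be maximized and that each gene is given as a tuple of scores; the names are hypothetical, and the sketch does not reproduce the probability estimates that PFA uses for ranking.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(scores):
    """Return the indices of genes whose score vectors no other gene dominates."""
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]

# Hypothetical two-criterion scores for five genes (larger is better on both axes).
example_scores = [(1, 5), (2, 4), (3, 3), (2, 2), (0, 0)]
front = pareto_front(example_scores)  # indices of the non-dominated genes
```

No single threshold on either criterion would select exactly the first three genes here, which is the motivation for a multi-objective view.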

7.
葛菲  马尽文 《信号处理》2005,21(3):312-315
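Large-scale gene expression profiles provide more reliable and detailed biological data for tumor diagnosis, but selecting the relevant genes is the key to analyzing these data. This paper studies the selection of tumor-related genes from the perspective of Kullback-Leibler discrimination information. Based on the distributional characteristics of the expression levels of tumor-related and unrelated genes, we propose a gene selection method based on an information criterion. We then apply this method to tumor diagnosis, training a support vector machine (SVM) on the expression profiles of the relevant genes to build a tumor diagnosis model. Experimental results show that the method is effective: the resulting diagnosis models achieve diagnostic (prediction) accuracies of 94.4% and 100% on the colon cancer and leukemia data sets, respectively.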
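As an illustration of scoring a gene by Kullback-Leibler information, the sketch below assumes that a gene's expression in each class is roughly Gaussian and uses the symmetrized KL divergence between the two fitted Gaussians. Both the Gaussian assumption and the symmetrization are illustrative choices; the abstract does not state the exact criterion the authors use.

```python
import math
import statistics

def gaussian_kl(mu_p, sd_p, mu_q, sd_q):
    """KL(p || q) for two univariate Gaussians with the given means and standard deviations."""
    return math.log(sd_q / sd_p) + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2) - 0.5

def kl_gene_score(tumor, normal):
    """Symmetrized KL divergence between the class-conditional Gaussians of one gene."""
    mu_t, sd_t = statistics.mean(tumor), statistics.stdev(tumor)
    mu_n, sd_n = statistics.mean(normal), statistics.stdev(normal)
    return gaussian_kl(mu_t, sd_t, mu_n, sd_n) + gaussian_kl(mu_n, sd_n, mu_t, sd_t)
```

A larger score means the two classes are easier to tell apart on that gene. The sketch requires at least two samples per class and a nonzero standard deviation in each.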

8.
王雪松  谷阳阳  程玉虎 《电子学报》2010,38(11):2518-2522
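Drawing on the ideas and methods of complex network analysis, the normalized Laplacian matrix and K-means clustering are used to partition a gene regulatory network into multiple communities, and the interactions among genes within each community and between communities are reported. In addition, to reflect the true interaction process among genes and to improve modeling accuracy, time-series spectral analysis is used to estimate gene expression time delays accurately before the community partition. Experimental results on the analysis of yeast cell-cycle gene regulatory relationships show that the proposed method reflects gene interactions more accurately and provides details of the gene regulatory model.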

9.
李颖新  阮晓钢 《电子学报》2005,33(4):651-655
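Using tumor gene expression profiles to build effective "predictive" classification models that accurately discriminate tumor subtypes and identify a set of feature genes determining sample class is an important topic in current bioinformatics research. Based on an analysis of the characteristics of tumor gene expression profiles, and taking acute leukemia expression profiles as an example, this paper studies tumor subtype recognition and the selection of feature genes for classification. For the class separability criterion, the existing "signal-to-noise ratio" index is revised and used to eliminate irrelevant genes, and a support vector machine is used as the classifier to recognize tumor subtypes. For feature gene selection, starting from biological analysis, irrelevant genes and strongly correlated redundant genes are removed first, and a sequential floating search algorithm is then used to select the feature genes for classification. Experimental results demonstrate the feasibility and effectiveness of the approach.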
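For orientation, the sketch below computes the commonly used form of the signal-to-noise ratio index, (mean_a - mean_b) / (sd_a + sd_b), for one gene across two classes. This is the unmodified index, shown as an assumption about the starting point; the revision proposed in the paper is not described in the abstract and is not reproduced here. Function names are hypothetical.

```python
import statistics

def signal_to_noise(class_a, class_b):
    """Difference of class means divided by the sum of class standard deviations."""
    mu_a, mu_b = statistics.mean(class_a), statistics.mean(class_b)
    sd_a, sd_b = statistics.stdev(class_a), statistics.stdev(class_b)
    return (mu_a - mu_b) / (sd_a + sd_b)

def rank_genes(expression_by_gene):
    """Sort gene names by |signal-to-noise|, most discriminative first.

    `expression_by_gene` maps a gene name to a (class_a_values, class_b_values) pair.
    """
    return sorted(expression_by_gene,
                  key=lambda g: abs(signal_to_noise(*expression_by_gene[g])),
                  reverse=True)
```

Genes with an absolute score near zero would be candidates for removal as irrelevant. The sketch does not guard against both standard deviations being zero.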

10.
Predicting gene functions is a challenge for biologists in the postgenomic era. Interactions among genes and their products compose networks that can be used to infer gene functions. Most previous studies adopt a linkage assumption, i.e., they assume that gene interactions indicate functional similarities between connected genes. In this study, we propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions. In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs. Our experimental study on a testbed of p53-related genes demonstrates the advantage of using indirect gene interactions and shows the empirical superiority of the proposed approach over linkage-assumption-based methods, such as the algorithm to minimize inconsistent connected genes and diffusion kernels.

11.
李辉  王金莲 《电子学报》2008,36(5):989-992
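Starting from the analysis of tumor gene expression profiles, this paper studies and selects a set of marker genes related to gastric cancer, extracts from this set a collection of gene classification rules that distinguish tumor from normal tissue, and then builds a tumor prediction model. First, with a support vector machine as the classifier and the sample recognition rate of a feature gene set as the fitness function, a genetic algorithm is used to screen feature genes. A decision tree is then used to extract the rule set of the feature genes, and the tumor prediction model is built in combination with the tumor molecular biology literature and biological experiments. Finally, a gastric cancer prediction model is built by analyzing gastric cancer gene expression profile data; the results indicate that the model offers some guidance and reference value for gastric cancer molecular biology experiments and clinical diagnosis.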

12.
13.
This paper proposes a dynamic-model-based method for selecting significantly expressed (SE) genes from their time-course expression profiles. A gene is considered to be SE if its time-course expression profile is more likely time-dependent than random. The proposed method describes a time-dependent gene expression profile by a nonzero-order autoregressive (AR) model, and a time-independent gene expression profile by a zero-order AR model. The Akaike information criterion (AIC) is used to compare the models and subsequently determine whether a time-course gene expression profile is time-independent or time-dependent. The performance of the proposed method is investigated on both a synthetic dataset and a real-life biological dataset in terms of the false discovery rate (FDR) and the false nondiscovery rate (FNR). The results show that the proposed method is valid for selecting SE genes from their time-course expression profiles.
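The AR-versus-AIC comparison described here can be sketched as follows. The sketch makes several simplifying assumptions that are not taken from the paper: only order 1 is tried as the nonzero-order model, both models are fitted by least squares to the same n-1 targets, errors are treated as Gaussian so that AIC = n ln(RSS/n) + 2k, and the residual sum of squares is floored to avoid taking the logarithm of zero.

```python
import math

def aic(rss, n, k):
    """AIC for a Gaussian model: n * ln(RSS / n) + 2k, with RSS floored at 1e-12."""
    return n * math.log(max(rss, 1e-12) / n) + 2 * k

def is_time_dependent(series):
    """Fit AR(0) and AR(1) to the same targets; True if AR(1) has the lower AIC."""
    x, y = series[:-1], series[1:]          # lagged values and their targets
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    rss0 = sum((v - mean_y) ** 2 for v in y)   # AR(0): constant mean only
    slope = sxy / sxx if sxx > 0 else 0.0
    rss1 = rss0 - slope * sxy                  # AR(1): intercept plus one lag
    # Parameter counts include the noise variance: 2 for AR(0), 3 for AR(1).
    return aic(rss1, n, 3) < aic(rss0, n, 2)
```

A steadily rising profile such as `[0, 1, 2, 3, 4, 5, 6, 7]` is flagged as time-dependent, while a constant profile is not. Higher AR orders would need a general least-squares solver.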

14.
Genomics entails the study of large sets of genes with the goal of understanding collective gene function, rather than just that of individual genes. Genomic signal processing (GSP) is the engineering discipline that studies the processing of genomic signals. Since regulatory decisions within the cell utilize numerous inputs, analytical tools are necessary to model the multivariate influences on decision-making produced by complex genetic networks. Genomic signals must be processed to characterize their regulatory effects and their relationship to changes at both the genotypic and phenotypic levels. The aim of GSP is to integrate the theory and methods of signal processing with the global understanding of genomics, placing special emphasis on genomic regulation. GSP encompasses various methodologies related to signal profiles: detection, prediction, classification, control, and statistical and dynamical modeling of gene networks. In this article, we give an overview of GSP and describe how pattern recognition and network analysis are central to diagnosis and therapy for genetic diseases.

15.
16.
When designing a gene regulatory network, except in rare circumstances there will be inconsistencies in the data. Modeling data inconsistencies fits naturally into the framework of probabilistic Boolean networks (PBNs). This model consists of a family of deterministic models, and the overall model is based on random switching between constituent networks, each of which determines a context. A previous paper has proposed an inference procedure for PBNs to achieve data consistency within constituent networks. This paper proposes optimization methods targeted at two data-consistent design issues having to do with network structure: (1) generalization (namely, model selection) arising from the one-to-many mapping between the data set and PBN model; (2) model reduction under a constraint on network connectivity, which is typically imposed for computational, statistical, or biological reasons. Regarding generalization, we combine connectivity and minimal logical realization to formulate the optimality criterion and propose two algorithms to solve it, the second algorithm guaranteeing a minimally connected PBN. Regarding constrained connectivity, we rephrase it as a lossy coding problem and develop an algorithm to find a best subset of predictors from the full set of predictors with the objective of minimizing the probability of prediction error.
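The random switching between constituent networks can be illustrated with a small simulation sketch. It assumes a context is redrawn at every time step and that each constituent network is a plain Python function from a state tuple to the next state tuple; the two example networks, the function names, and the seed are hypothetical and chosen only for illustration.

```python
import random

def pbn_step(state, networks, probabilities, rng):
    """Pick one constituent Boolean network at random and apply its update rule."""
    update = rng.choices(networks, weights=probabilities, k=1)[0]
    return update(state)

def simulate(state, networks, probabilities, steps, seed=0):
    """Return the trajectory of states, starting from `state`, over `steps` updates."""
    rng = random.Random(seed)
    trajectory = [state]
    for _ in range(steps):
        state = pbn_step(state, networks, probabilities, rng)
        trajectory.append(state)
    return trajectory

# Two hypothetical constituent networks over two genes (states are pairs of booleans).
def swap(state):
    return (state[1], state[0])

def hold_and(state):
    return (state[0] and state[1], state[1])
```

For example, `simulate((True, False), [swap, hold_and], [0.7, 0.3], steps=10)` produces one random trajectory; setting a selection probability to zero reduces the model to an ordinary deterministic Boolean network.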

17.
18.
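Gene expression profile data for Alzheimer's disease (AD) are high-dimensional, noisy, and highly redundant, which makes the search space for AD-specific genes enormous and the search time long, and degrades both mining performance and subsequent biological analysis. Denoising and dimensionality reduction of the expression profile data are therefore essential preprocessing steps. This paper first applies a wavelet packet transform-SAM method to reduce dimensionality and denoise the data; experimental results show that the wavelet packet method extracts the useful information in the gene expression profiles well. The fast independent component analysis (FastICA) algorithm is then applied to decompose the preprocessed data matrix, and specific genes are selected according to the independent components. Sample classification experiments on this basis show that the specific genes extracted by FastICA are highly significant and improve sample classification results. In addition, through enrichment analysis of the extracted specific genes, the paper reports how these genes cluster and are expressed in the Alzheimer's disease data set, providing a useful basis for the biological and pathological analysis of AD.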

19.
A significant amount of attention has recently been focused on modeling of gene regulatory networks. Two frequently used large-scale modeling frameworks are Bayesian networks (BNs) and Boolean networks, the latter being a special case of its recent stochastic extension, probabilistic Boolean networks (PBNs). PBNs are a promising model class that generalizes the standard rule-based interactions of Boolean networks into the stochastic setting. Dynamic Bayesian networks (DBNs) are a general and versatile model class that is able to represent complex temporal stochastic processes and has also been proposed as a model for gene regulatory systems. In this paper, we concentrate on these two model classes and demonstrate that PBNs and a certain subclass of DBNs can represent the same joint probability distribution over their common variables. The major benefit of introducing the relationships between the models is that it opens up the possibility of applying the standard tools of DBNs to PBNs and vice versa. Hence, the standard learning tools of DBNs can be applied in the context of PBNs, and the inference methods give a natural way of handling the missing values in PBNs, which are often present in gene expression measurements. Conversely, the tools for controlling the stationary behavior of the networks, tools for projecting networks onto sub-networks, and efficient learning schemes can be used for DBNs. In other words, the introduced relationships between the models extend the collection of analysis tools for both model classes.

20.
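cDNA biochip expression data are widely used in biomedical research, yet processing them by computer still poses many challenging problems. This paper proposes a new invariant-gene-based normalization method for supervised sets of multi-class cDNA biochip expression data. Besides achieving normalization, the method can also be used directly for feature selection on gene expression data, and experiments show good results.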
