Similar literature
 20 similar documents found (search time: 31 ms)
1.
Microarray-based global gene expression profiling, combined with sophisticated statistical algorithms, is providing new insights into the pathogenesis of autoimmune diseases. We have applied a novel statistical technique for gene selection based on machine learning approaches to analyze microarray expression data gathered from patients with systemic lupus erythematosus (SLE) and primary antiphospholipid syndrome (PAPS), two autoimmune diseases of unknown genetic origin that share many common features. The methodology included a combination of three data discretization policies, a consensus gene selection method, and a multivariate correlation measurement. A set of 150 genes was found to discriminate SLE and PAPS patients from healthy individuals. Statistical validations demonstrate the relevance of this gene set from both a univariate and a multivariate perspective. Moreover, functional characterization of these genes identified an interferon-regulated gene signature, consistent with previous reports. It also revealed the existence of other regulatory pathways, including those regulated by PTEN, TNF, and BCL-2, which are altered in SLE and PAPS. Remarkably, a significant number of these genes carry E2F binding motifs in their promoters, suggesting a role for E2F in the regulation of autoimmunity.

2.
Inferring gene networks from longitudinal gene expression microarrays is a crucial step towards the study of gene regulatory mechanisms. A decade ago, expensive microarray technology restricted the number of samples that could undergo gene expression profiling in a single study, which made inference algorithms that assume stationary gene networks the best available solution. Thanks to the decreasing cost of modern microarray technologies, more gene expression profiles can now be assessed in a single study. With more samples available, we can relax the stationarity assumption and develop a method to infer dynamic gene networks, which can reflect more realistic biology in which genes adaptively orchestrate each other. This paper applies the framework of dynamic Bayesian networks to infer adaptive gene interactions by identifying individual transition networks between pairs of consecutive time points. Because inferring the interconnection patterns among all genes over time is computationally expensive, we designed a parallelizable inference algorithm to make the task feasible. We validated our approach on two clinical studies: yellow fever vaccination and mechanical periodontal therapy. The inferred dynamic networks achieved more than 90% predictive accuracy, a significant improvement over stationary models (p < 0.05). The adaptive models can help explain the induction of innate immunity in greater detail after yellow fever vaccination and interpret the anti-inflammatory effect of mechanical periodontal therapy.

3.
The massive scale and variability of microarray gene data create new and challenging problems of signal extraction, gene clustering, and data mining, especially for temporal gene profiles. Many data mining methods for finding interesting gene expression patterns are based on thresholding single discriminants, e.g., the ratio of between-class to within-class variation or correlation to a template. Here a different approach is introduced for extracting information from gene microarrays. The approach is based on multiple objective optimization and we call it Pareto front analysis (PFA). This method establishes a ranking of genes according to estimated probabilities that each gene is Pareto-optimal, i.e., that it lies on the Pareto front of the multiple objective scattergram. Both a model-driven Bayesian Pareto method and a data-driven non-parametric Pareto method, based on rank-order statistics, are presented. The methods are illustrated for two gene microarray experiments.
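The notion of a gene lying on the Pareto front can be illustrated with a minimal Python sketch. It only finds the non-dominated points for fixed scores and does not estimate the Pareto-optimality probabilities that the abstract describes. The function name `pareto_front`, the two-criterion scores, and the convention that larger is better on every criterion are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def pareto_front(scores: Sequence[Tuple[float, ...]]) -> List[int]:
    """Return indices of non-dominated points (larger is better on every criterion)."""
    front = []
    for i, p in enumerate(scores):
        # q dominates p if q is at least as good everywhere and strictly better somewhere.
        dominated = any(
            all(qk >= pk for qk, pk in zip(q, p)) and any(qk > pk for qk, pk in zip(q, p))
            for j, q in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical per-gene scores on two criteria (not taken from the paper).
genes = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.9, 0.1)]
front = pareto_front(genes)
print(front)  # indices of genes that no other gene beats on both criteria
```

With these scores, gene 3 is dominated by gene 1 and gene 4 by gene 0, so neither appears on the front. No single threshold on one criterion would select the same set.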

4.
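Gene expression profile data are noisy, high-dimensional, highly redundant, and unevenly distributed, so their analysis still poses many challenging problems. To address this, an unsupervised learning method, non-negative matrix factorization (NMF), is applied to gene expression profile data to mine informative genes related to AD. The standard NMF algorithm, however, is inefficient and not very effective on gene expression data. To meet the needs of this field, the Alpha-NMF algorithm is adopted; it effectively overcomes the shortcomings of standard NMF and yields better experimental results. Alpha-NMF is run multiple times, the run with the best classification accuracy and stability is selected, a threshold is set on its metagenes, and the informative genes whose values exceed the threshold are retained. Finally, gene functional classification and biological function structure diagrams are used to verify the usefulness and reliability of the extracted specific genes.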

5.
Iterative normalization of cDNA microarray data   (total citations: 4; self-citations: 0; citations by others: 4)
This paper describes an approach to normalizing microarray expression data. The novel feature is to unify the tasks of estimating normalization coefficients and identifying the control gene set. Unification is realized by constructing a window function over the scatter plot that defines the subset of constantly expressed genes and by effecting optimization using an iterative procedure. The structure of the window function gates contributions to the control gene set used to estimate normalization coefficients. This window measures the consistency of the matched neighborhoods in the scatter plot and provides a means of rejecting control gene outliers. The recovery of the normalization regression and control gene selection are interleaved and are realized by applying coupled operations to the mean square error function. In this way, the two processes bootstrap one another. We evaluate the technique on real microarray data from breast cancer cell lines and complement the experiment with a data cluster visualization study.
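The interleaving of coefficient estimation and control gene selection can be sketched in a much simplified form. The Python below is not the authors' algorithm: it assumes log-intensities in two channels, a single offset as the only normalization coefficient, and a hard window on the residual in place of the paper's window function. The name `iterative_offset` and the `window` parameter are hypothetical.

```python
from statistics import mean


def iterative_offset(x, y, window=0.5, max_iter=50):
    """Alternate between estimating an offset and re-selecting the control gene set."""
    diffs = [yi - xi for xi, yi in zip(x, y)]
    control = list(range(len(diffs)))  # start with every gene as a control
    for _ in range(max_iter):
        offset = mean(diffs[i] for i in control)
        # Keep only genes whose residual falls inside the window around the offset.
        new_control = [i for i, d in enumerate(diffs) if abs(d - offset) <= window]
        if not new_control or new_control == control:
            break  # control set is empty or stable: stop iterating
        control = new_control
    return mean(diffs[i] for i in control), control


# Hypothetical log-intensities: the last gene is differentially expressed.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.5, 2.5, 3.5, 4.5, 5.5, 9.0]
offset, control = iterative_offset(x, y)
print(offset, control)
```

On this toy input the first pass is biased by the outlying gene, the window then rejects it, and the second pass settles on an offset estimated from the remaining genes only.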

6.
Graphical representation may provide effective means of making sense of the complexity and sheer volume of data produced by DNA microarray experiments that monitor the expression patterns of thousands of genes simultaneously. The ability to use "abstract" graphical representation to draw attention to areas of interest, and more in-depth visualizations to answer focused questions, would enable biologists to move from a large amount of data to particular records they are interested in, and therefore, gain deeper insights in understanding the microarray experiment results. This paper starts by providing some background knowledge of microarray experiments, and then, explains how graphical representation can be applied in general to this problem domain, followed by exploring the role of visualization in gene expression data analysis. Having set the problem scene, the paper then examines various multivariate data visualization techniques that have been applied to microarray data analysis. These techniques are critically reviewed so that the strengths and weaknesses of each technique can be tabulated. Finally, several key problem areas as well as possible solutions to them are discussed as being a source for future work.

7.
An evolutionary approach for gene expression patterns   (total citations: 1; self-citations: 0; citations by others: 1)
This study presents an evolutionary algorithm, called a heterogeneous selection genetic algorithm (HeSGA), for analyzing the patterns of gene expression on microarray data. Microarray technologies have provided the means to monitor the expression levels of a large number of genes simultaneously. Gene clustering and gene ordering are important in analyzing a large body of microarray expression data. The proposed method simultaneously solves gene clustering and gene-ordering problems by integrating global and local search mechanisms. Clustering and ordering information is used to identify functionally related genes and to infer genetic networks from immense microarray expression data. HeSGA was tested on eight test microarray datasets, ranging in size from 147 to 6221 genes. The experimental clustering and visual results indicate that HeSGA not only ordered genes smoothly but also grouped genes with similar gene expressions. Visualized results and a new scoring function that references predefined functional categories were employed to confirm the biological interpretations of results yielded using HeSGA and other methods. These results indicate that HeSGA has potential in analyzing gene expression patterns.

8.
In recent years, microarray technology has been widely applied in biological and clinical studies for simultaneous monitoring of the expression of thousands of genes. Gene clustering analysis is useful for discovering groups of correlated genes that are potentially co-regulated or associated with the disease or conditions under investigation. Many clustering methods, including k-means, fuzzy c-means, and hierarchical clustering, have been widely used in the literature. Yet no comprehensive comparative study has been performed to evaluate the effectiveness of these methods, especially in the yeast Saccharomyces cerevisiae. In this paper, these three gene clustering methods are compared. Classification accuracy and CPU time cost are used to measure the performance of the algorithms. Our results show that hierarchical clustering outperforms k-means and fuzzy c-means clustering. The analysis provides deep insight into the complicated gene clustering problem for expression profiles and serves as a practical guideline for routine microarray cluster analysis of gene expression.

9.
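Building high-accuracy classification models is one of the main research directions in gene expression profile analysis, but classification performance varies widely depending on which feature space is extracted, and ensemble classification systems improve the reliability and stability of classification results to some extent. An ensemble component system based on PCA and NMF is constructed, and an ensemble independent component selection system is built by analyzing the biological significance of the Hinton diagram of the mixing matrix A. The system is successfully applied to gene expression profile analysis, and experimental results show that the ensemble component classification system outperforms a single classifier.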

10.
Cluster analysis of gene expression data from a cDNA microarray is useful for identifying biologically relevant groups of genes. However, finding the natural clusters in the data and estimating the correct number of clusters are still two largely unsolved problems. In this paper, we propose a new clustering framework that is able to address both these problems. By using the one-prototype-take-one-cluster (OPTOC) competitive learning paradigm, the proposed algorithm can find natural clusters in the input data, and the clustering solution is not sensitive to initialization. In order to estimate the number of distinct clusters in the data, we propose a cluster splitting and merging strategy. We have applied the new algorithm to simulated gene expression data for which the correct distribution of genes over clusters is known a priori. The results show that the proposed algorithm can find natural clusters and give the correct number of clusters. The algorithm has also been tested on real gene expression changes during the yeast cell cycle, for which the fundamental patterns of gene expression and assignment of genes to clusters are well understood from numerous previous studies. Comparative studies with several clustering algorithms illustrate the effectiveness of our method.

11.
Constructing a classifier based on microarray gene expression data has recently emerged as an important problem for cancer classification. Recent results have suggested the feasibility of constructing such a classifier with reasonable predictive accuracy under the circumstance where only a small number of cancer tissue samples of known type are available. Difficulty arises from the fact that each sample contains the expression data of a vast number of genes and these genes may interact with one another. Selection of a small number of critical genes is fundamental to correctly analyze the otherwise overwhelming data. It is essential to use a multivariate approach for capturing the correlated structure in the data. However, the curse of dimensionality leads to the concern about the reliability of selected genes. Here, we present a new gene selection method in which error and repeatability of selected genes are assessed within the context of M-fold cross-validation. In particular, we show that the method is able to identify source variables underlying data generation.

12.
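To address the problem of partitioning the measurement set when different extended targets produce measurements of widely differing density, a measurement set partitioning algorithm based on shared nearest neighbor (SNN) similarity is proposed for the extended-target probability hypothesis density (PHD) filter. The SNN similarity between measurements reflects their local distribution in the measurement space and takes into account the information from surrounding measurements. The proposed SNN similarity partitioning method can therefore partition measurement sets with widely differing densities well, which improves extended-target tracking performance, and the computational load of the PHD filter based on the proposed partitioning algorithm is also reduced.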
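Why SNN similarity copes with differing measurement densities can be shown with a small Python sketch. It uses the basic definition only: the similarity of two points is the number of neighbors their k-nearest-neighbor lists share. This is an assumption for illustration and may differ from the exact variant and partitioning rule used in the paper. The function names and the toy measurements are hypothetical.

```python
from math import dist  # Python 3.8+


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbors (itself excluded)."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def snn_similarity(points, k):
    """Matrix whose (i, j) entry counts the neighbors shared by points i and j."""
    nbrs = knn_sets(points, k)
    n = len(points)
    return [[len(nbrs[i] & nbrs[j]) for j in range(n)] for i in range(n)]


# Hypothetical measurements: a dense group near the origin and a sparse group far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (7.0, 5.0), (5.0, 7.0), (7.0, 7.0)]
sim = snn_similarity(points, k=3)
print(sim[0][1], sim[4][5], sim[0][4])
```

Points 0 and 1 are 0.1 apart and points 4 and 5 are 2.0 apart, yet both pairs share the same number of neighbors, while a pair drawn from different groups shares none. A single distance threshold could not treat the dense and sparse groups alike.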

13.
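cDNA microarray expression data are widely used in biomedical research, yet processing them computationally still poses many challenging problems. This paper proposes a new supervised normalization method, based on invariant genes, for multi-class microarray cDNA expression data sets. Besides achieving normalization, the method can also be used directly for feature selection on gene expression data, and experiments show that it performs well.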

14.
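Cluster analysis is one of the main techniques in gene expression data analysis. Its basic premise is to divide objects into classes according to the similarity between them, so choosing an appropriate similarity measure is the key to obtaining effective clustering results. Preprocessed gene data sets are clustered with different clustering algorithms under different similarity measures, and the clustering results are evaluated. Both the defects of the algorithms themselves and the limitations of distance-based similarity measures affect the evaluation. To obtain more effective clustering results, the relevant clustering algorithms are improved and a proportional similarity measure is proposed.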
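How the choice of similarity measure changes which genes look alike can be seen in a short Python sketch comparing Euclidean distance with Pearson correlation. It does not implement the proportional similarity measure proposed in the paper, whose definition the abstract does not give. The expression profiles are hypothetical.

```python
from math import dist, sqrt  # dist requires Python 3.8+


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical expression profiles over four conditions.
g1 = [1.0, 2.0, 3.0, 4.0]
g2 = [10.0, 20.0, 30.0, 40.0]  # same shape as g1, ten times the magnitude
g3 = [4.0, 3.0, 2.0, 1.0]      # similar magnitude to g1, opposite trend

print(dist(g1, g2), dist(g1, g3))        # Euclidean: g3 is the closer profile
print(pearson(g1, g2), pearson(g1, g3))  # correlation: g2 is the matching profile
```

Euclidean distance groups g1 with g3 because their magnitudes are close, whereas correlation groups g1 with g2 because their trends match, so the same clustering algorithm can return different clusters under the two measures.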

15.
Wusan Granule, a class Ⅲ new drug, is composed of traditional Chinese medicines such as Radix Polygoni Multiflori Praeparata, Radix Notoginseng and the like. In 2001, the State Drug Administration of China approved a clinical trial of Wusan Granule. Wusan Granule has the advantages of safety and effectiveness. Its main functions are to supplement Qi and nourish Yin, and to dissipate mass. Animal experiments show that the recipe for Wusan Granule can not only suppress Lewis lung carcinom…

16.
To screen for anti-tumor target genes of Wusan Granule using the cDNA microarray technique, mRNA from Lewis lung carcinoma tissues treated with Wusan Granule and from untreated controls is reverse transcribed to prepare cDNA probes, which are labeled with Cy5 and Cy3. The probes are then hybridized to the mouse cDNA microarray type MGEC-20S. After hybridization, the cDNA microarray is scanned with a ScanArray 3000 scanner and the data are analyzed with ImaGene 3 software to screen for differentially expressed genes. There are 45 differentially expressed genes between the two groups, including 18 known genes and 27 unknown genes; among them, 20 elevated genes and 25 reduced genes are identified. Additionally, the genes related to invasion and metastasis of malignant carcinomas are down-regulated and the genes related to apoptosis are up-regulated. The cDNA microarray technique is a high-throughput approach to screening the anti-tumor target genes of Wusan Granule, which allows us to explore the molecular biological mechanism on a genomic scale.

17.
18.
The core diameters of six graded-index fibers from four different fiber manufacturers were compared using the transmitted near-field (TNF), the refracted near-field (RNF), and the transverse-interferometric (TI) measurement methods. This study was part of an effort to develop a standardized, industry-wide definition of core diameter and to determine the precision of interlaboratory core-diameter measurements using different measurement techniques. For fibers with smooth index-of-refraction profiles, all three methods were in good agreement (< 1.0-μm difference). Substantial differences between the transmitted near field and the two profiling methods (RNF and TI) were observed for fibers having step structure near the core-cladding boundary. In an attempt to resolve these differences, splice-loss measurements were used as an indicator of diameter differences. These experiments suggested that curve-fitting routines should be applied to the two profiling methods. A comparison of the curve-fitted profile data with measured transmitted near-field data at points 2 percent above the baseline produced values for the diameters which agreed to within 1 μm for all of the fibers measured.

19.
Many existing clustering algorithms have been used to identify coexpressed genes in gene expression data. These algorithms are used mainly to partition data in the sense that each gene is allowed to belong only to one cluster. Since proteins typically interact with different groups of proteins in order to serve different biological roles, the genes that produce these proteins are expected to coexpress with more than one group of genes. In other words, some genes are expected to belong to more than one cluster. This poses a challenge to gene expression data clustering, as overlapping clusters need to be discovered in a noisy environment. For this task, we propose an effective information-theoretic approach, which consists of an initial clustering phase and a second reclustering phase. The proposed approach has been tested with both simulated and real expression data. Experimental results show that it can improve the performance of existing clustering algorithms and is able to effectively uncover interesting patterns in noisy gene expression data so that, based on these patterns, overlapping clusters can be discovered.

20.