Similar Literature
 Found 20 similar documents; search time: 662 ms
1.
Routine Discovery of Complex Genetic Models using Genetic Algorithms   Total citations: 1 (self-citations: 0, citations by others: 1)
Simulation studies are useful in various disciplines for a number of reasons including the development and evaluation of new computational and statistical methods. This is particularly true in human genetics and genetic epidemiology where new analytical methods are needed for the detection and characterization of disease susceptibility genes whose effects are complex, nonlinear, and partially or solely dependent on the effects of other genes (i.e., epistasis or gene-gene interaction). Despite this need, the development of complex genetic models that can be used to simulate data is not always intuitive. In fact, only a few such models have been published. We have previously developed a genetic algorithm approach to discovering complex genetic models in which two single nucleotide polymorphisms (SNPs) influence disease risk solely through nonlinear interactions. In this paper, we extend this approach for the discovery of high-order epistasis models involving three to five SNPs. We demonstrate that the genetic algorithm is capable of routinely discovering interesting high-order epistasis models in which each SNP influences risk of disease only through interactions with the other SNPs in the model. This study opens the door for routine simulation of complex gene-gene interactions among SNPs for the development and evaluation of new statistical and computational approaches for identifying common, complex multifactorial disease susceptibility genes.
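
The defining property of the models described above (each SNP affects risk only through interaction) can be checked numerically. The following is a minimal sketch, not the authors' genetic algorithm: the penetrance table, the allele frequencies, and all function names are illustrative assumptions. It tests whether a two-locus penetrance table has flat marginal penetrances under Hardy-Weinberg genotype frequencies.

```python
# Hypothetical two-locus penetrance table (illustrative values only).
# Rows: genotype at SNP A (AA, Aa, aa); columns: genotype at SNP B (BB, Bb, bb).
PENETRANCE = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def genotype_freqs(p):
    """Hardy-Weinberg genotype frequencies for allele frequency p."""
    q = 1.0 - p
    return [p * p, 2 * p * q, q * q]


def marginal_penetrances(table, p_a, p_b):
    """Average disease risk for each genotype at SNP A and at SNP B."""
    fa, fb = genotype_freqs(p_a), genotype_freqs(p_b)
    marg_a = [sum(table[i][j] * fb[j] for j in range(3)) for i in range(3)]
    marg_b = [sum(table[i][j] * fa[i] for i in range(3)) for j in range(3)]
    return marg_a, marg_b


def is_purely_epistatic(table, p_a, p_b, tol=1e-9):
    """True if neither SNP shows a marginal (single-locus) effect."""
    marg_a, marg_b = marginal_penetrances(table, p_a, p_b)
    flat_a = max(marg_a) - min(marg_a) < tol
    flat_b = max(marg_b) - min(marg_b) < tol
    return flat_a and flat_b
```

With both allele frequencies at 0.5 the table above has no marginal effects; changing one allele frequency breaks the property, which is one reason such models are hard to construct by hand.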

2.
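Application of Artificial Neural Networks to Single Nucleotide Polymorphism (SNP) Detection   Total citations: 3 (self-citations: 3, citations by others: 0)
The international human single nucleotide polymorphism (SNP) research program and the Chinese national SNP project have been under way for more than two years; because the research is extremely costly, both projects have progressed slowly. At present, human SNPs are identified mainly through biological experiments, and predicting them with computers and mathematical algorithms remains a difficult problem in bioinformatics. Based on an analysis of a large body of literature, an artificial neural network algorithm was tried for predicting human SNPs; with an appropriately constructed mathematical model, the prediction accuracy reached 73%. Consulting the model's predictions during biological experiments can save substantial research resources.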

3.
In order to take into account the complex genomic distribution of SNP variations when identifying chromosomal regions with significant SNP effects, a single nucleotide polymorphism (SNP) association scan statistic was developed. To address the computational needs of genome-wide association (GWA) studies, a fast Java application, which combines single-locus SNP tests and a scan statistic for identifying chromosomal regions with significant clusters of SNP effects, was developed and implemented. To illustrate this application, SNP associations were analyzed in a pharmacogenomic study of the blood pressure lowering effect of thiazide-diuretics (N=195) using the Affymetrix Human Mapping 100K Set. 55,335 tagSNPs (pair-wise linkage disequilibrium R² < 0.5) were selected to reduce the frequency correlation between SNPs. A typical workstation can complete the whole genome scan including 10,000 permutation tests within 3 h. The most significant regions are located on chromosomes 3, 6, 13 and 16, two of which contain candidate genes that may be involved in the underlying drug response mechanism. The computational performance of ChromoScan-GWA and its scalability were tested with up to 1,000,000 SNPs and up to 4000 subjects. Using 10,000 permutations, the computation time grew linearly in these datasets. This scan statistic application provides a robust statistical and computational foundation for identifying genomic regions associated with disease and provides a method to compare GWA results even across different platforms.

4.
TagSNP selection, which aims to select a small subset of informative single nucleotide polymorphisms (SNPs) to represent the whole large SNP set, has played an important role in current genomic research. It not only cuts the cost of genotyping by filtering out a large number of redundant SNPs but also accelerates genome-wide disease association studies. In this paper, we propose a new hybrid method called CMDStagger that combines ideas from clustering and graph algorithms to find the minimum set of tagSNPs. The proposed algorithm uses linkage disequilibrium association and haplotype diversity to reduce the information loss in tagSNP selection, and it is not restricted by block partitioning. The approach is tested on eight benchmark datasets from HapMap and chromosome 5q31. Experimental results show that the algorithm reduces selection time and obtains fewer tagSNPs with high prediction accuracy. These results indicate that the method outperforms previous ones.
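
To make the tagSNP idea concrete, here is a minimal sketch of a plain greedy, LD-threshold tagger. It is explicitly not CMDStagger (the abstract does not give that algorithm's details); it swaps in a simple set-cover heuristic, uses the squared Pearson correlation of 0/1/2 genotype codes as a stand-in for pairwise r², and the 0.8 threshold and all names are assumptions.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (coded 0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (vx * vy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Pick tag SNPs so every SNP is a tag or has r^2 >= threshold with one.

    genotypes: list with one genotype vector per SNP.
    Returns the sorted indices of the selected tag SNPs.
    """
    m = len(genotypes)
    # covers[i] = set of SNPs that SNP i can represent (always includes i itself)
    covers = [
        {j for j in range(m)
         if j == i or r_squared(genotypes[i], genotypes[j]) >= threshold}
        for i in range(m)
    ]
    untagged = set(range(m))
    tags = []
    while untagged:
        # choose the SNP covering the most untagged SNPs; ties go to the lowest index
        best = max(range(m), key=lambda i: (len(covers[i] & untagged), -i))
        tags.append(best)
        untagged -= covers[best]
    return sorted(tags)
```

For example, with three perfectly correlated SNPs and one independent SNP, the sketch keeps two tags: one for the correlated group and the independent SNP itself.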

5.
Multiple sclerosis affects more than 2.5 million people worldwide. Although multiple sclerosis was described almost 150 years ago, there are many knowledge gaps regarding its etiology, diagnosis, prognosis, and pathogenesis. Multiple sclerosis is an inflammatory, demyelinating, neurodegenerative disease of the CNS. During the last several decades, experimental models of multiple sclerosis have contributed to our understanding of the inflammatory disease mechanisms and have aided drug testing and development. However, little is known about the neurodegenerative mechanisms that operate during the evolution of the disease. Currently, all therapeutic approaches are primarily based on the inflammatory aspect of the disease. During the last decade, proteomics has emerged as a promising tool for revealing molecular pathways as well as identifying and quantifying differentially expressed proteins. Therefore, proteomics may be used for the discovery of biomarkers, potential drug targets, and new regulatory mechanisms. To date, a considerable number of proteomics studies have been conducted on samples from experimental models and patients with multiple sclerosis. These data form a solid base for further careful analysis and validation.

6.
Interaction detection in large-scale genetic association studies has attracted intensive research interest, since many diseases have complex traits. Various approaches have been developed for finding significant genetic interactions. In this article, we propose a novel framework, SRMiner, to detect interacting susceptible and protective genotype patterns. SRMiner can discover not only probable combinations of single nucleotide polymorphisms (SNPs) causing diseases but also the corresponding SNPs suppressing their pathogenic functions, which provides a better perspective for uncovering the underlying relevance between genetic variants and complex diseases. We have performed extensive experiments on several real Wellcome Trust Case Control Consortium (WTCCC) datasets. We use the pathway-based and the protein-protein interaction (PPI) network-based evaluation methods to verify the discovered patterns. The results show that SRMiner successfully identifies many disease-related genes verified by existing work. Furthermore, SRMiner can also infer some unconfirmed but highly plausible disease-related genes.

7.
Multiple sclerosis is characterized by inflammatory demyelination and axonal loss as pathophysiological correlates of relapsing activity and progressive development of clinical disability. The molecular processes involved in this pathogenesis are still unclear as they are quite complex and heterogeneous. In this article we present protein expression analysis of brain and spinal cord tissues from different models of murine experimental autoimmune encephalomyelitis (EAE), the most commonly used animal model for multiple sclerosis. We observed a number of EAE-specific protein expression and PTM differences. Proteome analysis was extended to multiple sclerosis specimens in order to validate the EAE findings. Our findings suggest the regulation of a number of proteins that shed light on the molecular mechanisms of the disease processes taking place in EAE and multiple sclerosis. We found consistent modulation of proteins including serum amyloid P component, sirtuin 2, dihydropyrimidinase-related protein family proteins, stathmin 1, creatine kinase B and chloride intracellular channel protein 1. Functional classification of the proteins by database and literature mining reveals their association with neuronal development and myelinogenesis, suggesting possible disease processes that mediate neurodegeneration.

8.
This paper presents different artificial intelligence (AI) techniques for crack identification in curvilinear beams based on changes in vibration characteristics. Vibration analysis has been performed by applying the finite element method (FEM) to compute natural frequencies and frequency response functions (FRFs) for intact and damaged beams. The analysis reveals the changes in natural frequencies and amplitudes of FRFs of the beams for cracks of different sizes at different locations. These changes are used as input data for single and multiple artificial neural networks (ANN) and multiple adaptive neuro-fuzzy inference systems (ANFIS) in order to predict the size of the crack and its location. To avoid large models, principal component analysis (PCA) has been used to reduce the computed FRF data. The analysis of different techniques shows that the average prediction errors in the multiple ANN models are lower than those in the single ANN model and in the multiple ANFIS. It is shown that cracks longer than 5 mm can be located with satisfactory accuracy, even if the input data are corrupted with various levels of noise. Multiple ANFIS is adopted to construct a more reliable and less sensitive model for noise excitation.

9.
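Zhao Jing, Wei Bin. Computer Engineering & Science (《计算机工程与科学》), 2016, 38(11): 2328-2334
Studying the association between complex diseases and SNPs is one of the most important tasks in bioinformatics, but the high cost of genotyping limits its development and application. It is therefore necessary to select a subset of representative SNPs for study (the tag SNP selection problem) and thereby reduce research costs. Several algorithms have been proposed for this problem in recent years, but most still fall short of practical requirements in prediction accuracy and in the number of tags selected. To address this, a forward matrix method is designed for tag SNP selection, and an improved PSO algorithm is used to predict non-tag SNPs. Experiments on a large number of data sets show that, compared with commonly used methods, the algorithm selects fewer tags while achieving higher prediction accuracy; that is, it clearly improves performance and is better suited to the tag SNP selection problem.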

10.
Our rapidly growing knowledge regarding genetic variation in the human genome offers great potential for understanding the genetic etiology of disease. This, in turn, could revolutionize detection, treatment, and in some cases prevention of disease. While genes for most of the rare monogenic diseases have already been discovered, most common diseases are complex traits, resulting from multiple gene-gene and gene-environment interactions. Detecting epistatic genetic interactions that predispose for disease is an important, but computationally daunting, task currently facing bioinformaticists. Here, we propose a new evolutionary approach that attempts to hill-climb from large sets of candidate epistatic genetic features to smaller sets, inspired by Kauffman’s “random chemistry” approach to detecting small auto-catalytic sets of molecules from within large sets. Although the algorithm is conceptually straightforward, its success hinges upon the creation of a fitness function able to discriminate large sets that contain subsets of interacting genetic features from those that don’t. Here, we employ an approximate and noisy fitness function based on the ReliefF data mining algorithm. We establish proof-of-concept using synthetic data sets, where individual features have no marginal effects. We show that the resulting algorithm can successfully detect epistatic pairs from up to 1,000 candidate single nucleotide polymorphisms in time that is linear in the size of the initial set, although success rate degrades as heritability declines. Research continues into seeking a more accurate fitness approximator for large sets and other algorithmic improvements that will enable us to extend the approach to larger data sets and to lower heritabilities.

11.
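Liu Hai, Wu Zhenqiang, Peng Changgen, Lei Xiujuan. Journal of Software (《软件学报》), 2019, 30(4): 1094-1105
With the rapid development of human gene sequencing technology and the sharp drop in sequencing costs, genetic data have become widely used. In genome-wide association studies of single nucleotide polymorphisms (SNPs) and disease, SNPs are linked to sensitive information such as a patient's identity, phenotype, and kinship, and linkage disequilibrium among SNPs can easily leak patients' private information. To address this, a matrix differential privacy model based on the SNP linkage disequilibrium correlation coefficient is proposed to protect the privacy of both the genetic data and the SNP linkage disequilibrium while ensuring that the genetic data retain a certain utility. The model achieves a trade-off between the privacy and utility of genetic data in genome-wide association studies under SNP linkage disequilibrium, and it helps advance genetic privacy protection under SNP linkage disequilibrium.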
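
The privacy/utility trade-off mentioned above can be illustrated with the standard Laplace mechanism for a counting query. This is a generic sketch of differential privacy, not the matrix model proposed in the paper; the query (a minor-allele count), the sensitivity of 1, and all names are assumptions for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    Smaller epsilon means stronger privacy and noisier (less useful) output.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A call such as `private_count(100, epsilon=0.5)` returns the true count plus noise of scale 2; averaging many independent releases would recover the true value, which is why the total privacy budget has to be tracked across queries.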

12.
This paper proposes a new approach for solving the problem of obstacle avoidance during manipulation tasks performed by redundant manipulators. The developed solution is based on a double neural network that uses the Q-learning reinforcement technique. Q-learning has been applied in robotics for attaining obstacle-free navigation or computing path planning problems. Most studies solve inverse kinematics and obstacle avoidance problems using variations of the classical Jacobian matrix approach, or by minimizing redundancy resolution of manipulators operating in known environments. Researchers who tried to use neural networks for solving inverse kinematics often dealt with only one obstacle present in the working field. This paper focuses on calculating inverse kinematics and obstacle avoidance for complex unknown environments, with multiple obstacles in the working field. Q-learning is used together with neural networks in order to plan and execute arm movements at each time instant. The algorithm developed for general redundant kinematic link chains has been tested on the particular case of the PowerCube manipulator. Before implementing the solution on the real robot, the simulation was integrated in an immersive virtual environment for better movement analysis and safer testing. The study results show that the proposed approach has a good average speed and a satisfactory target-reaching success rate.
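
For readers unfamiliar with Q-learning, the core update can be sketched in tabular form. The paper pairs Q-learning with neural networks; the table-based version below is a deliberate simplification, and the state/action labels, learning rate, and discount factor are assumptions for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: unseen state-action pairs default to a value of 0.0.
q_table = defaultdict(float)
```

In an obstacle-avoidance setting the reward would typically penalize collisions and reward progress toward the target; a function approximator such as a neural network replaces the table when the state space is too large to enumerate.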

13.
Multiple sclerosis is an inflammatory-mediated demyelinating disorder most prevalent in young Caucasian adults. The various clinical manifestations of the disease present several challenges in the clinic in terms of diagnosis, monitoring disease progression and response to treatment. Advances in mass spectrometry (MS)-based proteomic technologies have revolutionized the field of biomarker research and paved the way for the identification and validation of disease-specific markers. This review focuses on the novel candidates discovered by the application of quantitative proteomics to relevant disease-affected tissues in both the human context and within the animal model of the disease known as experimental autoimmune encephalomyelitis. The role of targeted MS approaches for biomarker validation studies, such as multiple reaction monitoring, will also be discussed.

14.
The knowledge discovery process is supported by information gathered from collected data sets, which often contain errors in the form of missing values. Data imputation is the activity aimed at estimating values for missing data items. This study focuses on the development of automated data imputation models, based on artificial neural networks, for monotone patterns of missing values. The present work proposes a single imputation approach relying on a multilayer perceptron whose training is conducted with different learning rules, and a multiple imputation approach based on the combination of a multilayer perceptron and k-nearest neighbours. Eighteen real and simulated databases were exposed to a perturbation experiment with random generation of monotone missing data patterns. An empirical test was carried out on these data sets, including both approaches (single and multiple imputation); three classical single imputation procedures (mean/mode imputation, regression and hot-deck) were also considered. The experiments therefore involved five imputation methods. The results, considering different performance measures, demonstrated that, in comparison with traditional tools, both proposals improve the automation level and data quality while offering satisfactory performance.

15.
In a study that examined associations between single nucleotide polymorphism (SNP) loci within the promoter of the PRTN3 gene and the autoimmune disease Wegener’s granulomatosis (WG), we implemented a self-administered pilot survey that captured participants’ demographic data, family relationships, incidence of autoimmune disease among family members, and attitudes about DNA collection. We next integrated the survey and genotype data to test associations between genotype and phenotype, to examine demographic characteristics of WG patients and their families, and to examine the robustness of the data collection approach.

16.
This paper presents a new algorithm to produce a near optimal mixture of experts model (MEM) architecture for a continuous mapping. The MEM is applied to a new method incorporating photon scatter for designing compensators for intensity modulated radiation therapy. The algorithm utilizes the fuzzy C-means clustering algorithm to partition data before training commences. A reduction in the size of training sets also allows the Levenberg-Marquardt algorithm to be implemented. As a result, both training time and validation error are reduced. A 71% reduction in prediction error compared with that of a single neural network is achieved.

17.
Artificial neural networks (ANNs) are used extensively to model unknown or unspecified functional relationships between the input and output of a “black box” system. In order to apply the generic ANN concept to actual system model fitting problems, a key requirement is the training of the chosen (postulated) ANN structure. Such training serves to select the ANN parameters in order to minimize the discrepancy between modeled system output and the training set of observations. We consider the parameterization of ANNs as a potentially multi-modal optimization problem, and then introduce a corresponding global optimization (GO) framework. The practical viability of the GO based ANN training approach is illustrated by finding close numerical approximations of one-dimensional, yet visibly challenging functions. For this purpose, we have implemented a flexible ANN framework and an easily expandable set of test functions in the technical computing system Mathematica. The MathOptimizer Professional global-local optimization software has been used to solve the induced (multi-dimensional) ANN calibration problems.

18.
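As a simple and robust design method, relay-feedback-based PID control has been widely applied in industrial process control. The controller is designed from critical process information that is approximately estimated from the oscillation induced by relay feedback. Multi-model control is an effective way to handle complex problems such as time variation, nonlinearity, and parameter uncertainty. This paper combines relay feedback control with multi-model control to control the time-varying, nonlinear main steam temperature system of a power plant. Sub-controllers are first designed at each operating point using the relay feedback method. Multi-model adaptive control is then carried out over the system's entire operating range to overcome the effects of nonlinearity and time variation. Simulations show that the resulting control system has good control quality and strong adaptive capability.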
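
The relay feedback step described above can be sketched as follows. The describing-function estimate of the ultimate gain, K_u = 4d / (πa), is the standard relay autotuning result; the abstract does not state which tuning rule the authors apply afterwards, so the classic Ziegler-Nichols PID rule used here is an assumption, as are the function names.

```python
import math


def ultimate_gain(relay_amplitude, oscillation_amplitude):
    """Describing-function estimate of the ultimate gain: Ku = 4d / (pi * a).

    d is the relay output amplitude and a is the amplitude of the
    sustained oscillation observed in the process output.
    """
    return 4.0 * relay_amplitude / (math.pi * oscillation_amplitude)


def ziegler_nichols_pid(ku, tu):
    """Classic Ziegler-Nichols PID gains from ultimate gain Ku and period Tu.

    Assumed tuning rule (not confirmed by the source):
    Kp = 0.6 Ku, Ti = Tu / 2, Td = Tu / 8.
    """
    kp = 0.6 * ku
    ti = tu / 2.0
    td = tu / 8.0
    return {"kp": kp, "ki": kp / ti, "kd": kp * td}
```

In a multi-model scheme of the kind the abstract describes, one such set of gains would be computed per operating point, with a supervisory layer choosing or blending the sub-controllers as the operating condition changes.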

19.
This paper describes the development of a prototype neural network model for the free-ranging AGV route-planning problem. The vehicle planner operates in quasi-real time. A small planning horizon is set and all transport requests existing at the beginning of a planning horizon are examined. A neural network model is proposed to perform dispatching and routing tasks for the AGVs. Its goal is to satisfy the transport requests in the shortest time and in a non-conflicting manner, subject to the global manufacturing objective of maximizing throughput. Based on Kohonen's self-organizing feature maps, we develop three efficient planning algorithms for the single and multiple AGV problems. The simulation results indicate that the proposed neural network approach gives very efficient solutions.

20.