Similar Articles
20 similar articles found (search time: 31 ms)
1.
Recent progress in medical sciences has led to an explosive growth of data. Due to their inherent complexity and diversity, mining such volumes of data to extract relevant knowledge represents an enormous challenge and opportunity. Interactive pattern discovery and visualization systems for biomedical data mining have received relatively little attention; emphasis has traditionally been placed on automation and supervised classification problems. Based on self-adaptive neural networks and pattern-validation statistical tools, this paper presents a user-friendly platform to support biomedical pattern discovery and visualization. It has been tested on several types of biomedical data, such as dermatology and cardiology data sets. The results indicate that, in comparison to traditional techniques such as Kohonen Maps, this platform may significantly improve the effectiveness and efficiency of pattern discovery and classification tasks, including problems described by several classes. Furthermore, this study shows how the combination of graphical and statistical tools may make these patterns more meaningful.
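For context on the Kohonen Maps used above as the baseline technique, the sketch below is a minimal 1-D self-organizing map. It is a generic illustration, not the paper's platform; all function names and parameter values are hypothetical.

```python
import numpy as np

def train_som(data, n_units=4, epochs=30, lr0=0.5, sigma0=1.0, seed=0):
    """Train a minimal 1-D self-organizing (Kohonen) map."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(n_units, data.shape[1]))
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                   # decaying learning rate
        sigma = max(sigma0 * (1 - t / epochs), 0.1)   # shrinking neighborhood
        for x in rng.permutation(data):
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            grid_dist = np.abs(np.arange(n_units) - bmu)
            h = np.exp(-grid_dist**2 / (2 * sigma**2))            # neighborhood kernel
            weights += lr * h[:, None] * (x - weights)
    return weights

def quantization_error(data, weights):
    """Mean distance from each sample to its nearest map unit."""
    return np.mean([np.min(np.linalg.norm(weights - x, axis=1)) for x in data])
```

Training pulls the map units toward the data clusters, so the quantization error drops relative to the random initialization.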

2.
Graphical representation may provide effective means of making sense of the complexity and sheer volume of data produced by DNA microarray experiments that monitor the expression patterns of thousands of genes simultaneously. The ability to use "abstract" graphical representation to draw attention to areas of interest, and more in-depth visualizations to answer focused questions, would enable biologists to move from a large amount of data to the particular records they are interested in and, therefore, gain deeper insight into the microarray experiment results. This paper starts by providing some background knowledge of microarray experiments, then explains how graphical representation can be applied in general to this problem domain, followed by exploring the role of visualization in gene expression data analysis. Having set the problem scene, the paper then examines various multivariate data visualization techniques that have been applied to microarray data analysis. These techniques are critically reviewed so that the strengths and weaknesses of each can be tabulated. Finally, several key problem areas, as well as possible solutions to them, are discussed as a source for future work.

3.
Serial Analysis of Gene Expression (SAGE) is a powerful tool for analyzing whole-genome expression profiles. SAGE data, characterized by large volume and high dimensionality, require dimensionality reduction and feature extraction to improve accuracy and efficiency in pattern recognition and clustering analysis. A Poisson Model-based Kernel (PMK) is proposed, based on the Poisson distribution of SAGE data. Kernel Principal Component Analysis (KPCA) with the PMK is then applied to feature extraction on mouse retinal SAGE data. Computational results show that the algorithm extracts features effectively and reduces the dimensionality of SAGE data.
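The abstract does not give the PMK formula, so the sketch below substitutes a plausible Poisson-motivated kernel — an RBF on square-root-transformed counts, the classical variance-stabilizing transform for Poisson data — and pairs it with a standard kernel PCA. Treat both the kernel choice and all names as assumptions, not the authors' method.

```python
import numpy as np

def poisson_motivated_kernel(X, gamma=0.1):
    # Placeholder kernel: RBF on sqrt-transformed counts. sqrt() is the
    # classical variance stabilizer for Poisson data; the paper's actual
    # PMK formula is not given in the abstract.
    S = np.sqrt(X)
    sq = np.sum(S**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * S @ S.T
    return np.exp(-gamma * d2)

def kernel_pca(K, n_components=2):
    """Standard kernel PCA on a precomputed kernel matrix."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                    # double-center the kernel
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]       # top eigenpairs
    vals, vecs = vals[idx], vecs[:, idx]
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))  # normalize coefficients
    return Kc @ alphas                                # projected coordinates
```

On synthetic counts drawn from two Poisson regimes, the first kernel principal component separates the two groups.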

4.
Space-alternating generalized expectation-maximization algorithm
The expectation-maximization (EM) method can facilitate maximizing likelihood functions that arise in statistical estimation problems. In the classical EM paradigm, one iteratively maximizes the conditional log-likelihood of a single unobservable complete data space, rather than maximizing the intractable likelihood function for the measured or incomplete data. EM algorithms update all parameters simultaneously, which has two drawbacks: 1) slow convergence, and 2) difficult maximization steps due to coupling when smoothness penalties are used. The paper describes the space-alternating generalized EM (SAGE) method, which updates the parameters sequentially by alternating between several small hidden-data spaces defined by the algorithm designer. The authors prove that the sequence of estimates monotonically increases the penalized-likelihood objective, derive asymptotic convergence rates, and provide sufficient conditions for monotone convergence in norm. Two signal processing applications illustrate the method: estimation of superimposed signals in Gaussian noise, and image reconstruction from Poisson measurements. In both applications, the SAGE algorithms easily accommodate smoothness penalties and converge faster than the EM algorithms.
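The contrast between the classical EM's simultaneous update and SAGE's sequential, coordinate-wise update can be illustrated on a toy two-component Gaussian mixture. This is a hedged sketch of the idea only — it is not the paper's penalized-likelihood setting, and `sage_means` merely mimics the space-alternating flavor by refreshing the E-step between the two coordinate updates.

```python
import numpy as np

def e_step(x, mu, sigma=1.0):
    # Responsibilities for a two-component mixture with equal weights.
    d = np.stack([np.exp(-(x - m)**2 / (2 * sigma**2)) for m in mu])
    return d / d.sum(axis=0)

def em_means(x, mu, iters=50):
    # Classical EM: both means updated simultaneously from one E-step.
    mu = list(mu)
    for _ in range(iters):
        r = e_step(x, mu)
        mu = [(r[k] * x).sum() / r[k].sum() for k in range(2)]
    return mu

def sage_means(x, mu, iters=50):
    # Space-alternating flavor: update one mean at a time, refreshing
    # the E-step between the two coordinate updates.
    mu = list(mu)
    for _ in range(iters):
        for k in range(2):
            r = e_step(x, mu)
            mu[k] = (r[k] * x).sum() / r[k].sum()
    return mu
```

Both variants recover the component means on well-separated data; the paper's contribution is proving monotonicity and faster convergence for the sequential scheme in penalized settings.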

5.
To address the shortcomings of traditional visual association-rule mining methods — poor handling of multi-valued attribute data, inability to display the frequent and association patterns among data, and low efficiency — an improved Apriori algorithm based on KAF and CHF factors is proposed for mining multi-valued attribute association rules, and a new concept-lattice-based visualization method for such rules is implemented. Multi-valued attribute data are redefined and classified using concept lattice theory, and a fairly complete strategy for adjusting mining-process parameters is established, making it convenient for users to select key attribute values for rule mining and improving the algorithm's running speed and mining efficiency. The multi-valued data are organized in a concept-lattice structure, enabling visual display of frequent itemsets and multi-mode visual display of association rules. Experimental results show that the improved mining algorithm performs better and that the proposed visualization compares favorably with existing work.
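The improved algorithm above builds on the classical Apriori procedure; the KAF and CHF factors are not defined in the abstract, so the sketch below shows only the textbook base algorithm, with invented names and data.

```python
from itertools import combinations

def apriori(transactions, min_support=2):
    """Textbook Apriori: grow frequent itemsets level by level, pruning
    any candidate with an infrequent subset before counting support."""
    transactions = [frozenset(t) for t in transactions]
    items = {frozenset([i]) for t in transactions for i in t}
    def support(c):
        return sum(1 for t in transactions if c <= t)
    frequent = {}
    level = {c for c in items if support(c) >= min_support}
    k = 1
    while level:
        frequent.update({c: support(c) for c in level})
        k += 1
        # candidate generation: unions of frequent (k-1)-itemsets of size k
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and support(c) >= min_support}
    return frequent
```

On a five-transaction toy data set, the pair {a, b} is frequent at support 3 while the triple {a, b, c} is pruned.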

6.
Practitioners' decision to discontinue mechanical ventilation is a challenging task that requires complete knowledge of a great number of clinical parameters, as well as their evolution over time. Recently, increasing interest has emerged in respiratory pattern variability as an indicator of extubation readiness. Reliable assessment of this variability involves a set of signal processing and pattern recognition techniques. This paper presents a suitability analysis of different methods used for breathing pattern complexity assessment. The contribution of this analysis is threefold: 1) to serve as a review of the state of the art on the so-called weaning problem from a signal processing point of view; 2) to provide insight into the applied processing techniques and how they fit the problem; and 3) to propose additional methods and further processing to improve breathing pattern regularity assessment and weaning readiness decisions. Results on experimental data show that sample entropy outperforms other complexity assessment methods and that multidimensional classification does improve weaning prediction. However, the obtained performance may be objectionable for real clinical practice, a fact that paves the way for a multimodal signal processing framework including additional high-quality signals and more reliable statistical methods.
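Sample entropy, which the study found to outperform the other complexity measures, can be sketched in a few lines. This is a generic SampEn(m, r) implementation with the usual parameter conventions, not the authors' code.

```python
import math

def sample_entropy(x, m=2, r_frac=0.2):
    """SampEn(m, r): negative log of the conditional probability that
    sequences matching for m points also match for m + 1 points."""
    n = len(x)
    mean = sum(x) / n
    std = (sum((v - mean) ** 2 for v in x) / n) ** 0.5
    r = r_frac * std                      # tolerance as a fraction of std
    def matches(mm):
        count = 0
        for i in range(n - m):            # same template count for both lengths
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(mm)) <= r:
                    count += 1
        return count
    b, a = matches(m), matches(m + 1)
    return -math.log(a / b) if a > 0 and b > 0 else float("inf")
```

A regular (periodic) series scores lower than an irregular one, which is exactly how the measure flags breathing-pattern regularity.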

7.
Research on a knowledge discovery method based on the constructive covering algorithm
张旻, 张铃. 《电子与信息学报》 (Journal of Electronics & Information Technology), 2006, 28(7): 1322-1326
This paper proposes a new knowledge discovery method based on the constructive covering algorithm. Owing to the particular way the covering network is constructed, every covering domain it forms is informative: analyzing the samples within a covering domain can uncover knowledge inherent in the data, and different covering networks can be constructed on demand, yielding analyses of the data from multiple perspectives. Experimental results show that the covering algorithm is effective and feasible for knowledge discovery.

8.
Performance monitoring plays an important role in keeping a grid system running smoothly and in realizing its full efficiency. Focusing on campus grids, this paper analyzes and summarizes their characteristics. It then presents the overall architecture of a performance monitoring and analysis system for campus grids, together with the detailed design and implementation of its modules for grid-resource performance data collection, storage, visual monitoring, node-resource performance forecasting, and mining of regularities in grid behavior.

9.
The above-titled paper by T.-T.Y. Lin and D.P. Siewiorek (see ibid., vol.39, p.419-32, Oct. 1990) used traditional statistical analysis to demonstrate the superiority of the proposed dispersion frame technique. The purpose was to distinguish between transient and intermittent errors and predict the occurrence of intermittent errors. It is shown here that those traditional statistical methods were too traditional, since they involved fitting a distribution to data which were not identically distributed. Appropriate statistical techniques for fitting models to such nonstationary data are briefly discussed, and reasons are proffered for the persistence of too-traditional statistical methods in the reliability literature.

10.
11.
Clinical medicine is facing a challenge of knowledge discovery from the growing volume of data. In this paper, a data mining algorithm (the G-algorithm) is proposed for extraction of robust rules that can be used in clinical practice for better understanding and prevention of unwanted medical events. The G-algorithm is applied to a data set obtained for children born with a malformation of the heart (univentricular heart). As a result of the Fontan surgical procedure, designed to palliate the children, 10%-35% of patients post-operatively develop an arrhythmia known as intra-atrial re-entrant tachycardia. There is an obvious need to identify those children that may develop the tachycardia before the surgery is performed. Prior attempts to identify such children with statistical techniques have been unrewarding. The G-algorithm shows that there exists an unambiguous relationship between measurable features and the tachycardia. The data set used in this study shows that, for 78.08% of infants, the occurrence of tachycardia can be accurately predicted. The authors' prior computational experience with diverse medical data sets indicates that the percentage of accurate predictions may become even higher if data on additional features is collected for a larger data set.

12.
Advances in digital imaging modalities as well as other diagnosis and therapeutic techniques have generated a massive amount of diverse data for clinical research. The purpose of this study is to investigate and implement a new intuitive and space-conscious visualization framework, called DBMap, to facilitate efficient multidimensional data visualization and knowledge discovery against the large-scale data warehouses of integrated image and nonimage data. The DBMap framework is built upon the TreeMap concept. TreeMap is a space-constrained graphical representation of large hierarchical data sets, mapped to a matrix of rectangles, whose size and color represent interested database fields. It allows the display of a large amount of numerical and categorical information in limited real estate of the computer screen with an intuitive user interface. DBMap has been implemented and integrated into a large brain research data warehouse to support neurologic and neuroradiologic research at the University of California, San Francisco Medical Center. For imaging specialists and clinical researchers, this novel DBMap framework facilitates another way to better explore and classify the hidden knowledge embedded in medical image data warehouses.
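TreeMap's space-filling idea is easy to sketch: the classic slice-and-dice layout below maps a nested size hierarchy to rectangles whose areas are proportional to node sizes. DBMap also encodes database fields as color, which is omitted here; the data layout and names are invented for illustration.

```python
def total_size(node):
    """Size of a subtree: leaves are plain numbers, inner nodes are dicts."""
    if isinstance(node, (int, float)):
        return node
    return sum(total_size(c) for c in node.values())

def treemap(node, rect=(0.0, 0.0, 1.0, 1.0), vertical=False, prefix=""):
    """Slice-and-dice layout: returns {leaf_path: (x, y, w, h)} where each
    rectangle's area is proportional to the leaf's size; the split
    direction alternates at each level of the hierarchy."""
    x, y, w, h = rect
    if isinstance(node, (int, float)):
        return {prefix: rect}
    out, offset, total = {}, 0.0, total_size(node)
    for name, child in node.items():
        frac = total_size(child) / total
        if vertical:
            sub = (x, y + offset * h, w, frac * h)
        else:
            sub = (x + offset * w, y, frac * w, h)
        out.update(treemap(child, sub, not vertical, prefix + "/" + name))
        offset += frac
    return out
```

For a toy hierarchy of image and metadata records, leaf areas come out exactly proportional to leaf sizes and tile the unit square.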

13.
The postgenomics era has witnessed a rapid change in biological methods for knowledge elucidation and pharmacological approaches to biomarker discovery. Differential expression of proteins in health and disease holds the key to early diagnosis and accelerated drug discovery. This approach, however, has also brought an explosion of data complexity not mirrored by existing progress in proteome informatics. It has become apparent that the task is greater than can be tackled by individual laboratories alone, and large-scale open collaborations of the new Human Proteome Organization (HUPO) have highlighted major challenges concerning the integration and cross-validation of results across different laboratories. This paper describes the state-of-the-art proteomics workflows (two-dimensional gel electrophoresis, liquid chromatography, and mass spectrometry) and their utilization by the participants of the HUPO initiatives towards comprehensive mapping of the brain, liver, and plasma proteomes. Particular emphasis is given to the limitations of the underlying data analysis techniques for large-scale collaborative proteomics. Emerging paradigms including statistical data normalization, direct image registration, spectral libraries, and high-throughput computation with Web-based bioinformatics services are discussed. It is envisaged that these methods will provide the basis for breaking the bottleneck of large-scale automated proteome mapping and biomarker discovery.
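Of the data-analysis paradigms listed, statistical data normalization is the most self-contained to illustrate. Below is a standard quantile-normalization sketch, a common choice in omics pipelines; the abstract does not specify which normalization the HUPO groups actually used, so treat this as a representative technique.

```python
import numpy as np

def quantile_normalize(X):
    """Quantile normalization: force every column (sample) to share the
    same empirical distribution, namely the mean of the sorted columns."""
    order = np.argsort(X, axis=0)              # per-column ranks
    ref = np.sort(X, axis=0).mean(axis=1)      # reference distribution
    out = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        out[order[:, j], j] = ref              # assign by rank
    return out
```

After normalization, every sample has identical sorted values while the within-sample ordering is preserved, which is what makes cross-laboratory intensity comparisons meaningful.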

14.
In this paper, we address the problem of deriving adequate detection and classification schemes to fully exploit the information available in a sequence of SAR images. In particular, we address the case of detecting a step reflectivity change pattern against a constant pattern. Initially, we propose two different techniques, based on a maximum likelihood approach, that make different use of prior knowledge of the searched pattern. They process the whole sequence to achieve optimal discrimination between regions affected and not affected by a step change. The first technique (KSP-detector) assumes complete knowledge of the pattern of change, while the second (USP-detector) assumes a totally unknown pattern. A fully analytical expression of the detection performance of both techniques is obtained, which shows the large improvement achievable by using longer sequences instead of only two images. Comparing the two techniques also makes apparent that KSP achieves better performance, but the USP-detector is more robust. As a compromise solution, a third technique is then developed, assuming partial knowledge of the pattern of change, and its performance is compared to the previous ones. The practical effectiveness of the technique on real data is shown by applying the USP-detector to a sequence of 10 ERS-1 SAR images of forest and agricultural areas, which is also used to validate the theoretical results.
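The unknown-pattern (USP) idea can be illustrated in one dimension: a generalized likelihood-ratio detector that maximizes over the unknown change time of a step in the mean of a Gaussian sequence. This is a simplified scalar sketch of the principle, not the paper's multitemporal SAR formulation; all names are hypothetical.

```python
import numpy as np

def detect_step(y):
    """GLRT for a single step change in the mean of a Gaussian sequence:
    maximize the likelihood-ratio statistic over candidate change times."""
    n = len(y)
    best_stat, best_t = -np.inf, None
    var0 = np.var(y)                         # H0: constant mean
    for t in range(2, n - 1):                # candidate change times
        a, b = y[:t], y[t:]
        # residual variance under H1 with a step at time t
        var1 = (np.sum((a - a.mean())**2) + np.sum((b - b.mean())**2)) / n
        stat = n * np.log(var0 / var1)       # generalized likelihood ratio
        if stat > best_stat:
            best_stat, best_t = stat, t
    return best_t, best_stat
```

On a sequence with a 2-sigma jump at sample 60, the detector localizes the change to within a few samples; longer sequences sharpen both detection and localization, mirroring the paper's multi-image result.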

15.
In this paper, we propose a knowledge discovery method based on fuzzy set theory to help elders with plant cultivation. Initially, the fuzzy sets are constructed using feature selection and statistical interval estimation. Min-max inference and the center-of-gravity defuzzification method are then used to output a candidate pattern set. Finally, pattern discovery is applied to obtain patterns from the candidate set for cultivation suggestions, taking into account frequency weights and the user's experience. To demonstrate the performance of our method in planting systems, we built a clicks-and-mortar cultivation platform, named Eden Garden, aimed at elders pursuing lifestyles of health and sustainability (LOHAS). The experimental results show that the accuracy rate of our knowledge discovery method can reach 85%. Moreover, results on the LOHAS index scale show that elders' happiness increases while they are using the proposed method.
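The min-max inference and center-of-gravity defuzzification steps can be sketched generically. The rule base and membership functions below are invented for illustration (the paper's cultivation rules are not given); `tri` is an ordinary triangular membership function.

```python
def tri(a, b, c):
    """Triangular membership function peaking at b on support (a, c)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

def infer(rules, inputs, universe):
    """Min-max (Mamdani) inference with center-of-gravity defuzzification.
    rules: list of (antecedent membership funcs, output membership func)."""
    def aggregate(y):
        strength = 0.0
        for antecedents, consequent in rules:
            w = min(mu(x) for mu, x in zip(antecedents, inputs))  # min = AND
            strength = max(strength, min(w, consequent(y)))       # max-aggregate
        return strength
    num = sum(y * aggregate(y) for y in universe)
    den = sum(aggregate(y) for y in universe)
    return num / den if den else None
```

With a hypothetical "cool soil → water a lot, warm soil → water a little" rule base, a cool reading defuzzifies to a watering amount skewed toward the high consequent.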

16.
17.
The rapid advancement of DNA microarray technology has revolutionized genetic research in bioscience. Due to the enormous amount of gene expression data generated by such technology, computer processing and analysis of the data has become indispensable. In this paper, we present a computational framework for the extraction, analysis, and visualization of gene expression data from microarray experiments. A novel, fully automated spot segmentation algorithm for DNA microarray images, which makes use of adaptive thresholding, morphological processing, and statistical intensity modeling, is proposed to: (i) segment the blocks of spots, (ii) generate the grid structure, and (iii) segment the spot within each subregion. For data analysis, we propose a binary hierarchical clustering (BHC) framework for the clustering of gene expression data. The BHC algorithm involves two major steps. First, the fuzzy C-means algorithm and the average linkage hierarchical clustering algorithm are used to split the data into two classes. Second, Fisher linear discriminant analysis is applied to the two classes to assess whether the split is acceptable. The BHC algorithm is applied to the subclasses recursively and ends when no cluster can be split any further. BHC does not require the number of clusters to be known in advance, and it makes no assumption about the number of samples in each cluster or the class distribution. The hierarchical framework naturally leads to a tree structure representation for effective visualization of gene expressions.
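The recursive split-then-test structure of BHC can be sketched as follows. This simplification swaps the fuzzy C-means/average-linkage split for a plain 2-means and uses a one-dimensional Fisher criterion as the split-acceptance test; the threshold and all names are assumptions, not the paper's exact procedure.

```python
import numpy as np

def two_means(X, iters=20, seed=0):
    """Simple 2-means split (stand-in for the paper's fuzzy C-means)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), 2, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def fisher_score(X, labels):
    """Fisher criterion along the mean-difference direction."""
    a, b = X[labels == 0], X[labels == 1]
    w = a.mean(0) - b.mean(0)
    pa, pb = a @ w, b @ w
    return (pa.mean() - pb.mean())**2 / (pa.var() + pb.var() + 1e-12)

def bhc(X, threshold=8.0, min_size=5):
    """Recursive binary splitting; accept a split only if the Fisher
    criterion deems the two halves well separated."""
    if len(X) < 2 * min_size:
        return [X]
    labels = two_means(X)
    if labels.sum() in (0, len(X)) or fisher_score(X, labels) < threshold:
        return [X]
    return (bhc(X[labels == 0], threshold, min_size)
            + bhc(X[labels == 1], threshold, min_size))
```

On three well-separated synthetic blobs, the recursion splits twice and then stops, recovering the three groups without being told the cluster count.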

18.
应作斌, 马建峰, 崔江涛. 《通信学报》 (Journal on Communications), 2015, 36(12): 178-189
Ciphertext-policy attribute-based encryption (CP-ABE) is considered well suited to cloud storage, but when the data owner needs to update the access policy, existing update mechanisms impose additional computation and communication costs on the data owner that grow with the size of the data and the attribute set. Moreover, access policies stored in plaintext in the cloud leak users' private information. To address these two problems, a partially policy-hidden attribute-based encryption scheme supporting dynamic policy updates is proposed. With the proposed scheme, a policy update costs the user little computation, as the bulk of the work is shifted to the cloud server. Because the policy is partially hidden, users' specific attribute values are not revealed to any third party, effectively protecting user privacy. The scheme supports policy updates of any form and is proven adaptively secure against chosen-plaintext attack (CPA) in the standard model.

19.
The growing volume of information poses interesting challenges and calls for tools that discover properties of data. Data mining has emerged as a discipline that contributes tools for data analysis, discovery of new knowledge, and autonomous decision-making. In this paper, the basic concepts of rough set theory and other aspects of data mining are introduced. Rough set theory offers a viable approach for the extraction of decision rules from data sets. The extracted rules can be used for making predictions in the semiconductor industry and other applications. This contrasts with other approaches, such as regression analysis and neural networks, where a single model is built. One of the goals of data mining is to extract meaningful knowledge. The power, generality, accuracy, and longevity of decision rules can be increased by applying the concepts from systems engineering and evolutionary computation introduced in this paper. A new rule-structuring algorithm is proposed. The concepts presented in the paper are illustrated with examples.
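The rough-set machinery behind decision-rule extraction rests on indiscernibility classes and the lower/upper approximations of a concept, sketched below on an invented semiconductor-style decision table (all attribute names are hypothetical).

```python
from collections import defaultdict

def indiscernibility(table, attrs):
    """Partition objects into classes indistinguishable on `attrs`."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def approximations(table, attrs, target):
    """Rough-set lower/upper approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in indiscernibility(table, attrs):
        if block <= target:
            lower |= block      # certainly in the concept
        if block & target:
            upper |= block      # possibly in the concept
    return lower, upper
```

Objects in the lower approximation yield certain decision rules; the boundary (upper minus lower) marks objects the condition attributes cannot classify unambiguously.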

20.
Biomedical research once involved building complex theories upon relatively small amounts of experimental data. The field of bioinformatics, which can be broadly defined as the interface between biology and the computational sciences, has posed many computational problems and has stimulated synergetic research and development of state-of-the-art techniques in data mining, statistics, imaging/pattern analysis, and visualization, applied to the gene and protein sequence information embedded in biological systems. Signal processing (SP) techniques have been applied almost everywhere in bioinformatics and will continue to play an important role in the study of biomedical problems. The goal of this article is to demonstrate to the SP community the potential of SP tools in uncovering complex biological phenomena.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.) · 京ICP备09084417号