Similar literature
 Found 20 similar documents; search took 31 ms
1.
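To address the lack of intuitiveness in pattern classification algorithms, a method for analyzing high-dimensional data based on radial coordinate visualization is proposed. The intrinsic dimension of the high-dimensional data is estimated by the maximum likelihood principle, and a reduced number of variables is combined with radial coordinate visualization to perform a visual dimensionality-reduction analysis. The radial coordinates reveal the relationships between classes and features in the high-dimensional dataset; an optimal mapping is sought over different feature orderings, and several machine learning methods are then used to classify the dataset. Results on six datasets from the UCI repository show that the method achieves good visualization and classification performance.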
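The abstract above does not give its formulas, so the following is only a minimal sketch of the standard radial coordinate (RadViz) projection, assuming min-max normalised features and anchors spaced evenly on the unit circle. The function name `radviz` is hypothetical. The column order of the input fixes the anchor order, which is the ordering the abstract proposes to optimise.

```python
import math

def radviz(rows):
    """Project each row of feature values to a 2-D point inside the unit circle.

    Assumption: plain RadViz, not necessarily the exact variant used in the paper.
    """
    m = len(rows[0])
    # Min-max normalise every feature to [0, 1].
    lo = [min(r[j] for r in rows) for j in range(m)]
    hi = [max(r[j] for r in rows) for j in range(m)]
    # One anchor per feature, evenly spaced on the unit circle.
    anchors = [(math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m))
               for j in range(m)]
    points = []
    for r in rows:
        w = [(r[j] - lo[j]) / (hi[j] - lo[j]) if hi[j] > lo[j] else 0.0
             for j in range(m)]
        total = sum(w)
        if total == 0:
            # All features at their minimum: place the point at the centre.
            points.append((0.0, 0.0))
            continue
        # Weighted average of the anchors, weights = normalised feature values.
        x = sum(wj * a[0] for wj, a in zip(w, anchors)) / total
        y = sum(wj * a[1] for wj, a in zip(w, anchors)) / total
        points.append((x, y))
    return points
```

A row dominated by one feature lands near that feature's anchor, while a row with equal normalised values on all features lands at the centre, so reordering the columns changes which classes separate visually.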

2.
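Qin Xujia, Shan Yangyang, Xu Fei, Zheng Hongbo, Zhang Meiyu. Computer Science (计算机科学), 2018, 45(12): 262-267, 287
For data on waste disposal methods across China's provinces, a hybrid visual analysis method is proposed. To analyze the data from multiple perspectives, three visualization techniques (U-matrix, parallel coordinates and Small-Multiple) are combined, and interactive linking among the three views is designed and implemented. First, the data are clustered: the SOM neural network clustering algorithm groups the provinces' recent waste disposal methods into categories. Next, the SOM clustering result is visualized with a U-matrix, and parallel coordinates describe the attributes of each cluster. Small-Multiple visualization is used to analyze the geographic and temporal attributes of the data. Finally, interactions such as multi-view linking and brushing are implemented so that users can explore the data themselves through interactive multi-view display and analysis. Experiments show that this hybrid approach achieves good interactive visualization of multi-attribute data and helps users understand and analyze the distribution and trends of waste disposal methods in China.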
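As an illustration of the U-matrix step mentioned above, here is a minimal sketch that assumes an already trained SOM on a rectangular grid. It uses the simplified per-node form (mean Euclidean distance from each node's weight vector to its 4-connected neighbours); the function name `u_matrix` and the grid layout are assumptions, not taken from the paper.

```python
import math

def u_matrix(weights):
    """weights[r][c] is the weight vector of SOM node (r, c).

    Returns a grid of the same shape holding, for each node, the mean
    distance to its up/down/left/right neighbours. High values mark
    cluster boundaries, low values mark cluster interiors.
    """
    rows, cols = len(weights), len(weights[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            out[r][c] = sum(dists) / len(dists) if dists else 0.0
    return out
```

Rendering the returned grid as a grey-scale or colour image gives the usual U-matrix view, which could then be linked to the other views by node index.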

3.
Performance visualization uses graphical display techniques to analyze performance data and improve understanding of complex performance phenomena. Current parallel performance visualizations are predominantly two-dimensional. A primary goal of our work is to develop new methods for rapidly prototyping multidimensional performance visualizations. By applying the tools of scientific visualization, we can prototype these next-generation displays for performance visualization (if not implement them as end-user tools) using existing software products and graphical techniques that physicists, oceanographers, and meteorologists have used for several years.

4.
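Objective: Parallel coordinates are a classic method for visualizing multidimensional data, but when applied to geospatial multidimensional data they often lose spatial location information and leave spatial association analysis uncertain. To address this, this paper designs a visual analysis method for geospatial multidimensional data that effectively links parallel coordinates with a map. Methods: Geographic locations are clustered according to their multidimensional attributes; a Voronoi diagram and a color-lightness mapping are introduced to mark the regions of each class clearly; parallel coordinates present the multidimensional attribute information; mutual information is introduced to measure the correlation between the geospatial clusters and the attribute categories, and it dynamically determines the order of the parallel coordinate axes; the binding positions of the data lines between the attribute axes and the map are then computed, and the layout of the data lines is optimized to reduce clutter between the map and the parallel coordinate system. Results: Integrating the above visualization designs and data analysis methods, a visual analysis system for geospatial multidimensional data based on dynamically arranged parallel coordinate axes was designed and implemented, with convenient user interaction. Tests on two datasets with distinct geospatial multidimensional attribute characteristics verified the effectiveness and practicality of the method. Conclusion: The proposed visual analysis method and tool help users quickly analyze the spatial distribution characteristics and association patterns of geospatial multidimensional attributes, providing an effective means of exploring geospatial multidimensional data.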
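The mutual-information step above can be sketched as follows. This is an illustration only: it assumes the attributes have already been discretised into categories and that axes are simply sorted by descending mutual information with the cluster labels, which the abstract does not confirm. The names `mutual_information` and `order_axes` are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ), using raw counts.
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def order_axes(cluster_labels, attributes):
    """attributes maps an axis name to its list of category values.

    Returns axis names sorted from most to least informative about the
    cluster labels (the sort rule is an assumption for this sketch).
    """
    scores = {name: mutual_information(cluster_labels, values)
              for name, values in attributes.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `order_axes([0, 0, 1, 1], {"a": [0, 1, 0, 1], "b": [5, 5, 7, 7]})` places `"b"` first, because `"b"` determines the cluster label exactly while `"a"` is independent of it.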

5.
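To address the high dimensionality, small sample size, high redundancy and high noise of microarray gene expression data, a classification algorithm named FICS-EKELM is proposed, based on FCBF feature selection and ensemble optimization learning. First, the fast correlation-based filter (FCBF) removes some irrelevant features and noise and identifies a set of features that are highly correlated with the class. Second, sampling is used to generate multiple sample subsets, and on each training subset an improved crow search algorithm simultaneously selects the optimal feature subset and optimizes the parameters of a kernel extreme learning machine (KELM) classifier. An ensemble classification model is then built from the base classifiers to classify the target data. In addition, multi-threaded parallelism on multi-core platforms further improves computational efficiency. Experimental results on six gene datasets show that the algorithm not only achieves better classification with fewer feature genes, but also gives classification results significantly better than those of existing and similar methods, making it an effective method for classifying high-dimensional data.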

6.
In this paper, we present Microarray Medical Data explorer (Microarray-MD), a novel software system that assists in the exploratory analysis of gene expression microarray data. It implements a combination scheme of multiple Support Vector Machines, which integrates a variety of gene selection criteria and allows for the discrimination of multiple diseases or subtypes of a disease. The system can be trained, and can automatically tune its parameters, when pathologically characterized gene expression data are provided as input. Given a set of new, uncharacterized patient data as input, it outputs a decision on the type or subtype of a disease. A graphical user interface provides easy access to the system operations and direct adjustment of its parameters. It has been tested on various publicly available datasets. Its overall accuracy was estimated to exceed 90%.

7.
cDNA microarrays permit massively parallel gene expression analysis and have spawned a new paradigm in the study of molecular biology. One of the significant challenges in this genomic revolution is to develop sophisticated approaches to facilitate the visualization, analysis, and interpretation of the vast amounts of multi-dimensional gene expression data. We have applied the self-organizing map (SOM) to meet these challenges. In essence, we use the U-matrix and component planes for microarray data visualization and introduce a general procedure for assessing the significance of a cluster detected from the U-matrix. Our case studies consist of two data sets. First, we analyzed a data set containing 13,824 genes in 14 breast cancer cell lines. In the second case, we show an example of the SOM applied to drug treatment of prostate cancer cells. Our results indicate that (1) the SOM can help find certain biologically meaningful clusters, (2) clustering algorithms could be used to find a set of potential predictor genes for classification purposes, and (3) comparison and visualization of the effects of different drugs is straightforward with the SOM. In summary, the SOM provides an excellent format for visualization and analysis of gene microarray data, and is likely to facilitate extraction of biologically and medically useful information.

8.
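Liu Fang. Application Research of Computers (计算机应用研究), 2012, 29(4): 1300-1303
An unsupervised self-organizing map method is proposed for clustering financial data, with the results displayed on a two-dimensional plane using parallel coordinates and interactive circular parallel coordinates. The method produces clear visual clustering results: it effectively summarizes the characteristics of the data and improves the visual quality of the clusters, making trends in the data easier to discover.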

9.
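High-dimensional data visualization based on PCA and parallel coordinates   Total citations: 1, self-citations: 0, citations by others: 1
When parallel coordinates are used to visualize high-dimensional data, the display becomes cluttered if too many dimensions are shown. To address this problem, a data visualization method named PPCP is proposed that combines principal component analysis (PCA) with parallel coordinates. PCA is used to reduce the dimensionality of the high-dimensional data effectively, and the reduced data are then displayed with parallel coordinates. Experimental results show that the method effectively reveals relationships within high-dimensional data.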
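The PCA step described above could look roughly like the sketch below. It is not the authors' PPCP implementation: it assumes a plain covariance-based PCA computed with power iteration and deflation in pure Python, and the function name `pca` and its parameters are hypothetical. Each column of the returned scores would become one axis of the parallel-coordinates plot.

```python
import math

def pca(rows, k, iterations=200):
    """Return (components, scores) for the top-k principal components.

    Caveat of this sketch: power iteration starts from a fixed vector, so a
    component exactly orthogonal to that vector would not be found.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centred = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix of the centred data.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    components = []
    for _ in range(k):
        v = [1.0 / math.sqrt(m)] * m
        for _ in range(iterations):
            w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m))
                         for i in range(m))
        components.append(v)
        # Deflation: remove the variance explained by this component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(m)]
               for i in range(m)]
    # Coordinates of every centred row along each component.
    scores = [[sum(a * b for a, b in zip(r, v)) for v in components]
              for r in centred]
    return components, scores
```

In practice a library routine for eigen-decomposition or SVD would be the more robust choice; the pure-Python form is used here only to keep the example self-contained.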

10.
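A new multidimensional data visualization method is proposed that effectively combines the multidimensional information of storm surges. The data representation model arranges the multidimensional information along a spatial cylindrical helix and provides an interactive window in which users take part in the selection. Exploiting the periodicity and distinctive rotational properties of the spatial helix, the coordinate axes of the helical coordinate system are projected onto a two-dimensional plane or a three-dimensional volume, enabling visual analysis of multidimensional data. The model was tested on the original observation data of Typhoon Chanchu ("珍珠"), which affected Fujian in 2006. The experimental results show that the model effectively represents storm surge characteristic data, provides intuitive and timely information for emergency response and similar tasks, and strongly supports further analysis.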

11.
Cluster analysis of DNA microarray data is an important but difficult task in knowledge discovery processes. Many clustering methods are applied to the analysis of gene expression data, but none of them can fully address the challenges that this technology raises. Because of this, many applications have been developed for visually representing clustering algorithm results on DNA microarray data, usually providing dendrogram and heat map visualizations. Most of these applications focus only on the above visualizations, and do not offer further visualization components to validate the clustering methods or to validate one another. This paper proposes using a visual analytics framework in cluster analysis of gene expression data. Additionally, it presents a new method for finding cluster boundaries based on properties of metric spaces. Our approach presents a set of visualization components able to interact with each other; namely, parallel coordinates, cluster boundary genes, 3D cluster surfaces and DNA microarray visualizations as heat maps. Experimental results have shown that our framework can be very useful in the process of more fully understanding DNA microarray data. The software has been implemented in Java, and the framework is publicly available at http://www.analiticavisual.com/jcastellanos/3DVisualCluster/3D-VisualCluster.

12.
A visualization system for space-time and multivariate patterns (VIS-STAMP)   Total citations: 4, self-citations: 0, citations by others: 4
The research reported here integrates computational, visual and cartographic methods to develop a geovisual analytic approach for exploring and understanding spatio-temporal and multivariate patterns. The developed methodology and tools can help analysts investigate complex patterns across multivariate, spatial and temporal dimensions via clustering, sorting and visualization. Specifically, the approach involves a self-organizing map, a parallel coordinate plot, several forms of reorderable matrices (including several ordering methods), a geographic small multiple display and a 2-dimensional cartographic color design method. The coupling among these methods leverages their independent strengths and facilitates a visual exploration of patterns that are difficult to discover otherwise. The visualization system we developed supports an overview of complex patterns and, through a variety of interactions, enables users to focus on specific patterns and examine detailed views. We demonstrate the system with an application to the IEEE InfoVis 2005 contest data set, which contains time-varying, geographically referenced and multivariate data for technology companies in the US.

13.
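When multidimensional data have too many dimensions and too many records, the lines in a parallel-coordinates plot become dense and overlapping, and patterns in the data are hard to discern. To solve this problem, a parallel coordinate visualization method based on principal component analysis and K-means clustering (PCAKP, principal component analysis and k-means clustering parallel coordinate) is proposed. The method first reduces the dimensionality of the multidimensional data with principal component analysis, then applies K-means clustering to the reduced data, and finally displays the clustered data with parallel coordinates. PCAKP was tested on data published on the statistics bureau's website and compared with conventional parallel-coordinates plots, which verified its practicality and effectiveness.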
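The clustering stage of the pipeline above can be sketched with plain Lloyd-style K-means. This is a minimal illustration, assuming the input points are the already dimension-reduced rows and that the first `k` points serve as initial centroids (a deterministic choice made for this sketch, not stated in the abstract). The function name `kmeans` is hypothetical.

```python
import math

def kmeans(points, k, iterations=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    # Assumption: initialise centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids
```

The returned labels could then drive the line colour in the parallel-coordinates plot, so that each cluster is drawn as one visually distinct bundle.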

14.
With the rapid growth of networked data communications in size and complexity, network administrators today face more challenges in protecting their networked computers and devices from all kinds of attacks. This paper proposes a new concentric-circle visualization method for visualizing multi-dimensional network data. This method can be used to identify the main features of network attacks, such as DDoS attacks, by displaying their recognizable visual patterns. To reduce edge overlaps and crossings, we arrange multiple axes as concentric circles rather than the traditional parallel lines. In our method, we use polycurves to link values (vertices) rather than the polylines used in the parallel coordinate approach. Some heuristics are applied in our new method to improve the readability of views. We discuss the advantages as well as the limitations of our new method. In comparison with the parallel coordinate visualization, our approach can reduce edge overlaps and crossings by more than 15%. In the second stage of the method, we further enhance the readability of views by increasing the edge crossing angle. Finally, we introduce our prototype system: a visual interactive network scan detection system called CCScanViewer. It is based on our new visualization approach, and the experiments have shown that the new approach is effective in detecting attack features from a variety of networking patterns, such as the features of network scans and DDoS attacks.

15.
To date, work in microarrays, sequenced genomes and bioinformatics has focused largely on algorithmic methods for processing and manipulating vast biological data sets. Future improvements will likely provide users with guidance in selecting the most appropriate algorithms and metrics for identifying meaningful clusters, that is, interesting patterns in large data sets, such as groups of genes with similar profiles. Hierarchical clustering has been shown to be effective in microarray data analysis for identifying genes with similar profiles and thus possibly with similar functions. Users also need an efficient visualization tool, however, to facilitate pattern extraction from microarray data sets. The Hierarchical Clustering Explorer integrates four interactive features to provide information visualization techniques that allow users to control the processes and interact with the results. Thus, hybrid approaches that combine powerful algorithms with interactive visualization tools will join the strengths of fast processors with the detailed understanding of domain experts.

16.
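Mixed-scene visualization based on the ray casting algorithm   Total citations: 3, self-citations: 0, citations by others: 3
Volume rendering is commonly used to visualize 3D volume data fields. Although it can generate high-quality projected images, it usually cannot render mixed scenes composed of volume data together with point, line and surface graphics. Among existing mixed-scene visualization methods, some can only render complex mixed scenes composed of volume data and surface graphics and cannot handle scenes that contain points and lines, while others are slow and produce poor image quality. To render complex mixed scenes correctly, a fast, high-quality mixed-scene visualization method based on the ray casting algorithm is proposed, using SIMD and software acceleration techniques. The three rendering orders supported by the algorithm are analyzed so that it can meet the requirements of different applications. The algorithm can be used to render different scenes and works with both parallel and perspective projection. Experimental results show that the algorithm correctly renders mixed scenes composed of volume data and point, line and surface graphics, with fast rendering and high image quality.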

17.
An insight-based methodology for evaluating bioinformatics visualizations   Total citations: 1, self-citations: 0, citations by others: 1
High-throughput experiments, such as gene expression microarrays in the life sciences, result in very large data sets. In response, a wide variety of visualization tools have been created to facilitate data analysis. A primary purpose of these tools is to provide biologically relevant insight into the data. Typically, visualizations are evaluated in controlled studies that measure user performance on predetermined tasks or using heuristics and expert reviews. To evaluate and rank bioinformatics visualizations based on real-world data analysis scenarios, we developed a more relevant evaluation method that focuses on data insight. This paper presents several characteristics of insight that enabled us to recognize and quantify it in open-ended user tests. Using these characteristics, we evaluated five microarray visualization tools on the amount and types of insight they provide and the time it takes to acquire it. The results of the study guide biologists in selecting a visualization tool based on the type of their microarray data, visualization designers on the key role of user interaction techniques, and evaluators on a new approach for evaluating the effectiveness of visualizations for providing insight. Though we used the method to analyze bioinformatics visualizations, it can be applied to other domains.

18.
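Gene selection is a key problem in the analysis of gene expression data. However, existing methods do not jointly consider sample imbalance and interactions between genes. Drawing on cluster validation techniques, a 0-1 programming model for gene selection is proposed that accounts for both sample imbalance and gene interactions. Based on the characteristics of the 0-1 programming model, a greedy heuristic algorithm is then given to solve the resulting optimization problem. The proposed method was tested on three real gene expression datasets and compared with two reference methods; the results show that the proposed model and algorithm are effective and robust.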

19.
Gene selection can help the analysis of microarray gene expression data. However, it is very difficult to obtain a satisfactory classification result by machine learning techniques because of both the curse-of-dimensionality problem and the over-fitting problem. That is, the number of features is too large while the samples are too few. In this study, we designed an approach that attempts to avoid these two problems and then used it to select a small set of significant biomarker genes for diagnosis. Finally, we attempted to use these markers for the classification of cancer. We tested the approach on a number of microarray datasets to demonstrate that it performs well and is both useful and reliable.

20.
Discriminative models are used to analyze the differences between two classes and to identify class-specific patterns. Most of the existing discriminative models depend on using the entire feature space to compute the discriminative patterns for each class. Co-clustering has been proposed to capture the patterns that are correlated in a subset of features, but it cannot handle discriminative patterns in labeled datasets. In certain biological applications such as gene expression analysis, it is critical to consider the discriminative patterns that are correlated only in a subset of the feature space. The objective of this paper is twofold. First, it presents an algorithm to efficiently find arbitrarily positioned co-clusters from complex data. Second, it extends this co-clustering algorithm to discover discriminative co-clusters by incorporating the class information into the co-cluster search process. In addition, we characterize the discriminative co-clusters and propose three novel measures that can be used to evaluate the performance of any discriminative subspace pattern-mining algorithm. We evaluated the proposed algorithms on several synthetic and real gene expression datasets, and our experimental results showed that the proposed algorithms outperformed several existing algorithms available in the literature.
