Similar literature
 Found 20 similar documents; search took 31 ms
1.
Recently, we have developed the hierarchical generative topographic mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. We propose a more general visualization system by extending HGTM in three ways, which allows the user to visualize a wider range of data sets and better support the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the latent trait model (LTM). This enables us to visualize data of inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive, or automatic mode. In the interactive mode, the user selects "regions of interest", whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets.

2.
While it is quite typical to deal with attributes of different data types in the visualization of heterogeneous and multivariate datasets, most existing techniques still focus on the most common data types, such as numerical attributes or strings. In this paper we present a new approach to the interactive visual exploration and analysis of data that contains attributes of set type. A set-typed attribute of a data item (like one cell in a table) has a list of n ≥ 0 elements as its value. We present the set'o'gram as a new visualization approach to represent data of set type and to enable interactive visual exploration and analysis. We also demonstrate how this approach can help in dealing with datasets that have a larger number of dimensions (a dozen or more), especially in the context of categorical data. To illustrate the effectiveness of our approach, we present the interactive visual analysis of a CRM dataset with data from a questionnaire on the education and shopping habits of about 90000 people.
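
A set-typed attribute, as defined in this abstract, can be tabulated in two simple ways before it is drawn: how often each element occurs, and how many items have a set of each size. The sketch below is only an illustration under assumed data; the records, the attribute name `hobbies`, and both helper functions are hypothetical and are not taken from the set'o'gram paper.

```python
from collections import Counter

# Hypothetical records: each item has one set-typed attribute ("hobbies").
records = [
    {"id": 1, "hobbies": {"reading", "sports"}},
    {"id": 2, "hobbies": set()},  # n = 0 elements is allowed
    {"id": 3, "hobbies": {"reading"}},
    {"id": 4, "hobbies": {"reading", "sports", "music"}},
]


def element_frequencies(items, attribute):
    """Count how many items contain each element of the set-typed attribute."""
    counts = Counter()
    for item in items:
        counts.update(item[attribute])
    return counts


def cardinality_histogram(items, attribute):
    """Count how many items have a set of each size (including empty sets)."""
    return Counter(len(item[attribute]) for item in items)


freq = element_frequencies(records, "hobbies")
sizes = cardinality_histogram(records, "hobbies")
```

Either table could feed a bar-style display; which of them a given visualization actually uses is a design choice this sketch does not settle.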

3.
This paper discusses 3D visualization and interactive exploration of large relational data sets through the integration of several well-chosen multidimensional data visualization techniques, for the purpose of visual data mining and exploratory data analysis. The basic idea is to combine the techniques of grand tour, direct volume rendering, and data aggregation in databases to deal with both the high dimensionality of data and a large number of relational records. Each technique has been enhanced or modified for this application. Specifically, positions of data clusters are used to decide the path of a grand tour. This cluster-guided tour makes intercluster-distance-preserving projections in which data clusters are displayed as far apart as possible. A tetrahedral mapping method applied to cluster centroids helps in choosing interesting cluster-guided projections. Multidimensional footprint splatting is used to directly render large relational data sets. This approach abandons the rendering techniques that enhance 3D realism and focuses on how to efficiently produce real-time explanatory images that give comprehensive insights into global features such as data clusters and holes. Examples are given where the techniques are applied to large (more than a million records) relational data sets.

4.
As heterogeneous data from different sources are being increasingly linked, it becomes difficult for users to understand how the data are connected, to identify what means are suitable to analyze a given data set, or to find out how to proceed for a given analysis task. We target this challenge with a new model-driven design process that effectively codesigns aspects of data, view, analytics, and tasks. We achieve this by using the workflow of the analysis task as a trajectory through data, interactive views, and analytical processes. The benefits for the analysis session go well beyond the pure selection of appropriate data sets and range from providing orientation or even guidance along a preferred analysis path to a potential overall speedup, allowing data to be fetched ahead of time. We illustrate the design process for a biomedical use case that aims at determining a treatment plan for cancer patients from the visual analysis of a large, heterogeneous clinical data pool. As an example of how to apply the comprehensive design approach, we present Stack'n'flip, a sample implementation which tightly integrates visualizations of the actual data with a map of available data sets, views, and tasks, thus capturing and communicating the analytical workflow through the required data sets.

5.
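Visualization techniques use graphics to reveal the inherent patterns in data and can present data hierarchically through interaction; they play an increasingly important role in analyzing traffic data, discovering traffic problems, and supporting decision-making. To present the information conveyed by urban taxi GPS trajectory data more clearly and intuitively, and to address the analysis difficulties caused by its large volume and complex spatio-temporal information, this paper proposes a visual analysis method for taxi GPS trajectory data that integrates aggregation visualization and feature visualization. First, feature data suitable for visualization are obtained through data processing; then passenger pick-up and drop-off points are visualized in aggregate, and the trajectory data are visualized by feature using coordinated multi-view interaction; finally, the spatio-temporal distribution of urban taxi passengers' travel characteristics is analyzed on the basis of the visualization results. Building on this, an interactive visual analysis system is designed, and its effectiveness is verified through a case study on a real data set.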

6.
Dimensionality-reducing mappings, often also denoted as multidimensional scaling, are the basis for multivariate data projection and visual analysis in data mining. Topology- and distance-preserving mapping techniques (e.g., Kohonen's self-organizing feature map (SOM) or Sammon's nonlinear mapping (NLM)) are available to achieve multivariate data projections for the subsequent interactive visual analysis process. For large databases, however, NLM computation becomes intractable. Also, if additional data points or data sets are to be included in the projection, a complete recomputation of the mapping is required. In general, a neural network could learn the mapping and serve for arbitrary additional data projection. However, the computational costs would also be high, and convergence is not easily achieved. In this work, a convenient hierarchical neural projection approach is introduced, where first an unsupervised neural network (e.g., a SOM) quantizes the database, followed by fast NLM mapping of the quantized data. In the second stage of the hierarchy, an enhancement of the NLM by a recall algorithm is applied. The training and application of a second neural network, which learns the mapping by function approximation, is quantitatively compared with this new approach. Efficient interactive visualization and analysis techniques, exploiting the achieved hierarchical neural projection for data mining, are presented.
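
Sammon's nonlinear mapping, mentioned in this abstract, is usually described as minimizing a stress value that compares pairwise distances before and after projection. The sketch below only evaluates that standard stress for a given projection; it does not perform the optimization, the SOM quantization, or the recall step the abstract describes, and the function name and sample points are assumptions made for illustration.

```python
import math
from itertools import combinations


def sammon_stress(high_dim_points, low_dim_points):
    """Sammon's stress between original points and their projections.

    E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij over pairs i < j,
    where d* is the original distance and d the projected distance.
    Pairs with d*_ij == 0 are skipped to avoid division by zero.
    """
    total = 0.0
    error = 0.0
    for i, j in combinations(range(len(high_dim_points)), 2):
        d_star = math.dist(high_dim_points[i], high_dim_points[j])
        if d_star == 0.0:
            continue
        d = math.dist(low_dim_points[i], low_dim_points[j])
        total += d_star
        error += (d_star - d) ** 2 / d_star
    return error / total if total > 0.0 else 0.0


# Hypothetical data: three 3D points that lie in the plane z = 0.
points_3d = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
perfect = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]    # drops z; distances preserved
collapsed = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]  # all points on one spot

stress_perfect = sammon_stress(points_3d, perfect)
stress_collapsed = sammon_stress(points_3d, collapsed)
```

The loop visits every pair of points, so the cost grows quadratically with the number of points, which is consistent with the abstract's remark that NLM becomes intractable for large databases.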

7.
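Challenges and recent progress in big data visualization   Total citations: 2, self-citations: 0, citations by others: 2
崔迪  郭小燕  陈为 《计算机应用》 (Journal of Computer Applications) 2017,37(7):2044-2049
The arrival of big data has increased the importance of visualization. Visual analytics draws on human cognitive abilities and strengths in processing information, brings humans and machines together, and uses human-computer interaction to efficiently uncover the information and patterns behind big data, making it an important method of big data analysis. In view of the large volume, high dimensionality, multiple sources, and multiple forms of big data, this paper discusses visualization methods for large-scale data, streaming data, and unstructured and heterogeneous data. It first discusses visualization techniques for large-scale data: 1) following the divide-and-conquer principle, decomposing a large problem into smaller tasks and solving them in parallel to increase processing speed; 2) reducing the data through aggregation, sampling, and multi-resolution representation; 3) for high-dimensional data, selecting several views and generating different visualization results from multiple perspectives. It then examines the visualization process for two kinds of streaming data, monitoring-type and overlay-type. Finally, it describes visualization techniques for unstructured and heterogeneous data. In summary, visualization can overcome the weaknesses of automated computer analysis, combine the analytical power of computers with people's ability to perceive information, and effectively reveal the information and insight behind big data. However, its theoretical results remain very limited, and it faces challenges such as large data scale, dynamic change, high dimensionality, and multi-source heterogeneity, which are gradually becoming the focus and direction of future research on big data visualization.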
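
Two of the data-reduction ideas this abstract lists, aggregation and sampling, can be sketched in a few lines. The example below bins 2D points into grid cells and draws a uniform sample from a stream with reservoir sampling; both are generic techniques chosen here for illustration, and the function names, cell size, and random test data are assumptions, not details from the paper.

```python
import random
from collections import Counter


def bin_points(points, cell_size):
    """Aggregate 2D points into square grid cells and count points per cell."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    sample = []
    for index, item in enumerate(stream):
        if index < k:
            sample.append(item)
        else:
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < k:
                sample[slot] = item
    return sample


rng = random.Random(0)  # fixed seed so the sketch is reproducible
points = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(10_000)]
cells = bin_points(points, cell_size=10.0)
subset = reservoir_sample(iter(points), k=500, rng=rng)
```

Binning keeps the overall density picture at a fixed number of cells, while sampling keeps individual records; which trade-off suits a given display depends on the task.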

8.
We present an approach to visualizing particle-based simulation data using interactive ray tracing and describe an algorithmic enhancement that exploits the properties of these data sets to provide highly interactive performance and reduced storage requirements. This algorithm for fast packet-based ray tracing of multilevel grids enables the interactive visualization of large time-varying data sets with millions of particles and incorporates advanced features like soft shadows. We compare the performance of our approach with two recent particle visualization systems: one based on an optimized single ray grid traversal algorithm and the other on programmable graphics hardware. This comparison demonstrates that the new algorithm offers an attractive alternative for interactive particle visualization.

9.
With recent advances in the measurement technology for all-sky astrophysical imaging, our view of the sky is no longer limited to the tiny visible spectral range over the 2D celestial sphere. We can now access a third dimension corresponding to a broad electromagnetic spectrum with a wide range of all-sky surveys; these surveys span frequency bands including long-wavelength radio, microwaves, very short X-rays, and gamma rays. These advances motivate us to study and examine multiwavelength visualization techniques to maximize our capabilities to visualize and exploit these informative image data sets. In this work, we begin with the processing of the data themselves, uniformizing the representations and units of raw data obtained from varied detector sources. Then we apply tools to map, convert, color-code, and format the multiwavelength data in forms useful for applications. We explore different visual representations for displaying the data, including such methods as textured image stacks, the horseshoe representation, and GPU-based volume visualization. A family of visual tools and analysis methods is introduced to explore the data, including interactive data mapping on the graphics processing unit (GPU), the mini-map explorer, and GPU-based interactive feature analysis.

10.
In today’s knowledge-, service-, and cloud-based economy, businesses accumulate massive amounts of data from a variety of sources. To understand a business, one may need to perform considerable analytics over large hybrid collections of heterogeneous and partially unstructured data captured during process execution. These data, usually modeled as graphs, increasingly show all the typical properties of big data: wide physical distribution, diversity of formats, non-standard data models, and independently managed, heterogeneous semantics. We use the term big process graph to refer to such large hybrid collections of heterogeneous and partially unstructured process-related execution data. Online analytical processing (OLAP) of big process graphs is challenging because extending existing OLAP techniques to the analysis of graphs is not straightforward. Moreover, process data analysis methods should be capable of processing and querying large amounts of data effectively and efficiently, and therefore have to scale well with the infrastructure. While traditional analytics solutions (relational DBs, data warehouses, and OLAP) do a great job of collecting data and answering known questions, key business insights remain hidden in the interactions among objects: it is hard to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. In this paper, we introduce a framework and a set of methods to support scalable graph-based OLAP analytics over process execution data. The goal is to facilitate analytics over big process graphs by summarizing the process graph and providing multiple views at different granularities. To achieve this goal, we present a model for process OLAP (P-OLAP) and define OLAP-specific abstractions in the process context, such as process cubes, dimensions, and cells. We present a MapReduce-based graph processing engine to support big data analytics over process graphs. We have implemented the P-OLAP framework and integrated it into our existing process data analytics platform, ProcessAtlas, which introduces a scalable architecture for querying, exploration, and analysis of large process data. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.

11.
Network communication has become indispensable in business, education, and government. With the pervasive role of the Internet as a means of sharing information across networks, its misuse for destructive purposes, such as spreading malicious code, compromising remote hosts, or damaging data through unauthorized access, has grown immensely in recent years. The classical way of monitoring the operation of large network systems is to analyze the system logs to detect anomalies. In this work, we introduce the hierarchical network map, an interactive visualization technique for gaining deeper insight into network flow behavior by means of user-driven visual exploration. Our approach is meant as an enhancement to conventional analysis methods based on statistics or machine learning. We use multidimensional modeling combined with position and display awareness to view source and target data of the hosts in a hierarchical fashion, with the ability to interactively change the level of aggregation or apply filtering. The interdisciplinary approach, integrating data warehouse technology, information visualization, and decision support, brings the benefit of efficiently collecting the input data and aggregating over very large data sets, visualizing the results, and providing interactivity to facilitate analytical reasoning.

12.
We take a new approach to interactive visualization and feature detection of large scalar, vector, and multifield computational fluid dynamics data sets that is also well suited for meshless CFD methods. Radial basis functions (RBFs) are used to procedurally encode both scattered and irregular gridded scalar data sets. The RBF encoding creates a complete, unified, functional representation of the scalar field throughout 3D space, independent of the underlying data topology, and eliminates the need for the original data grid during visualization. The capability of commodity PC graphics hardware to accelerate the reconstruction and rendering and to perform feature detection from this functional representation is a powerful tool for visualizing procedurally encoded volumes. Our RBF encoding and GPU-accelerated reconstruction, feature detection, and visualization tool provides a flexible system for visually exploring and analyzing large, structured, scattered, and unstructured scalar, vector, and multifield data sets at interactive rates on desktop PCs.

13.
Few existing visualization systems can handle large data sets with hundreds of dimensions, since high-dimensional data sets cause clutter on the display and long response times in interactive exploration. In this paper, we present a significantly improved multidimensional visualization approach named Value and Relation (VaR) display that allows users to effectively and efficiently explore large data sets with several hundred dimensions. In the VaR display, data values and dimension relationships are explicitly visualized in the same display by using dimension glyphs to represent values in dimensions and glyph layout to convey dimension relationships. In particular, pixel-oriented techniques and density-based scatterplots are used to create dimension glyphs to convey values. Multidimensional scaling, Jigsaw map hierarchy visualization techniques, and an animation metaphor named Rainfall are used to convey relationships among dimensions. A rich set of interaction tools has been provided to allow users to interactively detect patterns of interest in the VaR display. A prototype of the VaR display has been fully implemented. The case studies presented in this paper show how the prototype supports interactive exploration of data sets of several hundred dimensions. A user study evaluating the prototype is also reported in this paper.

14.
Visual data mining in large geospatial point sets   Total citations: 2, self-citations: 0, citations by others: 2
Visual data-mining techniques have proven valuable in exploratory data analysis, and they have strong potential in the exploration of large databases. Detecting interesting local patterns in large data sets is a key research challenge. Particularly challenging today is finding and deploying efficient and scalable visualization strategies for exploring large geospatial data sets. One way is to share ideas from the statistics and machine-learning disciplines with ideas and methods from the information and geo-visualization disciplines. PixelMaps in the Waldo system demonstrates how data mining can be successfully integrated with interactive visualization. The increasing scale and complexity of data analysis problems require tighter integration of interactive geospatial data visualization with statistical data-mining algorithms.

15.
In volume data visualization, the classification step is used to determine voxel visibility and is usually carried out through the interactive editing of a transfer function that defines a mapping between voxel value and color/opacity. This approach is limited by the difficulties in working effectively in the transfer function space beyond two dimensions. We present a new approach to the volume classification problem which couples machine learning and a painting metaphor to allow more sophisticated classification in an intuitive manner. The user works in the volume data space by directly painting on sample slices of the volume and the painted voxels are used in an iterative training process. The trained system can then classify the entire volume. Both classification and rendering can be hardware accelerated, providing immediate visual feedback as painting progresses. Such an intelligent system approach enables the user to perform classification in a much higher dimensional space without explicitly specifying the mapping for every dimension used. Furthermore, the trained system for one data set may be reused to classify other data sets with similar characteristics.
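
The conventional transfer function this abstract starts from, a mapping from voxel value to color and opacity, can be illustrated with a one-dimensional piecewise-linear lookup. The sketch below shows only that baseline, not the paper's learning-based classifier; the control points, the normalized value range of 0 to 1, and the function name are assumptions.

```python
from bisect import bisect_right

# Hypothetical control points: (voxel value, (red, green, blue, opacity)).
CONTROL_POINTS = [
    (0.0, (0.0, 0.0, 0.0, 0.0)),  # low values: fully transparent
    (0.5, (1.0, 0.5, 0.0, 0.2)),  # mid values: faint orange
    (1.0, (1.0, 1.0, 1.0, 1.0)),  # high values: opaque white
]


def transfer_function(value, control_points=CONTROL_POINTS):
    """Map a scalar voxel value to RGBA by piecewise-linear interpolation."""
    positions = [position for position, _ in control_points]
    # Clamp values outside the covered range to the end colors.
    if value <= positions[0]:
        return control_points[0][1]
    if value >= positions[-1]:
        return control_points[-1][1]
    hi = bisect_right(positions, value)
    lo = hi - 1
    x0, c0 = control_points[lo]
    x1, c1 = control_points[hi]
    t = (value - x0) / (x1 - x0)
    return tuple(a + t * (b - a) for a, b in zip(c0, c1))


rgba = transfer_function(0.25)
```

Editing such a function means moving control points by hand, which is manageable for one input dimension and becomes awkward as more dimensions are added, the limitation the abstract points to.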

16.
This paper describes a new out-of-core multi-resolution data structure for real-time visualization, interactive editing and externally efficient processing of large point clouds. We describe an editing system that makes use of the novel data structure to provide interactive editing and preprocessing tools for large scanner data sets. Using the new data structure, we provide a complete tool chain for 3D scanner data processing, from data preprocessing and filtering to manual touch-up and real-time visualization. In particular, we describe an out-of-core outlier removal and bilateral geometry filtering algorithm, a toolset for interactive selection, painting, transformation, and filtering of huge out-of-core point-cloud data sets and a real-time rendering algorithm, which all use the same data structure as storage backend. The interactive tools work in real-time for small model modifications. For large scale editing operations, we employ a two-resolution approach where editing is planned in real-time and executed in an externally efficient offline computation afterwards. We evaluate our implementation on example data sets of sizes up to 63 GB, demonstrating that the proposed technique can be used effectively in real-world applications.

17.
A particle system for interactive visualization of 3D flows   Total citations: 3, self-citations: 0, citations by others: 3
We present a particle system for interactive visualization of steady 3D flow fields on uniform grids. For the number of particles we target, particle integration needs to be accelerated and the transfer of these sets for rendering must be avoided. To fulfill these requirements, we exploit features of recent graphics accelerators to advect particles in the graphics processing unit (GPU), saving particle positions in graphics memory, and then sending these positions through the GPU again to obtain images in the frame buffer. This approach allows for interactive streaming and rendering of millions of particles and it enables virtual exploration of high-resolution fields in a way similar to real-world experiments. The ability to display the dynamics of large particle sets using visualization options like shaded points or oriented texture splats provides an effective means for visual flow analysis that is far beyond existing solutions. For each particle, flow quantities like vorticity magnitude and λ2 are computed and displayed. Built upon a previously published GPU implementation of a sorting network, visibility sorting of transparent particles is implemented. To provide additional visual cues, the GPU constructs and displays visualization geometry like particle lines and stream ribbons.
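
Advecting particles through a steady flow field amounts to integrating each position along the velocity it samples. The sketch below does this on the CPU with a fourth-order Runge-Kutta step in an analytic test field; the abstract does not name the integration scheme or show any code, so the scheme, the rotational `velocity` field, and all function names are assumptions chosen for illustration.

```python
import math


def velocity(position):
    """Hypothetical steady flow: rigid rotation about the z axis."""
    x, y, z = position
    return (-y, x, 0.0)


def rk4_step(position, dt, field=velocity):
    """Advance one particle by one fourth-order Runge-Kutta step."""
    def offset(p, v, scale):
        return tuple(pi + scale * vi for pi, vi in zip(p, v))

    k1 = field(position)
    k2 = field(offset(position, k1, dt / 2))
    k3 = field(offset(position, k2, dt / 2))
    k4 = field(offset(position, k3, dt))
    return tuple(
        p + dt / 6 * (a + 2 * b + 2 * c + d)
        for p, a, b, c, d in zip(position, k1, k2, k3, k4)
    )


def advect(particles, dt, steps):
    """Advect every particle through the steady field for a number of steps."""
    for _ in range(steps):
        particles = [rk4_step(p, dt) for p in particles]
    return particles


start = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.5)]
end = advect(start, dt=0.01, steps=100)  # integrate to time t = 1
```

Each particle is updated independently of the others, which is the property that makes this kind of computation a natural fit for the GPU execution the abstract describes.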

18.
Hibbard W., Santek D. 《Computer》1989,22(8):53-57
The authors describe the capabilities of McIDAS, an interactive visualization system that is vastly increasing the ability of earth scientists to manage and analyze data from remote sensing instruments and numerical simulation models. McIDAS provides animated three-dimensional images and highly interactive displays. The software can manage, analyze, and visualize large data sets that span many physical variables (such as temperature, pressure, humidity, and wind speed), as well as time and three spatial dimensions. The McIDAS system manages data from at least 100 different sources. The data management tools consist of data structures for storing different data types in files, libraries of routines for accessing these data structures, system commands for performing housekeeping functions on the data files, and reformatting programs for converting external data to the system's data structures. The McIDAS tools for three-dimensional visualization of meteorological data run on an IBM mainframe and can load up to 128-frame animation sequences into the workstations. A highly interactive version of the system can provide an interactive window into data sets containing tens of millions of points produced by numerical models and remote sensing instruments. The visualizations are being used for teaching as well as by scientists.

19.
Categorical data dimensions appear in many real-world data sets, but few visualization methods exist that properly deal with them. Parallel Sets are a new method for the visualization and interactive exploration of categorical data that shows data frequencies instead of the individual data points. The method is based on the axis layout of parallel coordinates, with boxes representing the categories and parallelograms between the axes showing the relations between categories. In addition to the visual representation, we designed a rich set of interactions. Parallel Sets allow the user to interactively remap the data to new categorizations and, thus, to consider more data dimensions during exploration and analysis than usually possible. At the same time, a metalevel, semantic representation of the data is built. Common procedures, like building the cross product of two or more dimensions, can be performed automatically, thus complementing the interactive visualization. We demonstrate Parallel Sets by analyzing a large CRM data set, as well as investigating housing data from two US states.
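
Because this method draws frequencies rather than individual points, the quantities behind the display are category counts per axis and joint counts for pairs of axes. The sketch below computes both for a tiny table; the records, the dimension names, and the helper functions are hypothetical, and the mapping of counts to box and parallelogram widths is an assumption about how such a display would use them.

```python
from collections import Counter

# Hypothetical categorical records: (class, sex, survived).
DIMENSIONS = ("class", "sex", "survived")
rows = [
    ("first", "female", "yes"),
    ("first", "male", "no"),
    ("third", "male", "no"),
    ("third", "female", "yes"),
    ("third", "male", "no"),
]


def category_counts(records, dim):
    """Frequency of each category on one axis (one count per box)."""
    i = DIMENSIONS.index(dim)
    return Counter(record[i] for record in records)


def pair_counts(records, dim_a, dim_b):
    """Joint frequency of category pairs on two axes (one count per band)."""
    a, b = DIMENSIONS.index(dim_a), DIMENSIONS.index(dim_b)
    return Counter((record[a], record[b]) for record in records)


boxes = category_counts(rows, "class")
bands = pair_counts(rows, "class", "sex")
```

The joint counts in `bands` are also what a cross product of the two dimensions would tabulate, so the same helper covers that procedure for the two-dimension case.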

20.
From visual data exploration to visual data mining: a survey   Total citations: 8, self-citations: 0, citations by others: 8
We survey work on the different uses of graphical mapping and interaction techniques for visual data mining of large data sets represented as table data. Basic terminology related to data mining, data sets, and visualization is introduced. Previous work on information visualization is reviewed in light of different categorizations of techniques and systems. The role of interaction techniques is discussed, in addition to work addressing the question of selecting and evaluating visualization techniques. We review some representative work on the use of information visualization techniques in the context of mining data. This includes both visual data exploration and visually expressing the outcome of specific mining algorithms. We also review recent innovative approaches that attempt to integrate visualization into the DM/KDD process, using it to enhance user interaction and comprehension.
