Similar Documents
20 similar documents found (search time: 31 ms).
1.
Few existing visualization systems can handle large data sets with hundreds of dimensions, since high-dimensional data sets cause clutter on the display and long response times in interactive exploration. In this paper, we present a significantly improved multidimensional visualization approach, the Value and Relation (VaR) display, that allows users to effectively and efficiently explore large data sets with several hundred dimensions. In the VaR display, data values and dimension relationships are visualized together: dimension glyphs explicitly represent the values within each dimension, and the glyph layout explicitly conveys the relationships among dimensions. In particular, pixel-oriented techniques and density-based scatterplots are used to create dimension glyphs that convey values, while multidimensional scaling, the Jigsaw map hierarchy visualization technique, and an animation metaphor named Rainfall convey relationships among dimensions. A rich set of interaction tools allows users to interactively detect patterns of interest in the VaR display. A prototype of the VaR display has been fully implemented. The case studies presented in this paper show how the prototype supports interactive exploration of data sets with several hundred dimensions, and a user study evaluating the prototype is also reported.

2.
3.
Dynamical systems are commonly used to describe the state of time-dependent systems. In many engineering and control problems, the state space is high-dimensional, making it difficult to analyze and visualize the behavior of the system for varying input conditions. We present a novel dimensionality reduction technique that is tailored to high-dimensional dynamical systems. In contrast to standard general-purpose dimensionality reduction algorithms, we use energy minimization to preserve properties of the flow in the high-dimensional space. Once the projection operator is optimized, further high-dimensional trajectories are projected easily. Our 3D projection maintains a number of useful flow properties, such as critical points and flow maps, and is optimized to match geometric characteristics of the high-dimensional input as well as optional user constraints. We apply our method to trajectories traced in the phase spaces of second-order dynamical systems, including finite-sized objects in fluids, the circular restricted three-body problem, and a damped double pendulum. We compare the projections with standard visualization techniques, such as PCA, t-SNE, and UMAP, and visualize the dynamical systems with multiple coordinated views interactively, featuring a spatial embedding, projection to subspaces, our dimensionality reduction, and a seed point exploration tool.
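For a rough feel of the kind of general-purpose baseline the authors compare against, the sketch below traces a trajectory of a pair of coupled, damped oscillators in its 4D phase space and projects it to 3D with plain PCA. This is an illustrative assumption, not the paper's energy-minimizing, flow-preserving projection; the system, parameters, and use of scikit-learn are all placeholder choices.

```python
# Hypothetical baseline: project a 4D phase-space trajectory to 3D with PCA.
# This is NOT the paper's flow-preserving projection; it only illustrates the
# general-purpose reduction (PCA/t-SNE/UMAP) that the authors compare against.
import numpy as np
from scipy.integrate import solve_ivp
from sklearn.decomposition import PCA

def coupled_damped_oscillators(t, s, k=1.0, c=0.1, coupling=0.3):
    # State s = (x1, v1, x2, v2): two linearly coupled, damped oscillators.
    x1, v1, x2, v2 = s
    a1 = -k * x1 - c * v1 + coupling * (x2 - x1)
    a2 = -k * x2 - c * v2 + coupling * (x1 - x2)
    return [v1, a1, v2, a2]

# Trace one trajectory through the 4D phase space.
sol = solve_ivp(coupled_damped_oscillators, (0.0, 60.0), [1.0, 0.0, -0.5, 0.2],
                t_eval=np.linspace(0.0, 60.0, 3000))
trajectory = sol.y.T                       # shape (3000, 4)

# Linear 3D projection; critical points and flow structure are not guaranteed
# to be preserved, which is exactly the gap the paper addresses.
pca = PCA(n_components=3)
proj = pca.fit_transform(trajectory)
print(proj.shape, pca.explained_variance_ratio_)
```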

4.
Motion-based interactive video games have long been among the mainstream game genres most favored by consumers. This work uses a Kinect to capture low-dimensional control signals from the user's motion and reconstructs high-dimensional motion control signals through a two-stage structure, enabling real-time synthesis of high-quality human animation. In the first stage, a neighbor graph is constructed to narrow the search space, a K-D tree accelerates the search for the k most similar motion samples, and a linear real-time motion synthesis model is built with principal component analysis; in the second stage, the linear model is refined with a smoothness parameter. Experimental results show that the method can still reconstruct high-quality human animation even when the scene is severely disturbed.
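A minimal sketch of the first-stage idea is given below, with synthetic data standing in for the motion-capture database; the dimensions, the value of k, and the local regression from control signals to PCA coefficients are illustrative assumptions, not the authors' exact model (and the neighbor-graph pruning step is omitted).

```python
# Illustrative sketch of a local linear synthesis model: k-NN search with a
# K-D tree, then PCA-based reconstruction from the retrieved neighbors.
# Data sizes, k, and the local regression are assumptions for demonstration only.
import numpy as np
from scipy.spatial import cKDTree
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
controls = rng.normal(size=(5000, 6))      # low-dimensional control signals (e.g. from Kinect)
poses = rng.normal(size=(5000, 60))        # paired high-dimensional pose vectors

tree = cKDTree(controls)                   # K-D tree accelerates the neighbor search
query = rng.normal(size=6)
_, idx = tree.query(query, k=32)           # k most similar database entries

# Local linear model: PCA compresses the neighbor poses, and a regression from
# control signals to PCA coefficients synthesizes the new pose.
pca = PCA(n_components=8).fit(poses[idx])
coeffs = pca.transform(poses[idx])
reg = LinearRegression().fit(controls[idx], coeffs)
new_pose = pca.inverse_transform(reg.predict(query[None, :]))[0]
print(new_pose.shape)                      # (60,) reconstructed high-dimensional pose
```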

5.
Parallel coordinate plots (PCPs) are commonly used in information visualization to provide insight into multi-variate data. These plots help to spot correlations between variables. PCPs have been successfully applied to unstructured datasets of up to a few million points. In this paper, we present techniques to enhance the usability of PCPs for the exploration of large, multi-timepoint volumetric data sets containing tens of millions of points per timestep. The main difficulties that arise when applying PCPs to large numbers of data points are visual clutter and slow performance, making interactive exploration infeasible; moreover, the spatial context of the volumetric data is usually lost. We describe techniques for preprocessing using data quantization and compression, and for fast GPU-based rendering of PCPs using joint density distributions for each pair of consecutive variables, resulting in a smooth, continuous visualization. Fast brushing techniques are also proposed for interactive data selection in multiple linked views, including a 3D spatial volume view. These techniques have been successfully applied to three large data sets: Hurricane Isabel (Vis'04 contest), the ionization front instability data set (Vis'08 design contest), and data from a large-eddy simulation of cumulus clouds. With these data, we show how PCPs can be extended to successfully visualize and interactively explore multi-timepoint volumetric datasets with an order of magnitude more data points.
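The core density idea, binning each pair of consecutive axes into a joint histogram, can be sketched with plain NumPy; the data, bin counts, and log scaling below are placeholders, and the quantization, compression, and GPU rendering stages are omitted.

```python
# Joint density distributions for consecutive parallel-coordinate axes.
# Each 2D histogram corresponds to the ribbon drawn between two adjacent axes;
# data and bin counts are placeholders, and GPU rendering is omitted.
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(1_000_000, 5))     # 1M points, 5 variables (one timestep)

bins = 256
densities = []
for j in range(data.shape[1] - 1):
    hist, xedges, yedges = np.histogram2d(data[:, j], data[:, j + 1], bins=bins)
    densities.append(np.log1p(hist))       # log scale tames heavy overplotting
print(len(densities), densities[0].shape)  # 4 pairwise densities, each 256x256
```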

6.
高茂庭, 陆鹏. 《计算机应用》 (Journal of Computer Applications), 2008, 28(6): 1411-1413
A genetic algorithm is used to optimize the projection directions, and a projection pursuit model projects high-dimensional text feature data onto a 2- or 3-dimensional visualizable space; the projected feature values of the high-dimensional data in this low-dimensional space reflect its linear and nonlinear structures or features, achieving dimensionality reduction and visualization of text features. This not only greatly reduces the computational complexity of the text mining process, but also helps determine the number of initial centers in the K-means clustering algorithm and improves its accuracy. Experiments verify the effectiveness of this method for text feature dimensionality reduction.

7.
Knowledge extraction from large amounts of data is an effective approach for the analysis and monitoring of industrial processes. The self-organizing map (SOM) is a useful method for this purpose, because it is able to discover low-dimensional structures in high-dimensional spaces and produce a mapping onto an ordered low-dimensional space that can be visualized and preserves the most important relationships. With the aim of extracting knowledge about the dynamics of industrial processes, we define 2D SOM maps that represent dynamic features useful for common tasks in control engineering, such as analyzing the time response, the coupling among variables, or the difficulties in controlling MIMO (multiple-input and multiple-output) systems. These new maps make it possible to discover, extend, or confirm knowledge about the system across the entire operating range. A well-known quadruple-tank MIMO system was used to test the usefulness of these maps. First, we analyze the theoretical dynamic behaviors obtained from the physical equations of the system; after that, we analyze experimental data from an industrial pilot plant.
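A compact, self-contained SOM sketch (not the authors' implementation) is shown below: synthetic process feature vectors are mapped onto an ordered 2D grid whose trained units can be colored per feature to form component planes. Map size, learning schedule, and the feature vectors themselves are arbitrary assumptions.

```python
# Minimal self-organizing map trained on (synthetic) process feature vectors.
# Map size, learning schedule and the features themselves are illustrative only.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 8))             # e.g. 8 dynamic features per operating point

rows, cols, dim = 12, 12, X.shape[1]
W = rng.normal(size=(rows, cols, dim))     # codebook vectors
grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)

n_iter = 10_000
for t in range(n_iter):
    x = X[rng.integers(len(X))]
    # Best-matching unit (BMU): codebook vector closest to the sample.
    bmu = np.unravel_index(np.argmin(((W - x) ** 2).sum(axis=2)), (rows, cols))
    # Learning rate and neighborhood radius decay over time.
    lr = 0.5 * (1 - t / n_iter)
    sigma = max(0.5, 6.0 * (1 - t / n_iter))
    dist2 = ((grid - np.array(bmu)) ** 2).sum(axis=-1)
    h = np.exp(-dist2 / (2 * sigma ** 2))[..., None]
    W += lr * h * (x - W)                  # pull the BMU neighborhood toward the sample

# Each trained unit can now be colored by one feature to form a 2D component map.
print(W[..., 0].shape)                     # (12, 12) plane for feature 0
```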

8.
曹小鹿, 辛云宏. 《计算机应用》 (Journal of Computer Applications), 2017, 37(10): 2819-2822
Dimensionality reduction is a central problem in big-data analysis and visualization. Dimensionality reduction algorithms based on probability distribution models work by optimizing a cost function between a model of the high-dimensional data and a model of the low-dimensional data, so the key to this strategy is constructing the probability distribution model that best captures the characteristics of the data. On this basis, the Wasserstein distance is introduced into dimensionality reduction, and W-map, a nonlinear dimensionality reduction algorithm based on a Wasserstein-distance probability distribution model, is proposed. W-map builds similar Wasserstein flows in the high-dimensional data space and its corresponding low-dimensional space, turning dimensionality reduction into a minimum-transport problem. While solving the Wasserstein-distance minimization problem, the best-matching low-dimensional projection is sought under the principle that the Wasserstein flow model of the data should be the same in the high-dimensional space as in the low-dimensional space. Experiments on three data sets show that, compared with traditional probability-distribution-model methods, W-map produces high-dimensional data visualizations that are both accurate and robust.
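SciPy's 1D Wasserstein distance gives a feel for the transport cost being minimized. The comparison of pairwise-distance distributions below is a heavily simplified stand-in for illustration only, not the W-map formulation from the paper.

```python
# Simplified illustration of the Wasserstein distance as a cost between a
# high-dimensional data set and a candidate low-dimensional embedding.
# Comparing pairwise-distance distributions like this is a stand-in, not the
# W-map algorithm from the paper.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import wasserstein_distance
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 50))             # high-dimensional data
Y = PCA(n_components=2).fit_transform(X)   # one candidate 2D embedding

# 1D Wasserstein distance between the two distributions of pairwise distances;
# a better embedding should make this transport cost smaller.
cost = wasserstein_distance(pdist(X), pdist(Y))
print(f"distance-distribution transport cost: {cost:.3f}")
```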

9.
With the rapid development of information technology and the arrival of the big-data era, data exhibit complex characteristics such as high dimensionality and nonlinearity. For high-dimensional data, it is often hard to find feature regions that reflect the distribution pattern in the full-dimensional space, while most traditional clustering algorithms scale well only to low-dimensional data; consequently, when traditional clustering algorithms are applied to high-dimensional data, the resulting clusters may not meet current needs. Subspace clustering algorithms search for clusters that exist in subspaces of the high-dimensional data, dividing the original feature space into different feature subsets, reducing the influence of irrelevant features, and retaining the main features of the original data. Subspace clustering can reveal information in high-dimensional data that is otherwise hard to expose, and visualization techniques can then show the intrinsic structure of data attributes and dimensions, providing an effective means for visual analysis of high-dimensional data. This paper surveys recent research progress on subspace-clustering-based visual analysis of high-dimensional data, covering three classes of methods (feature selection, subspace exploration, and subspace clustering), analyzes their interactive analysis methods and applications, and discusses future trends in visual analysis methods for high-dimensional data.

10.
A compressed pyramid tree is proposed. The basic idea is to first partition the d-dimensional data space into 2d pyramids; since information that is ineffective in a low-dimensional space is usually also ineffective in the high-dimensional data space, a γ-partition strategy is used to compress the data in the low-dimensional space, reducing the size of the index structure and overcoming the drawbacks of the pyramid technique. Insertion, query, and deletion algorithms for the compressed pyramid tree are given. Experiments show that the compressed pyramid tree is an effective space-partitioning strategy with good performance in sparse high-dimensional spaces.
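For reference, the classic pyramid-technique mapping that assigns each point of the unit cube to one of 2d pyramids and a scalar pyramid value is sketched below; the compressed variant and the γ-partition strategy from this abstract are not reproduced.

```python
# Classic pyramid-technique mapping of a point in [0, 1]^d to one of 2d pyramids
# and a scalar pyramid value (which a B+-tree can then index). The compressed
# variant and the gamma-partition strategy of the paper are not reproduced here.
import numpy as np

def pyramid_value(p):
    """Return (pyramid number, pyramid value) for a point p in the unit cube."""
    p = np.asarray(p, dtype=float)
    d = p.size
    dev = np.abs(p - 0.5)                      # deviation from the center in each dimension
    jmax = int(np.argmax(dev))                 # dimension of maximum deviation
    i = jmax if p[jmax] < 0.5 else jmax + d    # one of the 2d pyramids
    height = dev[jmax]                         # height of the point within its pyramid
    return i, i + height                       # pyramid value = pyramid number + height

print(pyramid_value([0.9, 0.4, 0.5]))          # pyramid 3 (= 0 + d) with height 0.4 for d = 3
```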

11.
吕兵, 王华珍. 《计算机应用》 (Journal of Computer Applications), 2014, 34(6): 1613-1617
Most current methods for mining high-dimensional data rely on mathematical theory rather than visual intuition. To facilitate intuitive analysis and evaluation of high-dimensional data, a random forest (RF) based approach to data visualization is proposed. First, an RF is trained in a supervised manner to obtain a similarity measure between samples, which is then reduced with principal coordinates analysis to transform the relational information of the high-dimensional data into a low-dimensional space; the low-dimensional data are then visualized with scatterplots. Experiments on high-dimensional gene data sets show that visualization based on RF-supervised dimensionality reduction reveals the class distribution of high-dimensional data well and outperforms visualization after traditional unsupervised dimensionality reduction.
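A sketch of the supervised RF-proximity plus principal coordinates pipeline described above is shown below, using scikit-learn on a toy classification set; the gene data and the authors' parameter choices are not reproduced.

```python
# Random-forest proximity between samples, reduced to 2D with principal
# coordinates analysis (classical MDS) for a supervised visualization.
# Toy data and parameters stand in for the gene data sets used in the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=100, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
leaves = rf.apply(X)                               # (n_samples, n_trees) leaf indices
# Proximity = fraction of trees in which two samples fall into the same leaf.
prox = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Principal coordinates analysis on the dissimilarity 1 - proximity.
D2 = (1.0 - prox) ** 2
J = np.eye(len(D2)) - np.ones_like(D2) / len(D2)
B = -0.5 * J @ D2 @ J                              # double-centered matrix
vals, vecs = np.linalg.eigh(B)
coords = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0.0))   # top-2 coordinates
print(coords.shape)                                # (300, 2), ready for a scatterplot
```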

12.
Speech signals have high dimensionality after transformation to the frequency domain, and manifold learning can automatically discover the regularity of the low-dimensional structures latent in high-dimensional data, so a manifold-learning approach to dimensionality reduction is proposed for Mandarin digit speech recognition. The locally linear embedding (LLE) algorithm is used to extract low-dimensional manifold structure features from the high-dimensional frequency-domain data, and the reduced data are then fed into a dynamic time warping (DTW) recognizer. Simulation results show that, compared with the commonly used MFCC acoustic features, Mandarin digit speech recognition with LLE uses fewer dimensions, improves the recognition rate by 1.2%, and effectively increases recognition speed.
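A rough sketch of the pipeline pairs scikit-learn's LLE with a small hand-written DTW distance; the feature dimensions, neighbor counts, and synthetic frame data are placeholders rather than the paper's settings.

```python
# Locally linear embedding of per-frame spectral features, followed by a simple
# dynamic time warping distance between reduced sequences. Feature dimensions,
# neighbor counts and the synthetic data are placeholders only.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(4)
frames = rng.normal(size=(400, 129))       # e.g. magnitude spectra of speech frames

lle = LocallyLinearEmbedding(n_neighbors=12, n_components=8)
reduced = lle.fit_transform(frames)        # low-dimensional manifold features

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping with Euclidean local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Compare one reduced utterance against a reduced template of a spoken digit.
print(dtw_distance(reduced[:200], reduced[200:]))
```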

13.
Mapping high-dimensional data into a low-dimensional space, for example for visualization, is a problem of increasing importance in data analysis. This paper presents data-driven high-dimensional scaling (DD-HDS), a nonlinear mapping method in the line of the multidimensional scaling (MDS) approach, based on the preservation of distances between pairs of data. It improves on existing competitors with respect to the representation of high-dimensional data in two ways: it introduces (1) a specific weighting of distances between data points that takes into account the concentration-of-measure phenomenon, and (2) a symmetric handling of short distances in the original and output spaces, avoiding false neighbor representations while still allowing some necessary tears in the original distribution. More precisely, the weighting is set according to the effective distribution of distances in the data set, with a single user-defined parameter setting the tradeoff between local neighborhood preservation and global mapping. The optimization of the stress criterion designed for the mapping is realized by force-directed placement (FDP). Mappings of low- and high-dimensional data sets are presented as illustrations of the features and advantages of the proposed algorithm. The weighting function specific to high-dimensional data and the symmetric handling of short distances can easily be incorporated into most distance-preservation-based nonlinear dimensionality reduction methods.

14.
To address the problem that linear dimensionality reduction techniques fail to give satisfactory results on data with nonlinear structure, a new nonlinear stochastic dimensionality reduction algorithm (NNSE) is proposed that focuses on preserving the nearest-neighbor information of the high-dimensional space. The algorithm first finds each sample's nearest neighbors in the high-dimensional space by computing Euclidean distances, then generates a random initial layout in the low-dimensional space, and finally moves each low-dimensional point repeatedly toward the mean position of its nearest neighbors until a stable low-dimensional embedding is produced. Compared with a state-of-the-art nonlinear stochastic dimensionality reduction algorithm, t-distributed stochastic neighbor embedding (t-SNE), the low-dimensional results of NNSE are visually similar to those of t-SNE, but quantitative comparison shows that NNSE clearly outperforms t-SNE in preserving nearest-neighbor information.
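A naive reading of the update rule described in this abstract is sketched below. Re-standardizing the layout each iteration is an assumption added here to keep the embedding from collapsing to a point and is not necessarily what the published NNSE algorithm does.

```python
# Naive sketch of the neighbor-mean update described in the abstract:
# neighbors are fixed in the high-dimensional space, and low-dimensional points
# are repeatedly pulled toward the mean position of those neighbors.
# Re-standardizing the layout each iteration is an added assumption to prevent
# collapse and is not necessarily part of the published NNSE algorithm.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 30))                      # high-dimensional samples

k = 10
nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)    # +1 because each point is its own neighbor
idx = nbrs.kneighbors(X, return_distance=False)[:, 1:]

Y = rng.normal(size=(len(X), 2))                     # random initial 2D layout
for _ in range(200):
    Y = 0.5 * Y + 0.5 * Y[idx].mean(axis=1)          # move toward neighbor mean positions
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)         # keep the layout from collapsing
print(Y.shape)                                       # stable 2D embedding (1000, 2)
```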

15.
A Comparative Study of Several Manifold Learning Algorithms
The main goal of manifold learning is to discover meaningful low-dimensional embedding information hidden in the manifolds of high-dimensional data spaces. Most current manifold learning algorithms are used for nonlinear dimensionality reduction or data visualization, such as isometric mapping (Isomap), locally linear embedding (LLE), and Laplacian eigenmaps. This paper experimentally analyzes and compares these three manifold learning algorithms in order to understand their characteristics and thus better perform dimensionality reduction and analysis of data.
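The three algorithms compared here are all available in scikit-learn, so a minimal side-by-side run on a standard Swiss-roll data set (an illustrative choice, not this paper's data or parameters) looks like this:

```python
# Side-by-side run of Isomap, LLE and Laplacian eigenmaps (SpectralEmbedding)
# on a Swiss roll; the data set and parameters are illustrative, not the paper's.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding, SpectralEmbedding

X, color = make_swiss_roll(n_samples=1500, random_state=0)

embeddings = {
    "Isomap": Isomap(n_neighbors=12, n_components=2),
    "LLE": LocallyLinearEmbedding(n_neighbors=12, n_components=2),
    "Laplacian eigenmap": SpectralEmbedding(n_neighbors=12, n_components=2),
}
for name, model in embeddings.items():
    Y = model.fit_transform(X)
    print(f"{name}: embedded shape {Y.shape}")   # each result can then be scatter-plotted
```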

16.
王淑娥, 孙劲光. 《计算机应用》 (Journal of Computer Applications), 2008, 28(10): 2565-2568
A compressed pyramid tree is proposed that partitions the d-dimensional data space into 2d pyramids. Since information that is ineffective in a low-dimensional space is usually also ineffective in the high-dimensional data space, a γ-partition strategy is used to compress the data in the low-dimensional space, reducing the size of the index structure and overcoming the drawbacks of the pyramid technique. The construction of the compressed pyramid tree and query algorithms based on it are given. Experiments show that the compressed pyramid tree is an effective space-partitioning strategy with good performance in sparse high-dimensional spaces.

17.
In pattern classification, the Fisher criterion and the K-L (Karhunen-Loève) transform map sample data from a high-dimensional feature space to a low-dimensional feature space to extract features, whereas the SVM (support vector machine) uses the implicit mapping of a kernel function to map sample data from a low-dimensional feature space into a high-dimensional feature space for classification. This paper runs simulation experiments classifying the Iris data set with the three methods, analyzes and compares the results, and discusses the similarities, differences, and inherent connections among the three methods in pattern classification applications.
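Since this paper works on the classic Iris data set, the three approaches it compares can be sketched directly with scikit-learn; the parameters below are library defaults, not the settings used in the paper, and the nearest-neighbor classifier after PCA is an added illustrative choice.

```python
# Fisher criterion (LDA), K-L transform (PCA) and a kernel SVM on the Iris data,
# mirroring the three approaches the paper compares; parameters are defaults,
# not the settings used in the paper.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

models = {
    "Fisher/LDA (projects 4D -> 2D, then classifies)": LinearDiscriminantAnalysis(n_components=2),
    "K-L transform/PCA + nearest neighbor": make_pipeline(PCA(n_components=2), KNeighborsClassifier()),
    "SVM with RBF kernel (implicit high-dimensional map)": SVC(kernel="rbf"),
}
for name, model in models.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy {score:.3f}")
```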

18.
Volumetric datasets with multiple variables per voxel over multiple time steps are often complex, especially when considering the exponentially large attribute space formed by the variables in combination with the spatial and temporal dimensions. It is intuitive, practical, and thus often desirable to interactively select a subset of the data from within that high-dimensional value space for efficient visualization. This approach is straightforward to implement if the dataset is small enough to be stored entirely in-core. However, to handle datasets sized at hundreds of gigabytes and beyond, this simplistic approach becomes infeasible and more sophisticated solutions are needed. In this work, we developed a system that supports efficient visualization of an arbitrary subset, selected by range queries, of a large multivariate time-varying dataset. By employing specialized data structures and data distribution schemes, our system can leverage a large number of networked computers as parallel data servers, and it guarantees a near-optimal load balance. We demonstrate our system of scalable data servers using two large time-varying simulation datasets.
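At its simplest, the range-query selection over the multivariate value space amounts to combined boolean predicates, one per variable. The single-machine NumPy version below ignores the paper's parallel data servers, data distribution scheme, and load balancing; the array shapes and value ranges are placeholders.

```python
# Range-query selection of a subset from a multivariate, time-varying volume.
# This single-machine NumPy version ignores the paper's parallel data servers,
# data distribution scheme and load balancing; shapes and ranges are placeholders.
import numpy as np

rng = np.random.default_rng(6)
# (time, z, y, x, variable): 4 timesteps of a 64^3 volume with 3 variables.
data = rng.normal(size=(4, 64, 64, 64, 3))

# Range query over the value space: variable 0 in [0.5, 2.0] AND variable 2 < 0.
mask = (data[..., 0] >= 0.5) & (data[..., 0] <= 2.0) & (data[..., 2] < 0.0)
selected = data[mask]                      # (n_selected, 3) values for rendering/statistics
print(mask.sum(), "voxels selected of", mask.size)
```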

19.
Applications in the water treatment domain generally rely on complex sensors located at remote sites. The processing of the corresponding measurements for generating higher-level information, such as the optimization of coagulation dosing, must therefore account for possible sensor failures and imperfect input data. In this paper, self-organizing map (SOM)-based methods are applied to multiparameter data validation and missing data reconstruction in drinking water treatment. The SOM is a special kind of artificial neural network that can be used for the analysis and visualization of large high-dimensional data sets. It performs a nonlinear mapping from a high-dimensional data space to a low-dimensional space that aims to preserve the most important topological and metric relationships of the original data elements, and it thereby inherently clusters the data. Combining the SOM results with those obtained by a fuzzy technique that uses the marginal adequacy concept to identify the functional states (normal or abnormal), the validation and reconstruction performance of the SOM is tested successfully on experimental data stemming from a coagulation process involved in drinking water treatment.
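The reconstruction step can be sketched independently of training: given a trained codebook, the best-matching unit is found using only the observed components, and its codebook entries fill the gaps. The random codebook below merely stands in for a SOM trained on historical water-quality data.

```python
# SOM-style reconstruction of missing sensor values: find the best-matching unit
# using only the observed components, then copy the missing components from its
# codebook vector. The random codebook stands in for a SOM trained on historical
# water-quality measurements.
import numpy as np

rng = np.random.default_rng(7)
codebook = rng.normal(size=(15, 15, 6))            # trained 15x15 map, 6 sensor variables

sample = np.array([0.3, np.nan, 1.2, -0.4, np.nan, 0.8])
observed = ~np.isnan(sample)

# Distance to every unit computed over observed components only.
diff = codebook[..., observed] - sample[observed]
bmu = np.unravel_index(np.argmin((diff ** 2).sum(axis=-1)), codebook.shape[:2])

reconstructed = sample.copy()
reconstructed[~observed] = codebook[bmu][~observed]
print(reconstructed)                               # missing entries filled from the BMU
```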

20.

Effective dimension, an indicator for the difficulty of high-dimensional integration, describes whether a function can be well approximated by low-dimensional terms or sums of low-order terms. Some problems in option pricing are believed to have low effective dimensions, which help explain the success of quasi-Monte Carlo (QMC) methods recently observed in financial engineering. This paper provides a way of studying the structure of effective dimensions by finding a proper space the function of interest belongs to and then determining the effective dimension of that space. To this end, we extend the definitions of effective dimensions to weighted function spaces with product-order-dependent weights and give bounds on norms and variances. Furthermore, we show that the proposed method is applicable to functions arising in option pricing and consequently offers some hints on the performance of QMC methods.
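A small quasi-Monte Carlo experiment of the kind discussed here can be run with SciPy's Sobol sampler; the arithmetic Asian call and its parameters below are generic textbook choices rather than anything taken from the paper.

```python
# Quasi-Monte Carlo pricing of an arithmetic Asian call with a scrambled Sobol
# sequence, compared against plain Monte Carlo. The payoff and its parameters
# are generic textbook choices, not taken from the paper.
import numpy as np
from scipy.stats import norm, qmc

s0, strike, r, sigma, T, d = 100.0, 100.0, 0.05, 0.2, 1.0, 32   # d monitoring dates
dt = T / d
n = 2 ** 14

def asian_call_payoffs(z):
    # Each row of z holds d standard normal increments of one Brownian path.
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1)
    avg_price = s0 * np.exp(log_paths).mean(axis=1)
    return np.exp(-r * T) * np.maximum(avg_price - strike, 0.0)

# Plain Monte Carlo vs. randomized QMC (scrambled Sobol points mapped to normals).
rng = np.random.default_rng(8)
mc_price = asian_call_payoffs(rng.standard_normal((n, d))).mean()
sobol = qmc.Sobol(d=d, scramble=True, seed=8).random(n)
qmc_price = asian_call_payoffs(norm.ppf(np.clip(sobol, 1e-12, 1 - 1e-12))).mean()
print(f"MC: {mc_price:.4f}   QMC: {qmc_price:.4f}")
```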

