Similar literature
 20 similar documents found (search time: 15 ms)
1.
2.
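Visualization techniques are becoming increasingly important for analyzing and exploring large multidimensional data sets. One of the most important is pixel-oriented visualization, whose basic principle is to map each data value in the data set to a pixel on the screen and to arrange these pixels according to fixed rules, so that as many data objects as possible are shown on the screen as familiar graphics. The recursive pattern technique is one pixel-oriented visualization technique. It is based on a simple back-and-forth arrangement, lets the user take part in defining the structure and setting the parameters, and is mainly suited to data sets with a natural order. In stock data analysis, the recursive pattern technique makes it relatively easy to depict how stock prices change in a transaction database and to predict stock trends.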
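The back-and-forth arrangement mentioned in the abstract above can be illustrated with a minimal sketch. It covers only a single level of the pattern (one row width), not the full recursive scheme, and the function name `snake_layout` and the use of `None` for empty cells are assumptions made for illustration.

```python
def snake_layout(values, width):
    """Arrange an ordered sequence into rows of `width` cells,
    alternating left-to-right and right-to-left (back-and-forth)."""
    rows = []
    for start in range(0, len(values), width):
        row = list(values[start:start + width])
        # Pad an incomplete final row so every row has the same width.
        row += [None] * (width - len(row))
        if (start // width) % 2 == 1:
            # Odd rows run right-to-left, so consecutive values stay adjacent.
            row.reverse()
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in snake_layout(list(range(8)), 3):
        print(row)
```

Each cell would then be colored according to its data value; because neighboring values in the sequence stay neighbors on screen, ordered data such as daily stock prices form visually continuous bands.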

3.
For classifying large data sets, we propose a discriminant kernel that introduces a nonlinear mapping from the joint space of input data and output label to a discriminant space. Our method differs from traditional ones, which map nonlinearly from the input space to a feature space. The induced distance of our discriminant kernel is Euclidean and Fisher separable, as it is defined from distance vectors of the feature space to distance vectors on the discriminant space. Unlike support vector machines or kernel Fisher discriminant analysis, the classifier does not need to solve a quadratic programming problem or eigen-decomposition problems. It is therefore especially appropriate for problems that involve processing large data sets. The classifier can be applied to face recognition, shape comparison and image classification benchmark data sets. The method is significantly faster than other methods and yet delivers comparable classification accuracy.

4.
We introduce multivalue data as a new data type in the context of scientific visualization. While this data type has existed in other fields, the visualization community has largely ignored it. Formally, a multivalue datum is a collection of values about a single variable. Multivalue data sets can be defined for multiple dimensions. A spatial multivalue data set consists of a multivalue datum at each physical location in the domain. The time dimension is equally valid. This leads to spatio-temporal multivalue data sets where there is time-varying, multidimensional data with a multivalue datum at each location and time. The spatial multivalue data type captures multiple instances of the same variable at each location in space. Visualizing spatial multivalue data sets is a new challenge.

5.
We present a new model for creating composite visualizations of multidimensional data sets using simple visual representations such as point charts, scatterplots and parallel coordinates as components. Each visual representation is contained in a tile, and the tiles are arranged in a mosaic of views using a space‐filling slice‐and‐dice layout. Tiles can be created, resized, split or merged using a versatile set of interaction techniques, and the visual representation of individual tiles can also be dynamically changed to another representation. Because each tile is self‐contained and independent, it can be implemented in any programming language, on any platform and using any visual representation. We also propose a formalism for expressing visualization mosaics. A Web‐based implementation called MosaicJS supporting multidimensional visual exploration showcases the versatility of the concept and illustrates how it can be used to integrate visualization components provided by different toolkits.

6.
The naive Bayes model has proven to be a simple yet effective model, which is very popular for pattern recognition applications such as data classification and clustering. This paper explores the possibility of using this model for multidimensional data visualization. To achieve this, a new learning algorithm called naive Bayes self-organizing map (NBSOM) is proposed to enable the naive Bayes model to perform topographic mappings. The training is carried out by means of an online expectation maximization algorithm with a self-organizing principle. The proposed method is compared with principal component analysis, self-organizing maps, and generative topographic mapping on two benchmark data sets and a real-world image processing application. Overall, the results show the effectiveness of NBSOM for multidimensional data visualization.

7.
This paper presents a gesture recognition system for visualization navigation. Scientists are interested in developing interactive settings for exploring large data sets in an intuitive environment. The input consists of registered 3-D data. A geometric method using Bezier curves is used for the trajectory analysis and classification of gestures. The hand gesture speed is incorporated into the algorithm to enable correct recognition from trajectories with variations in hand speed. The method is robust and reliable: the correct hand identification rate is 99.9% (from 1641 frames), modes of hand movements are correct 95.6% of the time, and the recognition rate (given the right mode) is 97.9%. An application to gesture-controlled visualization of 3-D bioinformatics data is also presented.

8.
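Nearest neighbor convex hull classifier based on sample selection   Cited by: 1 (self-citations: 0, citations by others: 1)
The nearest neighbor convex hull classification algorithm is a nearest neighbor classifier that uses the distance from a test point to the convex hull of each class's samples as its classification measure. However, the high computational complexity of solving its convex quadratic programming problem limits its use on larger data sets. This paper proposes a sample selection method called subclass convex hull growing. It iteratively selects the point farthest from the convex hull of the samples selected so far until a termination condition is met, which effectively reduces the data set. Experiments on the ORL database and the training-synthetic set of the MIT-CBCL face recognition database show that the convex hull generated by the few samples that subclass convex hull growing selects represents the training set well, and that the method greatly speeds up the algorithm without degrading the performance of the nearest neighbor convex hull classifier.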
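To make the classification measure in the abstract above concrete, here is a minimal sketch of the distance from a test point to a class's convex hull, and of classification by the nearest hull. It swaps in a simple Frank-Wolfe iteration for the convex quadratic program, which is not necessarily the solver the paper uses; the function names, the iteration count and the toy data are all assumptions.

```python
import math


def hull_distance(x, points, iters=200):
    """Approximate the Euclidean distance from x to the convex hull of points
    using Frank-Wolfe steps with exact line search (illustrative solver)."""
    # Start from the sample closest to x; it always lies in the hull.
    y = list(min(points, key=lambda p: math.dist(p, x)))
    for _ in range(iters):
        # Gradient of 0.5 * ||y - x||^2 with respect to y.
        g = [yi - xi for yi, xi in zip(y, x)]
        # The linear subproblem over the hull is solved at one of the samples.
        s = min(points, key=lambda p: sum(gi * pi for gi, pi in zip(g, p)))
        d = [si - yi for si, yi in zip(s, y)]
        dd = sum(di * di for di in d)
        if dd == 0.0:
            break
        # Exact step length along d, clipped so y stays inside the hull.
        t = max(0.0, min(1.0, -sum(gi * di for gi, di in zip(g, d)) / dd))
        if t == 0.0:
            break  # no descent direction left: y is the closest hull point
        y = [yi + t * di for yi, di in zip(y, d)]
    return math.dist(y, x)


def nearest_hull_label(x, classes):
    """Assign x to the class whose sample convex hull is closest."""
    return min(classes, key=lambda label: hull_distance(x, classes[label]))


if __name__ == "__main__":
    classes = {
        "a": [(0, 0), (1, 0), (0, 1), (1, 1)],
        "b": [(3, 0), (4, 0), (3, 1)],
    }
    print(nearest_hull_label((2.2, 0.5), classes))
```

Every iteration scans all samples of the class, so the cost per test point grows with the training set size. That is the motivation for reducing each class to a small set of hull-defining samples before classification.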

9.
Visualization techniques are of increasing importance in exploring and analyzing large amounts of multidimensional information. One important class of visualization techniques that is particularly interesting for visualizing very large multidimensional data sets is the class of pixel-oriented techniques. The basic idea of pixel-oriented visualization techniques is to represent as many data objects as possible on the screen at the same time by mapping each data value to a pixel of the screen and arranging the pixels adequately. A number of different pixel-oriented visualization techniques have been proposed in recent years, and it has been shown that the techniques are useful for visual data exploration in a number of different application contexts. In this paper, we discuss a number of issues that are important in developing pixel-oriented visualization techniques. The major goal of this article is to provide a formal basis for pixel-oriented visualization techniques and to show that the design decisions in developing them can be seen as solutions of well-defined optimization problems. This is true for the mapping of the data values to colors, the arrangement of pixels inside the subwindows, the shape of the subwindows, and the ordering of the dimension subwindows. The paper also discusses the design issues of special variants of pixel-oriented techniques for visualizing large spatial data sets.

10.
Multidimensional data sets are common in many domains, and dimensionality reduction methods that determine a lower dimensional embedding are widely used for visualizing such data sets. This paper presents a novel method to project data onto a lower dimensional space by taking into account the order statistics of the individual data points, which are quantified by their depth or centrality in the overall set. Thus, in addition to conveying relative distances in the data, the proposed method also preserves the order statistics, which are often lost or misrepresented by existing visualization methods. The proposed method entails a modification of the optimization objective of conventional multidimensional scaling (MDS) by introducing a term that penalizes discrepancies between centrality structures in the original space and the embedding. We also introduce two strategies for visualizing lower dimensional embeddings of multidimensional data that take advantage of the coherent representation of centrality provided by the proposed projection method. We demonstrate the effectiveness of our visualization with comparisons on different kinds of multidimensional data, including categorical and multimodal, from a variety of domains such as botany and health care.

11.
Woodward, P.R. Computer, 1993, 26(10): 13-25
Examples of scientific visualization techniques used for the interactive exploration of very large data sets from supercomputer simulations of fluid flow are presented. Interactive rendering of images from simulations on grids of 2 million or more computational zones is required to drive high-end graphics workstations to their limits with 2-D data. The author presents one such image and discusses interactive steering of 2-D flow simulations, a phenomenon now possible with grids of half a million computational zones. He uses a simulation of compressible turbulence on a grid of 134 million computational zones to set the scale for discussing interactive 3-D visualization techniques. A concept for a gigapixel-per-second video wall, or gigawall, which could be built with present technology to meet the demands of interactive visualization of the data sets that will be produced by the next generation of supercomputers, is discussed.

12.
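Nearest convex hull classifier based on subspace sample selection   Cited by: 3 (self-citations: 0, citations by others: 3)
The nearest neighbor convex hull classifier must solve a convex quadratic programming problem for the distance from a test sample to the convex hull of the training set, so for large training sets it is necessary to perform suitable sample selection before classification. To this end, this paper proposes a nearest convex hull classification method based on subspace sample selection. The method first filters the training samples with a subspace sample selection algorithm and then uses the samples selected from each class as the new training set of the nearest neighbor classifier. The principle of subspace sample selection is to iteratively select, within the training samples of one class, the sample farthest from the subspace spanned by the samples already selected. In experiments on the training-synthetic subset of the MIT-CBCL face recognition database, the method reaches a 100% recognition rate with only 5.6% of the training samples, and its execution time is also much lower than that of the nearest neighbor convex hull classifier without sample selection.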
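The selection principle stated in the abstract above (repeatedly pick the sample farthest from the subspace spanned by the samples already picked) can be sketched with Gram-Schmidt residuals. This is an illustrative reading, not the paper's implementation: the function name, the stopping rule (a fixed budget `k` plus a tolerance) and the choice of the largest-norm sample as the first pick are assumptions.

```python
import math


def select_by_subspace(samples, k, tol=1e-9):
    """Greedily pick up to k sample indices, each one farthest from the
    subspace spanned by the samples picked before it."""
    # residuals[i] holds the part of sample i orthogonal to the current span.
    residuals = [[float(v) for v in s] for s in samples]
    chosen = []
    for _ in range(min(k, len(samples))):
        norms = [math.sqrt(sum(v * v for v in r)) for r in residuals]
        best = max(range(len(samples)), key=lambda i: norms[i])
        if norms[best] <= tol:
            break  # every sample already lies in the span of the chosen ones
        chosen.append(best)
        # Normalize the winning residual into a new orthonormal basis vector.
        q = [v / norms[best] for v in residuals[best]]
        # Remove the new direction from every residual (Gram-Schmidt step).
        for r in residuals:
            c = sum(a * b for a, b in zip(r, q))
            for j in range(len(r)):
                r[j] -= c * q[j]
    return chosen


if __name__ == "__main__":
    data = [(1, 0, 0), (2, 0, 0), (0, 3, 0), (1, 1, 0)]
    print(select_by_subspace(data, 3))
```

Because the distance to the span is just the norm of the residual, each round costs one pass over the class's samples, and the loop stops early once the chosen samples span all the others.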

13.
The problem of data visualization in the analysis of two classes in a multidimensional feature space is considered. The two orthogonal axes along which the classes are maximally separated from each other are found by mapping the classes through a linear transformation of coordinates. The proximity of the classes is estimated based on the minimum-distance criterion between their convex hulls. This criterion makes it possible to show cases of full class separability and random outliers. A support vector machine is used to obtain orthogonal vectors of the reduced space. For linearly separable classes, this method yields the weight vector that determines the minimum distance between the convex hulls of the classes. Algorithms with reduction, contraction, and offset of convex hulls are used for intersecting classes. Experimental studies are devoted to the application of the considered visualization methods to biomedical data analysis.

14.
Testing for uniformity in multidimensional data   Cited by: 1 (self-citations: 0, citations by others: 1)
Testing for uniformity in multidimensional data is important in exploratory pattern analysis, statistical pattern recognition, and image processing. The goal of this paper is to determine whether the data follow the uniform distribution over some compact convex set in K-dimensional space, called the sampling window. We first provide a simple, computationally efficient method for generating a uniformly distributed sample over a set which approximates the convex hull of the data. We then test for uniformity by comparing this generated sample to the data by using Friedman-Rafsky's minimal spanning tree (MST) based test. Experiments with both simulated and real data indicate that this MST-based test is useful in deciding if data are uniform.
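The core quantity behind an MST-based two-sample comparison like the one in the abstract above is the number of minimal spanning tree edges that join points from different samples. The sketch below computes only that raw count with Prim's algorithm; it leaves out the generation of the uniform reference sample and the normalization into a test statistic, and the function name is an assumption.

```python
import math


def mst_cross_edges(sample_a, sample_b):
    """Build the Euclidean MST of the pooled points (Prim's algorithm) and
    count the edges whose endpoints come from different samples."""
    points = list(sample_a) + list(sample_b)
    labels = [0] * len(sample_a) + [1] * len(sample_b)
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known connection to the tree
    parent = [-1] * n           # tree vertex that provides that connection
    if n:
        best_dist[0] = 0.0
    cross = 0
    for _ in range(n):
        # Add the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if parent[u] >= 0 and labels[parent[u]] != labels[u]:
            cross += 1
        # Update attachment costs of the points still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u
    return cross


if __name__ == "__main__":
    a = [(0, 0), (0, 1), (1, 0)]
    b = [(10, 10), (10, 11), (11, 10)]
    print(mst_cross_edges(a, b))
```

When the two samples are well mixed, many tree edges cross between them; when they occupy different regions, only a few do. A low count relative to what is expected under the null hypothesis is therefore evidence that the data differ from the uniform reference sample.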

15.
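Under a given partition of the sample space, each data set is assumed to be a sample from an underlying multinomial distribution. Through maximum likelihood estimation (MLE) of the model parameters, the underlying distribution of a data set is approximated by a discretized empirical distribution. Using the Fisher metric of the generalized family of multinomial distributions, the information difference between underlying distributions can be approximated by the difference between empirical distributions, which makes unsupervised learning possible on the information manifold obtained by MLE-based embedding. When the reduced space has dimension 2 or 3, the natural separability of the original data sets can be shown through the dimension-reduced data. Experimental results show that the method can be applied to the visualization of high-dimensional structured data such as large-sample data sets or color images.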

16.
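ZedGraph is an open-source control that provides both a user control and a web control. It can create 2D line charts, bar charts, and pie charts. The paper introduces the main classes of ZedGraph and multidimensional data, discusses how to apply the ZedGraph control to the graphical display of multidimensional data, and shows that ZedGraph makes visualizing multidimensional data simple and convenient.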

17.
New asymptotic methods are introduced that permit computationally simple Bayesian recognition and parameter estimation for many large data sets described by a combination of algebraic, geometric, and probabilistic models. The techniques introduced permit controlled decomposition of a large problem into small problems for separate parallel processing where maximum likelihood estimation or Bayesian estimation or recognition can be realized locally. These results can be combined to arrive at globally optimum estimation or recognition. The approach is applied to the maximum likelihood estimation of 3-D complex-object position. To this end, the surface of an object is modeled as a collection of patches of primitive quadrics, i.e., planar, cylindrical, and spherical patches, possibly augmented by boundary segments. The primitive surface-patch models are specified by geometric parameters, reflecting location, orientation, and dimension information. The object-position estimation is based on sets of range data points, each set associated with an object primitive. Probability density functions are introduced that model the generation of range measurement points. This entails the formulation of a noise mechanism in three-space accounting for inaccuracies in the 3-D measurements and possibly for inaccuracies in the 3-D modeling. We develop the necessary techniques for optimal local parameter estimation and primitive boundary or surface type recognition for each small patch of data, and then optimal combining of these inaccurate locally derived parameter estimates in order to arrive at roughly globally optimum object-position estimation.

18.
In this paper, a new algorithm named polar self-organizing map (PolSOM) is proposed. PolSOM is constructed on a 2-D polar map with two variables, radius and angle, which represent data weight and feature, respectively. Compared with the traditional algorithms that project data on a Cartesian map by using the Euclidean distance as the only variable, PolSOM not only preserves the data topology and the inter-neuron distance but also visualizes the differences among clusters in terms of weight and feature. In PolSOM, the visualization map is divided into tori and circular sectors by radial and angular coordinates, and neurons are set on the boundary intersections of circular sectors and tori as benchmarks to attract the data with similar attributes. Every datum is projected on the map with polar coordinates that are trained towards the winning neuron. As a result, similar data group together, and data characteristics are reflected by their positions on the map. The simulations and comparisons with Sammon's mapping, SOM and ViSOM are provided based on four data sets. The results demonstrate the effectiveness of the PolSOM algorithm for multidimensional data visualization.

19.
Techniques for multidimensional scaling visualize objects as points in a low-dimensional metric map. As a result, the visualizations are subject to the fundamental limitations of metric spaces. These limitations prevent multidimensional scaling from faithfully representing non-metric similarity data such as word associations or event co-occurrences. In particular, multidimensional scaling cannot faithfully represent intransitive pairwise similarities in a visualization, and it cannot faithfully visualize “central” objects. In this paper, we present an extension of a recently proposed multidimensional scaling technique called t-SNE. The extension aims to address the problems of traditional multidimensional scaling techniques when these techniques are used to visualize non-metric similarities. The new technique, called multiple maps t-SNE, alleviates these problems by constructing a collection of maps that reveal complementary structure in the similarity data. We apply multiple maps t-SNE to a large data set of word association data and to a data set of NIPS co-authorships, demonstrating its ability to successfully visualize non-metric similarities.

20.
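Data visualization is usually the most effective way to show the value of data. For large-scale, complex multidimensional data, analyzing the relevant data subsets and automatically mapping the results to suitable visual presentation modes is complex technical work that requires a great deal of iterative computation. The DRVisSys system is designed and implemented; it recommends suitable visual presentation modes based on attribute association analysis. To select nontrivial attribute combinations, it uses a canonical correlation algorithm to compute better attribute sets. Because attribute weights differ in practice, a cascaded hidden Markov algorithm is used to compute the weight of each attribute, and the attribute weights serve as one of the evaluation criteria for nontrivial attribute groups. So that the recommended presentation modes better meet user needs, DRVisSys updates its visualization recommendation model according to user feedback. Experimental results show that DRVisSys can analyze data quickly and recommend suitable visual presentation modes to users.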
