Similar Literature
20 similar documents found.
1.
Uncertainty is ubiquitous in science, engineering and medicine. Drawing conclusions from uncertain data is the normal case, not an exception. While the field of statistical graphics is well established, only a few 2D and 3D visualization and feature extraction methods have been devised that consider uncertainty. We present mathematical formulations for uncertain equivalents of isocontours based on standard probability theory and statistics and employ them in interactive visualization methods. As input data, we consider discretized uncertain scalar fields and model these as random fields. To create a continuous representation suitable for visualization, we introduce interpolated probability density functions. Furthermore, we introduce numerical condition as a general means in feature-based visualization. The condition number, which potentially diverges in the isocontour problem, describes how errors in the input data are amplified in feature computation. We show how the average numerical condition of isocontours aids the selection of thresholds that correspond to robust isocontours. Additionally, we introduce the isocontour density and the level crossing probability field; these two measures for the spatial distribution of uncertain isocontours are directly based on the probabilistic model of the input data. Finally, we adapt interactive visualization methods to evaluate and display these measures and apply them to 2D and 3D data sets.
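For intuition, here is a minimal Monte Carlo sketch of the level crossing probability for a single grid cell, assuming independent Gaussian marginals at the four cell corners; the function name and setup are illustrative, not the paper's implementation:

```python
import numpy as np

def level_crossing_probability(mu, sigma, iso, n_samples=10_000, rng=None):
    """Estimate the probability that an isocontour at level `iso` crosses a
    grid cell whose four corner values are independent Gaussians.

    mu, sigma: length-4 arrays of corner means / standard deviations.
    A crossing occurs in a sample when not all corners lie on the same side
    of the isovalue."""
    rng = rng or np.random.default_rng(0)
    samples = rng.normal(mu, sigma, size=(n_samples, 4))
    above = samples > iso
    crossing = above.any(axis=1) & (~above).any(axis=1)
    return crossing.mean()

# Corner means close to the isovalue -> high crossing probability.
print(level_crossing_probability(mu=[0.9, 1.1, 1.0, 1.05], sigma=[0.1] * 4, iso=1.0))
```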

2.
In this paper we consider the problem of allocating personal TV advertisements to viewers. The problem's input consists of ad requests and viewers. Each ad is associated with a length, a payment, a requested number of viewers, a requested number of allocations per viewer and a target population profile. Each viewer is associated with a profile and an estimated viewing capacity, which is uncertain. The goal is to maximize the revenue obtained from the allocation of ads to viewers over multiple periods while satisfying the ad constraints. First, we present integer programming (IP) models of the problem and several heuristics for the deterministic version, in which the viewers' viewing capacities are known in advance. We compare the performance of the proposed algorithms to that of a state-of-the-art IP solver. We then discuss the multi-period uncertain problem and, based on the best heuristic for the deterministic version, present heuristics for low and high uncertainty. Through computational experiments, we evaluate our heuristics. For the deterministic version, our best heuristic attains 98% of the possible revenue; for the multi-period uncertain version, our heuristics perform very well even under high uncertainty, compared to the revenue obtained by the deterministic version.
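As a rough illustration of the deterministic allocation problem, here is a toy greedy heuristic; all names and the per-viewer modeling are invented for illustration, and the paper instead compares purpose-built heuristics against an IP solver:

```python
from dataclasses import dataclass

@dataclass
class Ad:
    payment: float       # revenue per allocated viewer
    viewers_needed: int  # requested number of viewers
    target: set          # acceptable viewer profiles

def greedy_allocate(ads, viewers):
    """Toy greedy heuristic: serve the highest-paying ads first, assigning
    each to matching viewers with remaining viewing capacity.
    `viewers` is a list of (profile, capacity) pairs."""
    capacity = [c for _, c in viewers]
    revenue = 0.0
    for ad in sorted(ads, key=lambda a: a.payment, reverse=True):
        chosen = [i for i, (profile, _) in enumerate(viewers)
                  if profile in ad.target and capacity[i] > 0][:ad.viewers_needed]
        if len(chosen) == ad.viewers_needed:   # only commit if fully servable
            for i in chosen:
                capacity[i] -= 1
            revenue += ad.payment * ad.viewers_needed
    return revenue

ads = [Ad(5.0, 2, {"sports"}), Ad(3.0, 1, {"news", "sports"})]
viewers = [("sports", 2), ("sports", 1), ("news", 1)]
print(greedy_allocate(ads, viewers))  # 13.0
```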

3.
Parallel coordinate plots (PCPs) are commonly used in information visualization to provide insight into multi-variate data. These plots help to spot correlations between variables. PCPs have been successfully applied to unstructured datasets of up to a few million points. In this paper, we present techniques to enhance the usability of PCPs for the exploration of large, multi-timepoint volumetric data sets containing tens of millions of points per timestep. The main difficulties that arise when applying PCPs to large numbers of data points are visual clutter and slow performance, making interactive exploration infeasible. Moreover, the spatial context of the volumetric data is usually lost. We describe techniques for preprocessing using data quantization and compression, and for fast GPU-based rendering of PCPs using joint density distributions for each pair of consecutive variables, resulting in a smooth, continuous visualization. Also, fast brushing techniques are proposed for interactive data selection in multiple linked views, including a 3D spatial volume view. These techniques have been successfully applied to three large data sets: Hurricane Isabel (Vis'04 contest), the ionization front instability data set (Vis'08 design contest), and data from a large-eddy simulation of cumulus clouds. With these data, we show how PCPs can be extended to successfully visualize and interactively explore multi-timepoint volumetric datasets with an order of magnitude more data points.
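A minimal sketch of the joint-density idea, assuming NumPy and a plain CPU histogram in place of the paper's GPU renderer:

```python
import numpy as np

def pairwise_joint_densities(data, bins=64):
    """Compute a 2D joint-density histogram for each pair of consecutive
    variables (columns) -- the quantity a renderer would splat between
    neighbouring parallel-coordinate axes instead of drawing polylines.

    data: (n_points, n_vars) array. Returns a list of (bins, bins) arrays."""
    n_vars = data.shape[1]
    densities = []
    for j in range(n_vars - 1):
        hist, _, _ = np.histogram2d(data[:, j], data[:, j + 1],
                                    bins=bins, density=True)
        densities.append(hist)
    return densities

# One million synthetic points, 4 variables -> 3 density images.
data = np.random.default_rng(1).normal(size=(1_000_000, 4))
print([d.shape for d in pairwise_joint_densities(data)])
```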

4.
In this paper, we propose an active learning technique for solving multiclass problems with support vector machine (SVM) classifiers. The technique is based on both uncertainty and diversity criteria. The uncertainty criterion is implemented by analyzing the one-dimensional output space of the SVM classifier: a simple histogram-thresholding algorithm locates the low-density region of the SVM output space, identifying the most uncertain samples. The diversity criterion then exploits the kernel k-means clustering algorithm to select uncorrelated, informative samples from among the uncertain ones. To assess the effectiveness of the proposed method, we compared it with other batch-mode active learning techniques from the literature on one toy data set and three real data sets. Experimental results confirmed that the proposed technique provides a very good tradeoff among robustness to biased initial training samples, classification accuracy, computational complexity, and the number of new labeled samples needed to reach convergence.
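A hedged sketch of the two-criterion selection using scikit-learn; the paper's histogram-thresholding and kernel k-means steps are simplified here to a smallest-margin pool and standard k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

def select_batch(clf, X_pool, n_uncertain=100, batch_size=10):
    """Pick a diverse batch of uncertain samples from an unlabeled pool.
    Uncertainty: smallest absolute SVM decision value (distance to the
    hyperplane). Diversity: k-means over the uncertain candidates, keeping
    the sample closest to each centroid. (Simplification of the paper's
    histogram thresholding + kernel k-means.)"""
    margins = np.abs(clf.decision_function(X_pool))
    if margins.ndim > 1:                        # multiclass: use smallest margin
        margins = np.sort(margins, axis=1)[:, 0]
    uncertain = np.argsort(margins)[:n_uncertain]
    km = KMeans(n_clusters=batch_size, n_init=10).fit(X_pool[uncertain])
    picks = []
    for c in range(batch_size):
        members = uncertain[km.labels_ == c]
        d = np.linalg.norm(X_pool[members] - km.cluster_centers_[c], axis=1)
        picks.append(members[np.argmin(d)])     # most central uncertain sample
    return np.array(picks)
```

A fitted `sklearn.svm.SVC` is assumed for `clf`; the returned indices would be sent to the oracle for labeling and added to the training set.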

5.
In this paper we propose two time-scale-separation-based robust redesign techniques that recover the trajectories of a nominal control design in the presence of uncertain nonlinearities. We first consider additive input uncertainties and design a high-gain filter to estimate the uncertainty. We then employ the fast variables arising from this filter in the feedback control law to cancel the effect of the uncertainties in the plant. We next extend this design to systems with uncertain input nonlinearities, in which case we design two sets of high-gain filters: the first estimates the input uncertainty over a fast time-scale, and the second forces this estimate to converge to the nominal input on an intermediate time-scale. Using singular perturbation theory, we prove that the trajectories of the respective two-time-scale and three-time-scale redesigned systems approach those of the nominal system as the filter gains are increased. We illustrate the redesigns by applying them to various physically motivated examples.
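A toy numerical illustration of the first redesign on a scalar plant (not the paper's general construction): a high-gain filter produces a fast estimate of the additive uncertainty, which the control law subtracts:

```python
import numpy as np

def simulate(eps=1e-2, dt=1e-4, T=5.0):
    """Scalar illustration of the additive-uncertainty redesign:
      plant   x' = u + delta(t)        with unknown delta,
      filter  xh' = u + sigma,  sigma = (x - xh)/eps  (high gain for small eps).
    Since (x - xh)' = delta - sigma, sigma tracks delta on the fast
    time-scale, and u = u_nom - sigma cancels the uncertainty."""
    x, xh = 1.0, 1.0
    xs = []
    for k in range(int(T / dt)):
        t = k * dt
        delta = np.sin(3 * t)          # unknown disturbance
        sigma = (x - xh) / eps         # fast estimate of delta
        u = -x - sigma                 # nominal law u_nom = -x, plus cancellation
        x += dt * (u + delta)          # explicit Euler steps (dt << eps)
        xh += dt * (u + sigma)
        xs.append(x)
    return np.array(xs)

traj = simulate()
print(abs(traj[-1]))   # small: the nominal behaviour x' = -x is roughly recovered
```

Shrinking `eps` further tightens the recovery, at the cost of faster filter dynamics, which is exactly the singular-perturbation trade-off the abstract describes.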

6.
Data uncertainty is inherent in emerging applications such as location-based services, sensor monitoring systems, and data integration. To handle large amounts of imprecise information, uncertain databases have recently been developed. In this paper, we study how to efficiently discover frequent itemsets from large uncertain databases, interpreted under the Possible World Semantics. This is technically challenging, since an uncertain database induces an exponential number of possible worlds. To tackle this problem, we propose novel methods that capture the itemset mining process as a probability distribution function, taking two models into account: the Poisson distribution and the normal distribution. These model-based approaches extract frequent itemsets with a high degree of accuracy and support large databases. We apply our techniques to improve the performance of algorithms for (1) finding itemsets whose frequentness probabilities are larger than some threshold and (2) mining the itemsets with the k highest frequentness probabilities. Our approaches support both the tuple and attribute uncertainty models, which are commonly used to represent uncertain databases. Extensive evaluation on real and synthetic datasets shows that our methods are highly accurate and four orders of magnitude faster than previous approaches. In further theoretical and experimental studies, we give an intuition about which model-based approach fits best for different types of data sets.
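A small sketch of the model-based approximation under the tuple uncertainty model: the support count of an itemset is Poisson-binomial, and a moment-matched Poisson or normal model gives the frequentness probability cheaply (SciPy assumed; names illustrative):

```python
import math
from scipy.stats import norm, poisson

def frequentness_probability(probs, minsup, model="poisson"):
    """P(support count >= minsup) for an itemset whose occurrence in
    transaction i has probability probs[i] (tuple uncertainty model).
    The exact count is Poisson-binomial; we approximate it with either a
    Poisson or a normal distribution, as in model-based approaches."""
    lam = sum(probs)
    if model == "poisson":
        return 1.0 - poisson.cdf(minsup - 1, lam)
    var = sum(p * (1 - p) for p in probs)
    # Normal approximation with continuity correction.
    return 1.0 - norm.cdf(minsup - 0.5, loc=lam, scale=math.sqrt(var))

probs = [0.9, 0.5, 0.8, 0.1, 0.7]
print(frequentness_probability(probs, minsup=3))                  # Poisson model
print(frequentness_probability(probs, minsup=3, model="normal"))  # normal model
```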

7.
An analysis of the process and human cognitive model of deception detection (DD) shows that DD is infused with uncertainty, especially in high-stakes situations. There is a recent trend toward automating DD in computer-mediated communication. However, extant approaches to automatic DD overlook the importance of representation and reasoning under uncertainty in DD. They represent uncertain cues as crisp values and can only infer whether deception occurs, not to what extent. Based on uncertainty theories and analyses of uncertainty in DD, we propose a model to represent cues and to reason for DD under uncertainty, addressing the uncertainty due to imprecision and vagueness in DD using fuzzy sets and fuzzy logic. Neuro-fuzzy models were developed to discover knowledge for DD. Evaluation results on five data sets showed that the neuro-fuzzy method not only was a good alternative to traditional machine-learning techniques but also offered superior interpretability and reliability. Moreover, the gains of neuro-fuzzy systems over traditional systems grew as the level of uncertainty associated with DD increased. The findings of this paper have theoretical, methodological, and practical implications for DD and fuzzy systems research.
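A toy fuzzy-inference sketch of representing cues as degrees rather than crisp values; the cue names and membership functions are invented for illustration, whereas the paper's neuro-fuzzy models learn such rules from data:

```python
def tri(x, a, b, c):
    """Triangular membership function on [a, c], peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def deception_degree(pause_ratio, self_reference):
    """Toy fuzzy rules over two hypothetical linguistic cues:
    high pauses suggest deception, high self-reference suggests truth.
    The output is a graded degree in [0, 1], not a crisp yes/no verdict."""
    deceptive = tri(pause_ratio, 0.2, 0.6, 1.0)     # firing strength, rule 1
    truthful = tri(self_reference, 0.2, 0.6, 1.0)   # firing strength, rule 2
    if deceptive + truthful == 0:
        return 0.5                                   # no evidence either way
    return deceptive / (deceptive + truthful)        # weighted defuzzification

print(deception_degree(pause_ratio=0.7, self_reference=0.3))  # 0.75: fairly deceptive
```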

8.
We introduce an approach to visualize stationary 2D vector fields with global uncertainty obtained by considering the transport of local uncertainty in the flow. For this, we extend the concept of vector field topology to uncertain vector fields by considering the vector field as a density distribution function. By generalizing the concepts of stream lines and critical points we obtain a number of density fields representing an uncertain topological segmentation. Their visualization as height surfaces gives insight into both the flow behavior and its uncertainty. We present a Monte Carlo approach where we integrate probabilistic particle paths, which lead to the segmentation of topological features. Moreover, we extend our algorithms to detect saddle points and present efficient implementations. Finally, we apply our technique to a number of real and synthetic test data sets.
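A minimal sketch of the Monte Carlo step, assuming a per-grid-point Gaussian model of the field and nearest-neighbour lookup; the paper's integration and topology extraction are more elaborate:

```python
import numpy as np

def streamline_density(mean_field, std_field, seed_pt, n_particles=500,
                       n_steps=200, h=0.5, rng=None):
    """Monte Carlo density of probabilistic particle paths in an uncertain
    2D vector field. mean_field/std_field: (H, W, 2) arrays giving the
    Gaussian model of the field at each grid point (nearest-neighbour
    lookup for brevity). Returns an (H, W) visitation density."""
    rng = rng or np.random.default_rng(0)
    H, W, _ = mean_field.shape
    density = np.zeros((H, W))
    for _ in range(n_particles):
        p = np.array(seed_pt, dtype=float)
        for _ in range(n_steps):
            i, j = int(round(p[1])), int(round(p[0]))
            if not (0 <= i < H and 0 <= j < W):
                break                                       # particle left domain
            v = rng.normal(mean_field[i, j], std_field[i, j])  # sample the flow
            p += h * v                                      # Euler step
            density[i, j] += 1
    return density / max(density.sum(), 1.0)
```

Accumulating such densities per seed (or per critical-point region) yields the kind of density fields that the abstract visualizes as height surfaces.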

9.
In this paper, we explore a novel idea of using high dynamic range (HDR) technology for uncertainty visualization. We focus on scalar volumetric data sets where every data point is associated with scalar uncertainty. We design a transfer function that maps each data point to a color in HDR space. The luminance component of the color is exploited to capture uncertainty. We modify existing tone mapping techniques and suitably integrate them with volume ray casting to obtain a low dynamic range (LDR) image. The resulting image is displayed on a conventional 8-bits-per-channel display device. The usage of HDR mapping reveals fine details in uncertainty distribution and enables the users to interactively study the data in the context of corresponding uncertainty information. We demonstrate the utility of our method and evaluate the results using data sets from ocean modeling.
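A hedged sketch of the basic mapping, with the global Reinhard operator standing in for the paper's modified tone-mapping pipeline; all names are illustrative:

```python
import numpy as np

def tone_map_uncertainty(color, uncertainty, max_luminance=100.0):
    """Encode per-voxel uncertainty as HDR luminance, then compress back to
    LDR with the global Reinhard operator L/(1+L) -- a stand-in for the
    paper's modified tone-mapping integrated with ray casting.
    color: (..., 3) RGB in [0, 1]; uncertainty: (...) in [0, 1]."""
    L = uncertainty * max_luminance           # uncertain voxels get bright
    L_mapped = L / (1.0 + L)                  # Reinhard tone mapping -> [0, 1)
    return np.clip(color * L_mapped[..., None], 0.0, 1.0)

rgb = np.random.default_rng(2).random((4, 4, 3))
u = np.random.default_rng(3).random((4, 4))
print(tone_map_uncertainty(rgb, u).shape)     # (4, 4, 3), displayable on LDR
```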

10.
Measured data often incorporate some amount of uncertainty, which is generally modeled as a distribution of possible samples. In this paper, we consider second-order symmetric tensors with uncertainty. In the 3D case, this means the tensor data consist of 6 coefficients, while the uncertainty is encoded by 21 coefficients, assuming a multivariate Gaussian distribution as the model. The high dimension makes the direct visualization of tensor data with uncertainty a difficult problem, which has until now been unsolved. The contribution of this paper is the design of glyphs for uncertain second-order symmetric tensors in 2D and 3D. The construction consists of a standard glyph for the mean tensor that is augmented by a scalar field representing uncertainty. We show that this scalar field, and therefore the displayed glyph, encodes the uncertainty comprehensively, i.e., there exists a bijective map between the glyph and the parameters of the distribution. Our approach can extend several classes of existing glyphs for symmetric tensors to additionally encode uncertainty, and therefore provides a possible foundation for further uncertain tensor glyph design. For demonstration, we choose the well-known superquadric glyphs, and we show that the uncertainty visualization satisfies all their design constraints.

11.
Supporting ranking queries on uncertain and incomplete data
Large databases with uncertain information are becoming more common in many applications including data integration, location tracking, and Web search. In these applications, ranking records with uncertain attributes introduces new problems that are fundamentally different from conventional ranking. Specifically, uncertainty in records' scores induces a partial order over records, as opposed to the total order that is assumed in the conventional ranking settings. In this paper, we present a new probabilistic model, based on partial orders, to encapsulate the space of possible rankings originating from score uncertainty. Under this model, we formulate several ranking query types with different semantics. We describe and analyze a set of efficient query evaluation algorithms. We show that our techniques can be used to solve the problem of rank aggregation in partial orders under two widely adopted distance metrics. In addition, we design sampling techniques based on Markov chains to compute approximate query answers. Our experimental evaluation uses both real and synthetic data. The experimental study demonstrates the efficiency and effectiveness of our techniques under various configurations.
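To make the sampling idea concrete, here is a toy sketch that estimates top-k membership probabilities by drawing rankings consistent with the partial order; the paper uses Markov-chain samplers with analyzed convergence, whereas this random-minimal-element sampler is only an illustration:

```python
import random
from collections import defaultdict

def sample_linear_extension(items, prefers, rng):
    """Draw one ranking consistent with a partial order. `prefers` is a set
    of (a, b) pairs meaning a must rank before b. Repeatedly pick a random
    minimal element (no unranked item must precede it)."""
    remaining = set(items)
    order = []
    while remaining:
        minimal = [x for x in remaining
                   if not any((y, x) in prefers for y in remaining)]
        pick = rng.choice(minimal)
        order.append(pick)
        remaining.remove(pick)
    return order

def topk_probability(items, prefers, k, n_samples=5000, seed=0):
    """Estimate P(item appears in the top k) over sampled rankings."""
    rng = random.Random(seed)
    hits = defaultdict(int)
    for _ in range(n_samples):
        for x in sample_linear_extension(items, prefers, rng)[:k]:
            hits[x] += 1
    return {x: hits[x] / n_samples for x in items}

# a before c, b before c; a and b are incomparable (uncertain scores).
print(topk_probability(["a", "b", "c"], {("a", "c"), ("b", "c")}, k=1))
```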

12.
Topological and geometrical methods constitute common tools for the analysis of high-dimensional scientific data sets. Geometrical methods such as projection algorithms focus on preserving distances in the data set. Topological methods such as contour trees, by contrast, focus on preserving structural and connectivity information. By combining both types of methods, we want to benefit from their individual advantages. To this end, we describe an algorithm that uses persistent homology to analyse the topology of a data set. Persistent homology identifies high-dimensional holes in data sets, describing them as simplicial chains. We localize these chains using geometrical information of the data set, which we obtain from geodesic distances on a neighbourhood graph. The localized chains describe the structure of point clouds. We represent them using an interactive graph, in which each node describes a single chain and its geometrical properties. This graph yields a more intuitive understanding of multivariate point clouds and simplifies comparisons of time-varying data. Our method focuses on detecting and analysing inhomogeneous regions, i.e. holes, in a data set because these regions characterize data in a different manner, thereby leading to new insights. We demonstrate the potential of our method on data sets from particle physics, political science and meteorology.

13.
In research on clustering algorithms for uncertain data streams, positional uncertainty is a new type of uncertain data that existing uncertain-data models cannot describe or handle well. To address this, we first introduce the main concepts of a connection-number-based model for positionally uncertain data, a connection distance function, and density-reachability between micro-clusters, and on this basis propose UCNStream, a clustering algorithm for positionally uncertain data streams expressed by connection numbers. The algorithm adopts an online/offline two-tier processing framework, uses an initialization strategy based on the density-peaks idea, and defines a new, dynamically maintainable micro-cluster feature vector. Micro-clusters are maintained online using a decay function and a micro-cluster deletion mechanism, which accurately reflects the evolution of the data stream. Finally, we analyze the computational complexity of the algorithm and compare it experimentally on real data sets with several state-of-the-art clustering algorithms; the results show that UCNStream achieves high clustering accuracy and processing efficiency.
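As a generic illustration of the online decay step (a damped micro-cluster feature in the style of stream-clustering algorithms; this is not the exact UCNStream feature vector, which is defined over connection numbers):

```python
import numpy as np

class MicroCluster:
    """Decay-weighted micro-cluster feature (generic sketch: weight, linear
    sum, square sum), damped by 2^(-lambda * dt) between updates so that
    stale points gradually lose influence."""
    def __init__(self, dim, lam=0.01):
        self.lam, self.t = lam, 0.0
        self.w, self.ls, self.ss = 0.0, np.zeros(dim), np.zeros(dim)

    def decay(self, t_now):
        f = 2.0 ** (-self.lam * (t_now - self.t))   # decay factor since last update
        self.w, self.ls, self.ss = f * self.w, f * self.ls, f * self.ss
        self.t = t_now

    def insert(self, x, t_now):
        self.decay(t_now)
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    def center(self):
        return self.ls / self.w

mc = MicroCluster(dim=2)
for t, x in enumerate([[0.0, 1.0], [0.2, 1.1], [0.1, 0.9]]):
    mc.insert(np.array(x), float(t))
print(mc.center(), mc.w)   # older points weigh less, so w < 3
```

A deletion mechanism, as in the abstract, would periodically drop micro-clusters whose weight `w` has decayed below a threshold.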

14.
Conventional information systems are limited in their ability to represent uncertain data; a consistent and useful methodology for representing and manipulating such data is required. One solution is proposed in this paper. Objects are modeled by selecting representative attributes to which values are assigned. Any attribute value can be one of the following: a regular precise value, a special value denoting "value unknown", a special value denoting "attribute not applicable", a range of values, or a set of values. When data are uncertain, the semantics of query evaluation are no longer clear, and uncertainty enters the query answers. To handle this uncertainty, two sets of objects are retrieved in response to each query: the set known to satisfy the query with complete certainty, and the set of objects that possibly satisfy the query with some degree of uncertainty. Two methods of estimating this uncertainty are examined.
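A minimal sketch of the two-set query semantics over such attribute values (names illustrative; the range check at the endpoints suffices only for monotone predicates such as thresholds):

```python
UNKNOWN, NOT_APPLICABLE = "unknown", "n/a"

def evaluate(objects, attr, predicate):
    """Evaluate a query over objects with imprecise attributes.
    Returns (certain, possible): objects that definitely satisfy the
    predicate, and objects that might, given their stored value
    (a precise value, UNKNOWN, NOT_APPLICABLE, a (lo, hi) range, or a set)."""
    certain, possible = [], []
    for name, record in objects.items():
        v = record.get(attr, UNKNOWN)
        if v == NOT_APPLICABLE:
            continue                                   # can never satisfy
        if v == UNKNOWN:
            possible.append(name)                      # might satisfy
        elif isinstance(v, (tuple, set)):              # range endpoints or value set
            matches = [predicate(x) for x in v]
            if all(matches):
                certain.append(name)
            elif any(matches):
                possible.append(name)
        elif predicate(v):
            certain.append(name)
    return certain, possible

objs = {"o1": {"age": 30}, "o2": {"age": (25, 40)}, "o3": {"age": UNKNOWN},
        "o4": {"age": {20, 35}}, "o5": {"age": NOT_APPLICABLE}}
print(evaluate(objs, "age", lambda a: a >= 28))  # (['o1'], ['o2', 'o3', 'o4'])
```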

15.
Visual analytics of multidimensional multivariate data is a challenging task because of the difficulty of understanding metrics in attribute spaces with more than three dimensions. Frequently, the analysis goal is not to look into individual records but to understand the distribution of the records at large and to find clusters of records with similar attribute values. A large number of (typically hierarchical) clustering algorithms have been developed to group individual records into clusters of statistical significance. However, only a few visualization techniques exist for further exploring and understanding the clustering results. We propose visualization and interaction methods for analyzing individual clusters as well as cluster distribution within and across levels in the cluster hierarchy. We also provide a clustering method that operates on density rather than individual records. So as not to restrict our search for clusters, we compute density in the given multidimensional multivariate space. Clusters are formed by areas of high density. We present an approach that automatically computes a hierarchical tree of high-density clusters. To visually represent the cluster hierarchy, we present a 2D radial layout that supports an intuitive understanding of the distribution structure of the multidimensional multivariate data set. Individual clusters can be explored interactively using parallel coordinates when selected in the cluster tree. Furthermore, we integrate circular parallel coordinates into the radial hierarchical cluster tree layout, which allows for the analysis of the overall cluster distribution. This visual representation supports the comprehension of the relations between clusters and the original attributes. The combination of the 2D radial layout and the circular parallel coordinates overcomes the overplotting problem of parallel coordinates on data sets with many records. We apply an automatic coloring scheme based on the 2D radial layout of the hierarchical cluster tree, encoding hue, saturation, and value of the HSV color space. The colors support linking the 2D radial layout to other views, such as standard parallel coordinates or, when the data are obtained from multidimensional spatial measurements, the distribution in object space.

16.
Aggregation of imprecise and uncertain information in databases
Information stored in a database is often subject to uncertainty and imprecision. Probability theory provides a well-known and well-understood way of representing uncertainty and may thus be used as a mechanism for storing uncertain information in a database. We consider the problem of aggregation using an imprecise probability data model that allows us to represent imprecision by partial probabilities and uncertainty by probability distributions. Most work to date has concentrated on extending the relational algebra with a view to executing traditional queries on uncertain or imprecise data. However, for imprecise and uncertain data, we often require aggregation operators that provide information on patterns in the data. Thus, while traditional query processing is tuple-driven, processing of uncertain data is often attribute-driven, where aggregation operators are used to discover attribute properties. The aggregation operator that we define uses the Kullback-Leibler information divergence between the aggregated probability distribution and the individual tuple values to provide a probability distribution for the domain values of an attribute or group of attributes. The provision of such aggregation operators is a central requirement in furnishing a database with the capability to perform the operations necessary for knowledge discovery in databases.
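A small sketch of the aggregation idea: pool the tuple-level distributions into one distribution over the attribute domain and score each tuple by its Kullback-Leibler divergence from the aggregate (illustrative, not the paper's full operator):

```python
import math

def aggregate(distributions):
    """Aggregate per-tuple probability distributions over a shared domain
    into one distribution, and score each tuple by its Kullback-Leibler
    divergence from the aggregate (larger = more atypical).
    Each distribution maps domain value -> probability (summing to 1)."""
    domain = set().union(*distributions)
    n = len(distributions)
    agg = {v: sum(d.get(v, 0.0) for d in distributions) / n for v in domain}

    def kl(d):
        # KL(d || agg); agg has mass wherever any tuple does, so it is finite.
        return sum(p * math.log(p / agg[v]) for v, p in d.items() if p > 0)

    return agg, [kl(d) for d in distributions]

tuples = [{"red": 0.7, "blue": 0.3}, {"red": 0.6, "blue": 0.4}, {"green": 1.0}]
agg, scores = aggregate(tuples)
print(agg)      # pooled distribution over {red, blue, green}
print(scores)   # the all-green tuple diverges most from the aggregate
```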

17.
In this paper, we show how the formalism of Logic Programs with Ordered Disjunction (LPODs) and Possibilistic Answer Set Programming (PASP) can be merged into the single framework of Logic Programs with Possibilistic Ordered Disjunction (LPPODs). The LPPODs framework embeds in a unified way several aspects of common-sense reasoning, nonmonotonicity, preferences, and uncertainty, where each part is underpinned by a well-established formalism. On one hand, from LPODs it inherits the distinctive feature of expressing context-dependent qualitative preferences among different alternatives (modeled as the atoms of a logic program). On the other hand, PASP allows qualitative certainty statements about the rules themselves (modeled as necessity values according to possibilistic logic) to be captured. In this way, the LPPODs framework supports reasoning that is nonmonotonic, preference-aware, and uncertainty-aware. The LPPODs syntax allows for the specification of (1) preferences among the exceptions to default rules, and (2) necessity values about the certainty of program rules. As a result, preferences and uncertainty can be used to select the preferred uncertain default rules of an LPPOD and, consequently, to order its possibilistic answer sets. Furthermore, we describe the implementation of an ASP-based solver able to compute the LPPODs semantics.

18.
We investigate how to represent the resulting multivariate information and multidimensional uncertainty by developing and applying candidate visual techniques. Although good techniques exist for visualizing many data types, less progress has been made on how to display uncertainty and multivariate information; this is especially true as the dimensionality rises. At this time, our primary focus is to develop the statistical characterizations for the environmental uncertainty (described only briefly in this article) and to develop a visual method for each characterization. The mariner community needs enhanced characterizations of environmental uncertainty now, but the accuracy of the characterizations is still not sufficient, and therefore formal user evaluations cannot take place at this point in development. We received feedback on the applicability of our techniques from domain experts and used it, in conjunction with previous results, to compile a set of development guidelines.

19.
Radial axes plots are multivariate visualization techniques that extend scatterplots in order to represent high-dimensional data as points on an observable display. Well-known methods include star coordinates and principal component biplots, which represent data attributes as vectors that define axes and produce linear dimensionality-reduction mappings. In this paper we propose a hybrid approach that bridges the gap between star coordinates and principal component biplots, which we call "adaptable radial axes plots". It is based on solving convex optimization problems in which users can: (a) update the axis vectors interactively, as in star coordinates, while producing mappings that allow attribute values to be estimated optimally through labeled axes, as in principal component biplots; (b) use different norms in order to explore additional nonlinear mappings of the data; and (c) include weights and constraints in the optimization problems for sorting the data along one axis. The result is a flexible technique that complements, extends, and enhances current radial methods for data analysis.
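A minimal sketch of the Euclidean-norm case, which reduces to a least-squares problem per data point (illustrative; the paper states the general convex programs, including other norms, weights, and constraints):

```python
import numpy as np

def adaptable_radial_axes(X, V):
    """Embed data points so that attribute values can be read off labeled
    axes: for each (standardized) data point x, find the 2D plot point p
    minimizing ||V p - x||_2, where row j of V is the user-adjustable axis
    vector of attribute j. Projecting p onto axis j then estimates
    attribute j. This is the Euclidean-norm case of the convex programs.

    X: (n, d) data, V: (d, 2) axis vectors. Returns (n, 2) plot points."""
    P, *_ = np.linalg.lstsq(V, X.T, rcond=None)   # solves V @ P = X.T
    return P.T

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))                     # 3 standardized attributes
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])  # user-placed axes
P = adaptable_radial_axes(X, V)
print(np.abs(P @ V.T - X).mean())                 # residual of the rank-2 estimate
```

Dragging an axis vector and re-solving updates the plot interactively, which is the star-coordinates side of the hybrid; the least-squares estimation property is the biplot side.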

20.

Enabling information systems to face anomalies in the presence of uncertainty is a compelling and challenging task. In this work we consider the problem of unsupervised outlier detection in large collections of data objects modeled by means of arbitrary multidimensional probability density functions. We present a novel definition of the uncertain distance-based outlier under the attribute-level uncertainty model, according to which an uncertain object is an object that always exists but whose actual value is modeled by a multivariate pdf. According to this definition, an uncertain object is declared to be an outlier on the basis of the expected number of its neighbors in the dataset. To the best of our knowledge, this is the first work that considers the unsupervised outlier detection problem on data objects modeled by means of arbitrarily shaped multidimensional distribution functions. We present the UDBOD algorithm, which efficiently detects the outliers in an input uncertain dataset by taking advantage of three optimized phases: parameter estimation, candidate selection, and candidate filtering. An experimental campaign is presented, including a sensitivity analysis, a study of the effectiveness of the technique, a comparison with related algorithms (also in the presence of high-dimensional data), and a discussion of the behavior of our technique in real-world scenarios.
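A hedged Monte Carlo sketch of the definition: estimate each uncertain object's expected number of neighbors within radius R by sampling possible worlds from the objects' pdfs, and flag objects whose expectation falls below k (illustrative; UDBOD's optimized phases are not reproduced here):

```python
import numpy as np

def expected_neighbors(samplers, R, n_samples=1000, seed=0):
    """Monte Carlo estimate of the expected number of neighbors within
    radius R for each uncertain object. `samplers[i](rng)` draws one
    realization (a point) of object i from its arbitrary pdf."""
    rng = np.random.default_rng(seed)
    exp_nb = np.zeros(len(samplers))
    for _ in range(n_samples):
        pts = np.array([s(rng) for s in samplers])     # one possible world
        d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
        exp_nb += (d <= R).sum(axis=1) - 1             # exclude the object itself
    return exp_nb / n_samples

def uncertain_outliers(samplers, R, k, **kw):
    """Objects whose expected neighbor count is below k are outliers."""
    return np.where(expected_neighbors(samplers, R, **kw) < k)[0]

gauss = lambda mu: (lambda rng: rng.normal(mu, 0.3, size=2))
samplers = [gauss([0, 0]), gauss([0.5, 0]), gauss([0, 0.5]), gauss([5, 5])]
print(uncertain_outliers(samplers, R=1.5, k=1))  # [3]: the far-away object
```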

