1.
With the era of data explosion upon us, multidimensional visualization, as one of the most helpful data analysis technologies, is increasingly applied to multidimensional data analysis tasks. Correlation analysis is an efficient technique for revealing the complex relationships among the dimensions of multidimensional data. However, for multidimensional data with complex dimension features, traditional correlation analysis methods are inaccurate and limited. In this paper, we introduce an improved Pearson correlation coefficient and mutual information correlation analysis to detect the dimensions' linear and non-linear correlations, respectively. For the linear case, all dimensions are classified into three groups according to their distributions, and we select appropriate parameters for each group to calculate their correlations. For the non-linear case, we cluster the data within each dimension, then calculate their probability distributions to analyze the dimensions' correlations and dependencies based on mutual information correlation analysis. Finally, we use the relationships between dimensions as the criteria for interactively ordering the axes in parallel coordinate displays.
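As an illustration of this pairing of linear and non-linear dependence measures, here is a minimal Python sketch (not the authors' implementation; the function names, bin count, and greedy ordering strategy are all assumptions):

```python
# A minimal sketch: score pairwise dimension relationships with |Pearson r|
# for linear dependence and a histogram-based mutual information estimate
# for non-linear dependence, then order parallel-coordinate axes greedily
# by relationship strength.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import mutual_info_score

def dimension_relationship(x, y, bins=16):
    """Combined linear/non-linear dependence score for two dimensions."""
    linear = abs(pearsonr(x, y)[0])
    # Discretise each dimension (a crude stand-in for the per-dimension
    # clustering step described in the abstract).
    xd = np.digitize(x, np.histogram_bin_edges(x, bins))
    yd = np.digitize(y, np.histogram_bin_edges(y, bins))
    nonlinear = mutual_info_score(xd, yd)
    return max(linear, nonlinear)

def order_axes(data):
    """Greedy ordering: always append the dimension most related to the last axis."""
    remaining, order = set(range(1, data.shape[1])), [0]
    while remaining:
        last = data[:, order[-1]]
        best = max(remaining, key=lambda j: dimension_relationship(last, data[:, j]))
        order.append(best)
        remaining.remove(best)
    return order
```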
2.
A new multidimensional data structure, the multidimensional tree (MD-tree), is proposed. The MD-tree extends the concept of the B-tree to multidimensional data, so that the MD-tree is a height-balanced tree similar to the B-tree. The theoretical worst-case storage utilization is guaranteed to exceed 66.7% (2/3) of full capacity. The structure of the MD-tree and the algorithms for insertion, deletion, and spatial searching are described. Through a series of simulation tests, the performance of the MD-tree is compared with that of conventional methods. The results indicate that storage utilization exceeds 80% in practice, and that retrieval performance and dynamic characteristics are superior to those of conventional methods.
3.
Innovations in Systems and Software Engineering - Materialized views are heavily used to speed up the query response time of any data-centric application. In the literature, the construction and...
4.
Correlation analysis is regarded as a significant challenge in the mining of multidimensional data streams. Existing correlation analysis methods for mining data streams place great emphasis on one-dimensional streams; consequently, the identification of underlying correlations among multivariate arrays (e.g., sensor data) has long been ignored, and the technique of canonical correlation analysis (CCA) has rarely been applied to multidimensional data streams. In this study, a novel correlation analysis algorithm based on CCA, called ApproxCCA, is proposed to explore the correlations between two multidimensional data streams in environments with limited resources. By introducing unequal probability sampling and low-rank approximation to reduce the dimensionality of the product matrix composed of the sample covariance and sample variance matrices, ApproxCCA improves computational efficiency while preserving analytical precision. Experimental results on synthetic and real data sets indicate that ApproxCCA overcomes the computational bottleneck of traditional CCA and accurately detects the correlations between two multidimensional data streams.
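For concreteness, here is a toy version of the classical CCA computation that ApproxCCA approximates. It forms the full covariance matrices directly, without the paper's sampling and low-rank steps, and the regularisation term is an assumption:

```python
# Classical CCA on two data windows X (n x p) and Y (n x q).
import numpy as np

def _inv_sqrt(S):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def cca_correlations(X, Y, reg=1e-6):
    """Canonical correlations between two multidimensional samples."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])  # regularised variances
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)                              # cross-covariance
    # Canonical correlations are the singular values of Sxx^{-1/2} Sxy Syy^{-1/2}.
    M = _inv_sqrt(Sxx) @ Sxy @ _inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)
```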
5.
The most effective technique for enhancing the performance of multidimensional databases is to materialize redundant aggregates called views. In the classical approach to materialization, each view includes all and only the measures of the cube it aggregates. In this paper we investigate the benefits of materializing views in vertical fragments, with the aim of minimizing workload response time. We formalize the fragmentation problem as a 0–1 integer linear programming problem, which is then solved by a standard integer programming solver to determine the optimal fragmentation for a given workload. Finally, we demonstrate the usefulness of fragmentation by presenting a large set of experimental results based on the TPC-H benchmark.
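The 0–1 formulation can be made concrete with a toy enumeration (illustrative only; the paper hands a model of this shape to a standard integer programming solver, and all fragment names, sizes, and costs below are invented):

```python
# x[f] = 1 means fragment f is materialised; minimise total workload cost
# under a storage budget by enumerating all 0-1 assignments.
from itertools import product

fragments = ["f1", "f2", "f3"]
size = {"f1": 40, "f2": 25, "f3": 60}          # storage units (hypothetical)
cost_hit = {"f1": 10, "f2": 30, "f3": 5}       # response time if materialised
cost_miss = {"f1": 100, "f2": 90, "f3": 70}    # response time otherwise
budget = 80

best, best_cost = None, float("inf")
for x in product([0, 1], repeat=len(fragments)):
    if sum(size[f] for f, xi in zip(fragments, x) if xi) > budget:
        continue  # violates the storage constraint
    cost = sum(cost_hit[f] if xi else cost_miss[f]
               for f, xi in zip(fragments, x))
    if cost < best_cost:
        best, best_cost = x, cost

print(dict(zip(fragments, best)), best_cost)
```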
6.
Using mobile brokerage services as an example, we propose and test a multidimensional and hierarchical model of mobile service (m-service) quality using a sample of 338 respondents from the two largest m-service providers in China: China Mobile and China Unicom. Through a three-stage validation, we confirm all three levels of the proposed hierarchical structure, in which a customer's perceived m-service quality comprises the primary dimensions of interaction, outcome, and environment quality, each with its own sub-dimensions. Our empirical results also show that corporate image moderates the effects of environment and outcome quality on overall service quality. The proposed model provides implications for future research on mobile commerce.
7.
A new method for multidimensional distribution analysis applies a data compression technique to knowledge-based mean-force potentials between residues for the analysis of protein sequence-structure compatibility. It performs much better than conventional 1D distance-based potentials derived from binned distributions.
8.
By introducing a form of reordering for multidimensional data, we propose a unified fast algorithm that jointly employs the one-dimensional W transform and the multidimensional discrete polynomial transform to compute eleven types of multidimensional discrete orthogonal transforms: three types of m-dimensional discrete cosine transforms (m-D DCTs), four types of m-dimensional discrete W transforms (m-D DWTs, with the m-dimensional Hartley transform as a special case), and four types of generalized discrete Fourier transforms (m-D GDFTs). For real input, the number of multiplications needed by the proposed algorithm for all eleven types of m-D discrete orthogonal transforms is only 1/m times that of the commonly used row-column methods; for complex input, it is further reduced to 1/(2m) times. The number of additions required is also reduced considerably. Furthermore, the proposed algorithm has a simple computational structure and is easy to implement on a computer, and th…
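As a point of reference, here is a sketch of the conventional row-column baseline the paper improves on: a 2-D DCT computed by applying a 1-D DCT along each axis in turn (SciPy supplies the 1-D transform; the paper's unified algorithm instead uses a data reordering plus 1-D W transforms to cut the multiplication count by roughly a factor of m):

```python
# Row-column method: m passes of a 1-D transform, one per axis.
import numpy as np
from scipy.fft import dct

def dct2_row_column(a):
    """2-D DCT-II via two passes of the 1-D DCT (the row-column method)."""
    return dct(dct(a, axis=0, norm="ortho"), axis=1, norm="ortho")

coeffs = dct2_row_column(np.random.rand(8, 8))
```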
9.
An algorithm that fits a continuous function to sparse multidimensional data is presented. The algorithm uses a representation in terms of lower-dimensional component functions of coordinates defined in an automated way and also permits dimensionality reduction. Neural networks are used to construct the component functions.
Program summary
Program title: RS_HDMR_NN
Catalogue identifier: AEEI_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEI_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 19 566
No. of bytes in distributed program, including test data, etc.: 327 856
Distribution format: tar.gz
Programming language: MatLab R2007b
Computer: any computer running MatLab
Operating system: Windows XP, Windows Vista, UNIX, Linux
Classification: 4.9
External routines: Neural Network Toolbox Version 5.1 (R2007b)
Nature of problem: Fitting a smooth, easily integrable and differentiable function to a very sparse (~2–3 points per dimension), multidimensional (D ≥ 6), large (~10^4–10^5 data) dataset.
Solution method: A multivariate function is represented as a sum of a small number of terms, each of which is a low-dimensional function of optimised coordinates. The optimal coordinates reduce both the dimensionality and the number of the terms. Neural networks (including exponential neurons) are used to obtain a general and robust method and a functional form which is easily differentiated and integrated (in the case of exponential neurons).
Running time: Depends strongly on the dataset to be modelled and the chosen structure of the approximating function; ranges from about a minute for ~10^3 data in 3-D to about a day for ~10^5 data in 15-D.
10.
Random indexing (RI) is a lightweight dimension reduction method, used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays and thereby enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is also the theoretical basis for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments demonstrating that a multidimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis. The RI method is well suited to online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.
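A minimal sketch of ordinary (vector) RI, the base case the paper generalises; the dimensionality, sparsity, and context names are assumptions. Each context is assigned a sparse ternary index vector, item vectors accumulate the index vectors of the contexts they occur in, and inner products then approximate co-occurrence relationships:

```python
import numpy as np

def index_vector(rng, dim=1000, nonzeros=10):
    """Sparse ternary random vector: a few +/-1 entries, zeros elsewhere."""
    v = np.zeros(dim)
    pos = rng.choice(dim, size=nonzeros, replace=False)
    v[pos] = rng.choice([-1.0, 1.0], size=nonzeros)
    return v

rng = np.random.default_rng(0)
contexts = {c: index_vector(rng) for c in ["ctx_a", "ctx_b", "ctx_c"]}

item = np.zeros(1000)
for c in ["ctx_a", "ctx_a", "ctx_b"]:      # incremental, fixed-size updates
    item += contexts[c]

# Inner products with index vectors approximate co-occurrence counts.
print(item @ contexts["ctx_a"], item @ contexts["ctx_c"])
```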
11.
Spatial database management involves two main categories of data: vector and raster data. The former has received extensive in-depth investigation; the latter still lacks a sound framework. Current DBMSs either regard raster data as pure byte sequences whose underlying semantics the DBMS knows nothing about, or do not complement array structures with storage mechanisms suitable for huge arrays, or are designed as specialized systems with sophisticated imaging functionality but no general database capabilities (e.g., a query language). Many types of array data will require database support in the future, notably 2-D images, audio data and general signal-time series (1-D), animations (3-D), static or time-variant voxel fields (3-D and 4-D), and the ISO/IEC PIKS (Programmer's Imaging Kernel System) BasicImage type (5-D). In this article, we propose comprehensive support for multidimensional discrete data (MDD) in databases, including operations on arrays of arbitrary size over arbitrary data types. A set of requirements is developed, a small set of language constructs is proposed (based on a formal algebraic semantics), and a novel MDD architecture is outlined to provide the basis for efficient MDD query evaluation.
13.
The problem of computing the empirical cumulative distribution function (ECDF) of N points in k-dimensional space has been studied and motivated recently by Bentley [1], whose solution uses recursive multidimensional divide-and-conquer. In this paper, the problem is treated as a generalization of the problem of computing the inversions of a permutation. An algorithm of Knuth [3] is then extended to yield an O(kN(log₂N)^(k−1)) solution to the ECDF problem, which is comparable to Bentley's solution. Neither solution approaches the O(kN log₂N) lower bound, and both are worse than the O(kN²) 'brute force' algorithm for large k. The new algorithm, however, has the advantage of being highly parallel, so that fast solutions exist on parallel processors.
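For reference, the O(kN²) brute-force computation mentioned above is straightforward (a sketch; ties are counted as dominated and the point itself is excluded):

```python
# For each point, count the points that are <= it in every coordinate.
import numpy as np

def ecdf_brute_force(points):
    """points: (N, k) array; returns the dominance count for each point."""
    N = len(points)
    counts = np.empty(N, dtype=int)
    for i in range(N):
        dominated = np.all(points <= points[i], axis=1)
        counts[i] = dominated.sum() - 1   # exclude the point itself
    return counts

print(ecdf_brute_force(np.random.rand(100, 3)))
```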
14.
Existing models for cluster analysis typically consist of a number of attributes that describe the objects to be partitioned and a single latent variable that represents the clusters to be identified. When one analyzes data using such a model, one is looking for a single way to cluster the data that is jointly defined by all the attributes; in other words, one performs unidimensional clustering. This is not always appropriate: for complex data with many attributes, it is more reasonable to consider multidimensional clustering, i.e., to partition the data along multiple dimensions. In this paper, we present a method for performing multidimensional clustering on categorical data and show its superiority over unidimensional clustering.
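The contrast can be illustrated with a small sketch, using k-means on continuous data for brevity rather than the paper's latent-variable method for categorical data: one clustering jointly defined by all attributes versus one partition per attribute facet, giving each object one label per dimension:

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.random.rand(200, 6)

# Unidimensional: a single partition jointly defined by all attributes.
uni_labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)

# Multidimensional: partition the same objects along two attribute facets
# (the facet split is invented for illustration).
facets = [[0, 1, 2], [3, 4, 5]]
multi_labels = [KMeans(n_clusters=3, n_init=10).fit_predict(X[:, cols])
                for cols in facets]
```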
15.
The paper presents a new radiosity algorithm that allows the simultaneous computation of energy exchanges between surface elements, scattering volume distributions, and groups of surfaces, or object clusters. The new technique is based on a hierarchical formulation of the zonal method and efficiently integrates volumes and surfaces. In particular, no initial linking stage is needed, even for inhomogeneous volumes, thanks to the construction of a global spatial hierarchy. An analogy between object clusters and scattering volumes results in a powerful clustering radiosity algorithm, with no initial linking between surfaces and fast computation of average visibility information through a cluster. We show that accurately distributing the energy emitted or received at the cluster level can produce even better results than isotropic clustering at a marginal cost. The resulting algorithm is fast and, more importantly, truly progressive, as it allows the quick calculation of approximate solutions with smooth convergence towards very accurate simulations.
17.
The rapidly increasing volume of surveillance video data has challenged existing video coding standards. Even though knowledge-based video coding schemes have been proposed to remove the redundancy of moving objects across multiple videos, achieving great improvements in coding efficiency, they still have difficulty coping with the complicated visual changes of objects resulting from various factors. In this paper, a novel hierarchical knowledge extraction method is proposed. Common knowledge at three coarse-to-fine levels, namely the category, object, and video levels, is extracted from historical data to model the initial appearance, stable changes, and temporal changes, respectively, for better object representation and redundancy removal. In addition, we apply the extracted hierarchical knowledge to surveillance video coding and establish a hybrid prediction based coding framework. On the one hand, hierarchical knowledge is projected onto the image plane to generate references for I frames, achieving better prediction performance. On the other hand, we develop a transform-based prediction for P/B frames that reduces computational complexity while improving coding efficiency. Experimental results demonstrate the effectiveness of the proposed method.
18.
The task of identifying a hierarchical data structure is considered using the example problem of identifying personalizing reference characteristics. A neural network model based on radial basis functions is proposed as a possible solution. In practical terms, identifying the hierarchical dependence aims to create a classifier that uses a restricted set of input variables compared to a flat-structured classifier. Multilayer perceptrons are used as local classifiers. We also use self-organizing maps to visualize the structure of the data.
19.
In many real-world data mining applications, the distribution of the testing data differs from that of the training data. Moreover, data are often represented by multiple views, which are of importance to learning; however, little work has addressed both issues together. In this paper, we explore leveraging multi-view information across different domains for knowledge transfer. We propose a novel transfer learning model, named DV2S, which integrates domain distance and view consistency into a 2-view support vector machine framework. The objective of DV2S is to find the optimal feature mapping such that, under the projections, the classification margin is maximized while both the domain distance and the disagreement between multiple views are minimized simultaneously. Experiments show that DV2S outperforms a variety of state-of-the-art algorithms.
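A hedged sketch of the two-view intuition only (not the paper's joint optimisation; the data shapes and names are invented): train one SVM per view and measure the cross-view disagreement that DV2S minimises alongside the domain distance:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X_view1 = rng.normal(size=(100, 20))   # view 1 features
X_view2 = rng.normal(size=(100, 15))   # view 2 features of the same objects
y = rng.integers(0, 2, size=100)

# One classifier per view, trained on the same labels.
clf1 = LinearSVC().fit(X_view1, y)
clf2 = LinearSVC().fit(X_view2, y)

# Fraction of objects on which the two views disagree.
disagreement = np.mean(clf1.predict(X_view1) != clf2.predict(X_view2))
print("cross-view disagreement:", disagreement)
```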
20.
Data Warehouses (DW), Multidimensional (MD) databases, and On-Line Analytical Processing (OLAP) applications provide companies with many years of historical information for the decision-making process. Owing to the sensitive information managed by these systems, strong security and confidentiality measures should be specified from the early stages of a DW project in the MD modeling, and then enforced. In recent years, there have been some proposals for accomplishing MD modeling at the conceptual level. Nevertheless, none of them considers security measures as an important element of its models, and therefore they do not allow us to specify confidentiality constraints to be enforced by the applications that will use these MD models. In this paper, we present an Access Control and Audit (ACA) model for conceptual MD modeling. We then extend the Unified Modeling Language (UML) with this ACA model, representing the security information (gathered in the ACA model) in the conceptual MD model, thereby allowing us to obtain secure MD models. Moreover, we use the OSCL (Object Security Constraint Language) to specify the constraints of our ACA model, avoiding their arbitrary use. Furthermore, we align our approach with the Model-Driven Architecture, Model-Driven Security, and the Model-Driven Data Warehouse, offering a proposal highly compatible with recent technologies.