Analytical workloads in data warehouses often include heavy joins, where queries involve multiple fact tables in addition to the typical star patterns, dimensional grouping and selections. In this paper we propose a new processing and storage framework called bitwise dimensional co-clustering (BDCC) that avoids replication and thus keeps updates fast, yet is able to accelerate all these foreign key joins, efficiently support grouping and push down most dimensional selections. The core idea of BDCC is to cluster each table on a mix of dimensions, each possibly derived from attributes imported over an incoming foreign key, thereby creating foreign-key-connected tables with partially shared clusterings. These shared clusterings are later used to accelerate any join between two tables that have some dimension in common; they additionally make it possible to push down and propagate selections (reducing I/O) and to accelerate aggregation and ordering operations. Besides the general framework, we describe an algorithm to derive such a physical co-clustering database automatically and describe query processing and query optimization techniques that can easily be fitted into existing relational engines. We present an experimental evaluation on the TPC-H benchmark in the Vectorwise system, showing that co-clustering can significantly enhance its already high performance and at the same time significantly reduce the memory consumption of the system.
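The abstract does not say how the mix of dimensions is encoded into one clustering key. As a minimal sketch, assuming each dimension value has already been mapped to a small integer bin number and that the bits of those bin numbers are interleaved round-robin, a key could be built as follows. The function name `interleave_bits` and the example dimensions are hypothetical and not taken from the paper.

```python
def interleave_bits(bins, bits_per_dim):
    """Interleave the bits of several dimension bin numbers into one key.

    bins         -- one non-negative bin number per dimension (assumed input)
    bits_per_dim -- number of bits reserved for each dimension
    Bits are taken round-robin, most significant bit first, so rows that
    agree on the leading bits of every dimension get nearby keys.
    """
    key = 0
    for level in range(max(bits_per_dim)):
        for value, width in zip(bins, bits_per_dim):
            if level < width:  # this dimension still has bits left
                bit = (value >> (width - 1 - level)) & 1
                key = (key << 1) | bit
    return key


# Hypothetical rows: (date_bin, region_bin) with 2 bits per dimension.
rows = [(2, 1), (0, 3), (3, 0), (2, 0)]
clustered = sorted(rows, key=lambda r: interleave_bits(r, (2, 2)))
```

Sorting a table by such a key keeps rows with similar values in every participating dimension physically close, which is one plausible reading of how tables with a dimension in common could end up with partially shared clusterings.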
We propose simple models to predict the performance degradation of disk requests due to storage device contention in consolidated virtualized environments. Model parameters can be deduced from measurements obtained inside Virtual Machines (VMs) from a system where a single VM accesses a remote storage server. The parameterized model can then be used to predict the effect of storage contention when multiple VMs are consolidated on the same server. We first propose a trace-driven approach that evaluates a queueing network with fair share scheduling using simulation. The model parameters consider Virtual Machine Monitor level disk access optimizations and rely on a calibration technique. We further present a measurement-based approach that allows a distinct characterization of read/write performance attributes. In particular, we define simple linear prediction models for I/O request mean response times, throughputs and read/write mixes, as well as a simulation model for predicting response time distributions. We found our models to be effective in predicting such quantities across a range of synthetic and emulated application workloads.
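The abstract does not give the form of the linear prediction models or name their predictor variables. As a generic illustration only, the sketch below fits a one-variable line by ordinary least squares and uses it for prediction. The predictor `x` (for instance, some load measure taken in isolation), the function names, and the sample numbers are all assumptions made up for the example, not measurements or definitions from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    """Evaluate a fitted (intercept, slope) model at x."""
    intercept, slope = model
    return intercept + slope * x


# Synthetic, exactly linear calibration points (illustrative only).
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
estimate = predict(model, 5.0)
```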
Group-wise registration of a set of shapes represented by unlabeled point-sets is a challenging problem, since it usually involves solving for point correspondence in a non-rigid motion setting. In this paper, we propose a novel and robust algorithm that is capable of simultaneously computing the mean shape, represented by a probability density function, from multiple unlabeled point-sets and registering them non-rigidly to this emerging mean shape. This algorithm avoids the correspondence problem by minimizing the Jensen-Shannon (JS) divergence between the point sets. We motivate the use of the JS divergence by pointing out its close relationship to hypothesis testing. We derive the analytic gradient of the cost function in order to efficiently achieve the optimal solution. The JS divergence is symmetric, with no bias toward any of the given shapes that are to be registered and whose mean is being sought. A by-product of the registration process is a probabilistic atlas defined as the convex combination of the probability densities of the input point sets being aligned. Our algorithm can be especially useful for creating atlases of various shapes present in images as well as for simultaneously (rigidly or non-rigidly) registering 3D range data sets without having to establish any correspondence. We present experimental results on real and synthetic data.
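For reference, the JS divergence of several distributions with weights w_i is commonly written as the entropy of the weighted mixture minus the weighted sum of the individual entropies. The sketch below evaluates that definition for discrete distributions given as equal-length lists of probabilities. This is a simplification: the paper works with probability densities estimated from point sets, and the function names here are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)


def js_divergence(dists, weights=None):
    """Jensen-Shannon divergence among several discrete distributions.

    Computes H(sum_i w_i * P_i) - sum_i w_i * H(P_i).
    With no weights given, all distributions are weighted equally.
    """
    n = len(dists)
    if weights is None:
        weights = [1.0 / n] * n
    size = len(dists[0])
    # The mixture is the convex combination of the input distributions.
    mixture = [sum(w * d[k] for w, d in zip(weights, dists)) for k in range(size)]
    return entropy(mixture) - sum(w * entropy(d) for w, d in zip(weights, dists))


identical = js_divergence([[0.2, 0.8], [0.2, 0.8]])
disjoint = js_divergence([[1.0, 0.0], [0.0, 1.0]])
```

The value is zero when all inputs coincide and does not depend on the order of the inputs, which matches the symmetry property stated in the abstract. The `mixture` computed inside the function corresponds to the convex combination that the abstract describes as the probabilistic atlas.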
Recent years have seen increasing interest in protecting multimedia data and copyrights, owing to the high volume of information exchange. Attackers try to obtain confidential information from various sources, which makes securing the data important. Many researchers have implemented techniques that hide secret information in order to maintain the integrity and privacy of data. To protect confidential data, histogram-based reversible data hiding combined with cryptographic algorithms is widely used. Therefore, in the proposed work, a robust method for securing digital video is suggested. We implemented histogram bit shifting based reversible data hiding by embedding the encrypted watermark in featured video frames. Histogram bit shifting is used for hiding highly secured watermarks, so that security for the watermark symbol is also achieved. The novelty of the work is that only a few unique frames, selected on the basis of a quality threshold, hold the encrypted watermark symbol. The optimal value for this threshold is obtained using the Firefly Algorithm. The proposed method is capable of hiding high-capacity data in the video signal. The experimental results show higher capacity and video quality compared to other reversible data hiding techniques. The recovered watermark provides better identification under various attacks. High PSNR values and low BER and MSE values are reported in the results.
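To make the reversibility idea concrete, here is a minimal sketch of classic peak/zero-bin histogram shifting on a flat list of 8-bit pixel values. It is not necessarily the variant used in the paper and it leaves out watermark encryption, frame selection and the Firefly-based threshold search. The function names and the requirement of an empty bin to the right of the peak are assumptions of this sketch.

```python
from collections import Counter


def embed(pixels, bits):
    """Hide bits in pixels by shifting the histogram between peak and zero bin."""
    hist = Counter(pixels)
    peak = max(range(256), key=lambda v: hist[v])
    zero = next((v for v in range(peak + 1, 256) if hist[v] == 0), None)
    if zero is None:
        raise ValueError("no empty bin to the right of the peak")
    if len(bits) > hist[peak]:
        raise ValueError("payload exceeds capacity (number of peak pixels)")
    payload = iter(bits)
    marked = []
    for p in pixels:
        if peak < p < zero:
            marked.append(p + 1)                 # make room next to the peak
        elif p == peak:
            marked.append(p + next(payload, 0))  # bit 1 moves the pixel to peak + 1
        else:
            marked.append(p)
    return marked, peak, zero


def extract(marked, peak, zero, n_bits):
    """Recover the payload and restore the original pixels exactly."""
    bits, restored = [], []
    for p in marked:
        if p == peak:
            bits.append(0)
            restored.append(peak)
        elif p == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif peak + 1 < p <= zero:
            restored.append(p - 1)               # undo the shift
        else:
            restored.append(p)
    return bits[:n_bits], restored
```

In this sketch the capacity equals the number of pixels in the peak bin and every pixel changes by at most one grey level. The extractor needs `peak`, `zero` and the payload length as side information.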
In this paper, we investigate the application of Evolving Trees (ET) for the analysis of mass spectrometric data of bacteria.
Evolving Trees are extensions of self-organizing maps (SOMs) developed for hierarchical classification systems. Therefore,
they are well suited for taxonomic problems such as the identification of bacteria. Here, we focus on three topics: an appropriate
pre-processing and encoding of the spectra, an adequate data model by means of a hierarchical Evolving Tree and an interpretable
visualization. First, the high dimensionality of the data is reduced by a compact representation. Here, we employ sparse coding,
specifically tailored for the processing of mass spectra. In the second step, the topographic information which is expected
in the fingerprints is used for advanced tree evaluation and analysis. We adapted the original topographic product for SOMs
to ET in order to judge topography. Additionally, we transferred the concept of the U-matrix, used to evaluate the separability
of SOMs, to its analog in ET. We demonstrate these extensions on two mass spectrometric data sets of bacteria fingerprints
and show their classification and evaluation capabilities in comparison to state-of-the-art techniques.
This study explores how distributing the controls of a video game among multiple players affects the sociality and engagement experienced in game play. A video game was developed in which the distribution of game controls among the players could be varied, thereby affecting the abilities of the individual players to control the game. An experiment was set up in which eight groups of three players were asked to play the video game while the distribution of the game controls was increased in three steps. After each playing session, the players’ experiences of sociality and engagement were assessed using questionnaires. The results showed that distributing game control among the players increased the level of experienced sociality and reduced the level of experienced control. The game in which the controls were partly distributed led to the highest levels of experienced engagement, because the game allowed social play while still giving the players a sense of autonomy. The implications for interaction design are discussed.
Service orientation has been a major buzzword in recent years. While the buzz is on the decline, organizations are slowly but steadily moving towards service-oriented designs. However, service orientation turns out to be as much a managerial challenge as a technical one. The most important complexity drivers in the service-oriented design of information systems seem to be (a) inconsistent design goals of stakeholders and (b) the pursuit of exhaustive service orientation coverage. This research focuses on the following two questions: (1) What are the characteristics of successful implementations of service-oriented information systems, and (2) what are the critical success factors influencing, driving and/or determining these characteristics? Data from an empirical analysis is used to test a set of cause-effect relationship hypotheses based on nine latent variables. At the core of this model we differentiate the variables "overall service orientation infrastructure success" and "service orientation project success". The hypothesized interrelationships between the nine variables lead to a causal model which is proven to hold.