
1.

Reconciling requirement-driven data warehouses with data sources via multidimensional normal forms

《Data & Knowledge Engineering》2008,64(3):725-751

Successful data warehouse (DW) design needs to be based upon a requirement analysis phase in order to adequately represent the information needs of DW users. Moreover, since the DW integrates the information provided by data sources, it is also crucial to take these sources into account throughout the development process to obtain a consistent reconciliation of data sources and information needs. In this paper, we start by summarizing our approach to specify user requirements for data warehouses and to obtain a conceptual multidimensional model capturing these requirements. Then, we make use of the multidimensional normal forms to define a set of Query/View/Transformation (QVT) relations to assure that the conceptual multidimensional model obtained from user requirements agrees with the available data sources that will populate the DW. Thus, we propose a hybrid approach to develop DWs, i.e., we first obtain the conceptual multidimensional model of the DW from user requirements and then we verify and enforce its correctness against data sources by using a set of QVT relations based on multidimensional normal forms. Finally, we provide some snapshots of the CASE tool we have used to implement our QVT relations.

2.

A requirement-driven approach to the design and evolution of data warehouses

《Information Systems》2014

Designing data warehouse (DW) systems in highly dynamic enterprise environments is not an easy task. At each moment, the multidimensional (MD) schema needs to satisfy the set of information requirements posed by the business users. At the same time, the diversity and heterogeneity of the data sources need to be considered in order to properly retrieve needed data. Frequent arrival of new business needs requires that the system is adaptable to changes. To cope with such an inevitable complexity (both at the beginning of the design process and when potential evolution events occur), in this paper we present a semi-automatic method called ORE, for creating DW designs in an iterative fashion based on a given set of information requirements. Requirements are first considered separately. For each requirement, ORE expects the set of possible MD interpretations of the source data needed for that requirement (in a form similar to an MD schema). Incrementally, ORE builds the unified MD schema that satisfies the entire set of requirements and meets some predefined quality objectives. We have implemented ORE and performed a number of experiments to study our approach. We have also conducted a limited-scale case study to investigate its usefulness to designers.

3.

Specification-based data reduction in dimensional data warehouses

Janne Skyt Christian S. Jensen Torben Bach Pedersen 《Information Systems》2008

Many data warehouses contain massive amounts of data, accumulated over long periods of time. In some cases, it is necessary or desirable to either delete “old” data or to maintain the data at an aggregate level. This may be due to privacy concerns, in which case the data are aggregated to levels that ensure anonymity. Another reason is the desire to maintain a balance between the uses of data that change as the data age and the size of the data, thus avoiding overly large data warehouses. This paper presents effective techniques for data reduction that enable the gradual aggregation of detailed data as the data age. With these techniques, data may be aggregated to higher levels as they age, enabling the maintenance of more compact, consolidated data and the compliance with privacy requirements. Special care is taken to avoid semantic problems in the aggregation process. The paper also describes the querying of the resulting data warehouses and an implementation strategy based on current database technology.
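The idea of aggregating data to higher levels as they age can be illustrated with a minimal Python sketch. This is not the paper's specification language or algorithm: the function name `reduce_by_age`, the one-year threshold, and the (day, product, amount) fact layout are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def reduce_by_age(facts, today, max_detail_days=365):
    """Keep recent facts at day level; roll older facts up to month level.

    `facts` is a list of (day, product, amount) tuples (hypothetical layout).
    """
    recent = []
    monthly = defaultdict(float)
    for day, product, amount in facts:
        if (today - day).days <= max_detail_days:
            # Recent facts keep their full day-level detail.
            recent.append((day, product, amount))
        else:
            # Older facts lose the day: only (year, month, product) survives.
            monthly[(day.year, day.month, product)] += amount
    return recent, dict(monthly)


facts = [
    (date(2023, 1, 5), "widget", 10.0),
    (date(2023, 1, 20), "widget", 5.0),
    (date(2024, 6, 1), "widget", 7.0),
]
recent, monthly = reduce_by_age(facts, today=date(2024, 6, 30))
```

Here the two January 2023 facts collapse into a single monthly total, while the June 2024 fact stays at day level.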

4.

Extending OCL for OLAP querying on conceptual multidimensional models of data warehouses

Jesús Pardillo Jose-Norberto Mazón 《Information Sciences》2010,180(5):584-5028

The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure data, which makes data analysis easier for decision makers. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries which the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying from the very beginning of the development whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback consisting of an extension to the object constraint language (OCL), which has been developed to include a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as a part of the specification of the data warehouse conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented in the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.

5.

A UML profile for multidimensional modeling in data warehouses

Sergio Juan Il-Yeol 《Data & Knowledge Engineering》2006,59(3):725-769

The multidimensional (MD) modeling, which is the foundation of data warehouses (DWs), MD databases, and On-Line Analytical Processing (OLAP) applications, is based on several properties different from those in traditional database modeling. In the past few years, there have been some proposals, providing their own formal and graphical notations, for representing the main MD properties at the conceptual level. However, unfortunately none of them has been accepted as a standard for conceptual MD modeling.

In this paper, we present an extension of the Unified Modeling Language (UML) using a UML profile. This profile is defined by a set of stereotypes, constraints and tagged values to elegantly represent main MD properties at the conceptual level. We make use of the Object Constraint Language (OCL) to specify the constraints attached to the defined stereotypes, thereby avoiding an arbitrary use of these stereotypes. We have based our proposal on UML for two main reasons: (i) UML is a well-known standard modeling language familiar to most database designers, so designers can avoid learning a new notation, and (ii) UML can be easily extended so that it can be tailored for a specific domain with concrete peculiarities such as the multidimensional modeling for data warehouses. Moreover, our proposal is Model Driven Architecture (MDA) compliant and we use the Query View Transformation (QVT) approach for an automatic generation of the implementation in a target platform. Throughout the paper, we will describe how to easily accomplish the MD modeling of DWs at the conceptual level. Finally, we show how to use our extension in Rational Rose for MD modeling.

6.

Developing secure data warehouses with a UML extension

Eduardo Fernández-Medina Juan Trujillo Rodolfo Villarroel Mario Piattini 《Information Systems》2007

Data Warehouses (DWs), Multidimensional (MD) Databases, and On-Line Analytical Processing Applications are used as a very powerful mechanism for discovering crucial business information. Considering the extreme importance of the information managed by these kinds of applications, it is essential to specify security measures from the early stages of the DW design in the MD modeling process, and enforce them. In the past years, some proposals for representing main MD modeling properties at the conceptual level have been stated. Nevertheless, none of these proposals considers security issues as an important element in its model, so they do not allow us to specify confidentiality constraints to be enforced by the applications that will use these MD models. In this paper, we will discuss the specific confidentiality problems regarding DWs as well as present an extension of the Unified Modeling Language for specifying security constraints in the conceptual MD modeling, thereby allowing us to design secure DWs. One key advantage of our approach is that we accomplish the conceptual modeling of secure DWs independently of the target platform where the DW has to be implemented, allowing the implementation of the corresponding DWs on any secure commercial database management system. Finally, we will present a case study to show how a conceptual model designed with our approach can be directly implemented on top of Oracle 10g.

7.

From conceptual models to schemata: An object-process-based data warehouse construction method

Dov Dori Roman Feldman Arnon Sturm 《Information Systems》2008

Data warehouse modeling is a complex task, which involves knowledge of business processes of the domain of discourse, understanding the structural and behavioral system's conceptual model, and familiarity with data warehouse technologies. The suitability of current data warehouse modeling methods for large-scale systems is questionable, as they require multiple manual actions to discover measures and relevant dimensional entities and they tend to disregard the system's dynamic aspects. We present an Object-process-based Data Warehouse Construction (ODWC) method that overcomes these limitations of existing methods by utilizing the operational system conceptual model to construct a corresponding data warehouse schema. We specify the ODWC method, apply it to a case study, evaluate it, and compare it to existing methods.

8.

Comparing data type specifications via their normal forms

J. L. Remy P. A. S. Veloso 《International journal of parallel programming》1982,11(3):141-153

A simple technique is presented for verifying that two abstract data type specifications are equivalent in that they have isomorphic initial algebras. The method uses normal forms to attempt to reduce the number of equations to be checked. It is applied to a simple example and some extensions, and related problems are also discussed.

9.

Graphical normal forms based on root dependencies in relational data base systems

Sudhir K. Arora K. C. Smith 《International journal of parallel programming》1981,10(4):235-259

Normal forms and dependencies are an area of great current interest in the design of relational data bases. Only a subclass, namely, root dependencies and the normal forms based on them, are of direct interest to the data base designer. Dependencies outside this subclass do not have clear-cut semantics and may in the long run prove to be of theoretical interest only. We have proposed the fifth normal form (5NF) to control the pattern of codependency, the highest known root dependency. We have also shown a strong parallel between root dependencies and their normal forms and a family of hypergraphs called S-diagrams. Graphical normal forms, based on S-diagrams, have been proposed and their equivalence to conventional normal forms proved. This work was supported in part by the Science and Engineering Research Board Grant Number 214–7248.

10.

A data warehouse to explore multidimensional simulated data from a spatially distributed agro-hydrological model to improve catchment nitrogen management

《Environmental Modelling & Software》2017


11.

Multidimensional data modeling for location-based services (total citations: 5; self-citations: 0; citations by others: 5)

Christian S. Jensen Augustas Kligys Torben Bach Pedersen Igor Timko 《The VLDB Journal The International Journal on Very Large Data Bases》2004,13(1):1-21

With the recent and continuing advances in areas such as wireless communications and positioning technologies, mobile, location-based services are becoming possible. Such services deliver location-dependent content to their users. More specifically, these services may capture the movements and requests of their users in multidimensional databases, i.e., data warehouses, and content delivery may be based on the results of complex queries on these data warehouses. Such queries aggregate detailed data in order to find useful patterns, e.g., in the interaction of a particular user with the services. The application of multidimensional technology in this context poses a range of new challenges. The specific challenge addressed here concerns the provision of an appropriate multidimensional data model. In particular, the paper extends an existing multidimensional data model and algebraic query language to accommodate spatial values that exhibit partial containment relationships instead of the total containment relationships normally assumed in multidimensional data models. Partial containment introduces imprecision in aggregation paths. The paper proposes a method for evaluating the imprecision of such paths. The paper also offers transformations of dimension hierarchies with partial containment relationships to simple hierarchies, to which existing precomputation techniques are applicable. Received: 28 September 2002, Accepted: 5 April 2003, Published online: 12 August 2003. Edited by: J. Veijalainen. Correspondence to: I. Timko

12.

Strict feedforward control systems, linearisability, and convergent normal forms

Issa Amadou Tall 《International journal of control》2013,86(10):1994-2011

This article discusses the feedback equivalence of multi-input feedforward control systems via smooth (resp. analytic) feedback transformations. We first address the state (resp. feedback) linearisation problem, and provide easily computable algorithms that yield explicit state (resp. feedback) linearising coordinates for systems in strict feedforward form. The application of the algorithms does not require checking the commutativity (resp. involutivity) of the distributions associated with the system, and the algorithms fail after a few steps if the system is not linearisable. In the latter case, the algorithms are extended to provide coordinate systems bringing the system into a normal form which is a smooth (resp. analytic) counterpart of Kang's formal normal form. Illustrative examples for both the linearisation and convergent normal form include the vertical take-off and landing aircraft and the multi-vehicle wireless testbed, among others.

13.

Semi-global stabilization of minimum phase nonlinear systems in special normal forms

Andrew R. Teel 《Systems & Control Letters》1992,19(3)

We semi-globally stabilize certain minimum phase nonlinear systems which are in a normal form where the nonlinear subsystem is driven by an output of a linear system that possesses (possibly) nonzero peaking exponents. We eliminate the peaking phenomenon by stabilizing part of the linear system with a high-gain linear control and part of the linear system with a small, bounded control. The interpretation of this approach will be that we are redefining the outputs to add asymptotically stable nonlinear zeros to the system in a manner that allows the new composite zero dynamics to be asymptotically stable on arbitrarily large compact sets.

14.

Optimizing multiple dimensional queries simultaneously in multidimensional databases (total citations: 1; self-citations: 0; citations by others: 1)

Weifa Liang Maria E. Orlowska Jeffrey X. Yu 《The VLDB Journal The International Journal on Very Large Data Bases》2000,8(3-4):319-338

Some significant progress related to multidimensional data analysis has been achieved in the past few years, including the design of fast algorithms for computing datacubes, selecting some precomputed group-bys to materialize, and designing efficient storage structures for multidimensional data. However, little work has been carried out on multidimensional query optimization issues. Particularly the response time (or evaluation cost) for answering several related dimensional queries simultaneously is crucial to the OLAP applications. Recently, Zhao et al. first exploited this problem by presenting three heuristic algorithms. In this paper we first consider in detail two cases of the problem in which all the queries are either hash-based star joins or index-based star joins only. In the case of the hash-based star join, we devise a polynomial approximation algorithm which delivers a plan whose evaluation cost is $O(n^{\epsilon})$ times the optimal, where n is the number of queries and $\epsilon$ is a fixed constant. We also present an exponential algorithm which delivers a plan with the optimal evaluation cost. In the case of the index-based star join, we present a heuristic algorithm which delivers a plan whose evaluation cost is n times the optimal, and an exponential algorithm which delivers a plan with the optimal evaluation cost. We then consider a general case in which both hash-based star-join and index-based star-join queries are included. For this case, we give a possible improvement on the work of Zhao et al., based on an analysis of their solutions. We also develop another heuristic and an exact algorithm for the problem. We finally conduct a performance study by implementing our algorithms. The experimental results demonstrate that the solutions delivered for the restricted cases are always within two times of the optimal, which confirms our theoretical upper bounds. Actually these experiments produce much better results than our theoretical estimates.
To the best of our knowledge, this is the only development of polynomial algorithms for the first two cases which are able to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated. The previous approaches including that of [ZDNS98] may generate a feasible plan for the problem in these two cases, but they do not provide any performance guarantee, i.e., the plans generated by their algorithms can be arbitrarily far from the optimal one. Received: July 21, 1998 / Accepted: August 26, 1999

15.

Combining objects with rules to represent aggregation knowledge in data warehouse and OLAP systems

Nicolas Prat Isabelle Comyn-Wattiau 《Data & Knowledge Engineering》2011,70(8):732-752

Data warehouses are based on multidimensional modeling. Using On-Line Analytical Processing (OLAP) tools, decision makers navigate through and analyze multidimensional data. Typically, users need to analyze data at different aggregation levels (using roll-up and drill-down functions). Therefore, aggregation knowledge should be adequately represented in conceptual multidimensional models, and mapped in subsequent logical and physical models. However, current conceptual multidimensional models poorly represent aggregation knowledge, which (1) has a complex structure and dynamics and (2) is highly contextual. In order to account for the characteristics of this knowledge, we propose to represent it with objects (UML class diagrams) and rules in the Production Rule Representation language (PRR). Static aggregation knowledge is represented in the class diagrams, while rules represent the dynamics (i.e. how aggregation may be performed depending on context). We present the class diagrams, and a typology and examples of associated rules. We argue that this representation of aggregation knowledge enables an early modeling of user requirements in a data warehouse project. A prototype has been developed based on the Java Expert System Shell (Jess).
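To make the notion of context-dependent aggregation concrete, the following Python sketch rolls daily values up to months using a rule table that selects the aggregate function per measure and dimension. It is a plain-Python stand-in, not PRR or Jess; the `RULES` table, the `roll_up` function, and the sample measures (`sales`, `stock_level`) are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Hypothetical rule table: which aggregate applies to a measure along a dimension.
RULES = {
    ("sales", "time"): sum,
    # A stock level is not additive over time, so take the last recorded value.
    ("stock_level", "time"): lambda values: values[-1],
}


def roll_up(rows, measure, dimension, parent_of):
    """Group (member, value) rows by parent member and apply the matching rule."""
    groups = defaultdict(list)
    for member, value in rows:
        groups[parent_of[member]].append(value)
    aggregate = RULES[(measure, dimension)]
    return {parent: aggregate(values) for parent, values in groups.items()}


# Day-to-month hierarchy and sample rows, listed in chronological order.
month_of = {"2024-01-01": "2024-01", "2024-01-02": "2024-01", "2024-02-01": "2024-02"}
rows = [("2024-01-01", 4), ("2024-01-02", 6), ("2024-02-01", 5)]

sales = roll_up(rows, "sales", "time", month_of)
stock = roll_up(rows, "stock_level", "time", month_of)
```

The same rows yield different monthly figures depending on the measure: sales are summed, while the stock level keeps the last value of each month.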

16.

Stabilization for nonlinear systems via a limited capacity communication channel with data packet dropout

Lei ZHOU Guoping LU 《控制理论与应用(英文版)》2010,8(1):111-116

This paper addresses the stabilization problem for a class of nonlinear systems. It is assumed that the controller can only receive the transmitted sequence of finite coded signals via a limited digital communication channel. Both state and output feedback coder-decoder-controller procedures are proposed. Stabilization conditions involving the size of coding alphabet, the sampling period, system state growth rate and data packet dropout rate are obtained. Finally, an example is given to illustrate the design procedures and effectiveness of the proposed results.