Similar Documents
20 similar documents found (search time: 15 ms)
1.
Successful data warehouse (DW) design needs to be based upon a requirement analysis phase in order to adequately represent the information needs of DW users. Moreover, since the DW integrates the information provided by data sources, it is also crucial to take these sources into account throughout the development process to obtain a consistent reconciliation of data sources and information needs. In this paper, we start by summarizing our approach to specifying user requirements for data warehouses and obtaining a conceptual multidimensional model that captures these requirements. Then, we make use of the multidimensional normal forms to define a set of Query/View/Transformation (QVT) relations to assure that the conceptual multidimensional model obtained from user requirements agrees with the available data sources that will populate the DW. Thus, we propose a hybrid approach to developing DWs: we first obtain the conceptual multidimensional model of the DW from user requirements, and then verify and enforce its correctness against the data sources by using a set of QVT relations based on the multidimensional normal forms. Finally, we provide some snapshots of the CASE tool we have used to implement our QVT relations.
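The core of the hybrid check above — does every element of the requirements-driven multidimensional model have a counterpart in the data sources — can be caricatured with a small sketch. This is not QVT (the article uses QVT relations grounded in the multidimensional normal forms); the fact, attribute, and table names below are invented for illustration.

```python
# Toy source-conformance check: every attribute demanded by the
# requirement-driven multidimensional model must be obtainable from
# some data-source table. All names here are hypothetical.
md_model = {"Sales": ["amount", "date", "product", "city"]}
sources = {"tickets": ["amount", "date", "product_id"],
           "stores": ["city", "region"]}

def unmatched_attributes(model, src):
    """Return, per fact, the model attributes with no source column."""
    available = {col for cols in src.values() for col in cols}
    return {fact: [a for a in attrs if a not in available]
            for fact, attrs in model.items()}

print(unmatched_attributes(md_model, sources))  # {'Sales': ['product']}
```

A real QVT relation would additionally express how a mismatch (here, `product` vs. `product_id`) is to be repaired, rather than merely reported.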

2.
Designing data warehouse (DW) systems in highly dynamic enterprise environments is not an easy task. At each moment, the multidimensional (MD) schema needs to satisfy the set of information requirements posed by the business users. At the same time, the diversity and heterogeneity of the data sources need to be considered in order to properly retrieve the needed data. The frequent arrival of new business needs requires that the system be adaptable to changes. To cope with such inevitable complexity (both at the beginning of the design process and when potential evolution events occur), in this paper we present a semi-automatic method called ORE for creating DW designs in an iterative fashion based on a given set of information requirements. Requirements are first considered separately. For each requirement, ORE expects the set of possible MD interpretations of the source data needed for that requirement (in a form similar to an MD schema). Incrementally, ORE builds the unified MD schema that satisfies the entire set of requirements and meets some predefined quality objectives. We have implemented ORE and performed a number of experiments to study our approach. We have also conducted a limited-scale case study to investigate its usefulness to designers.

3.
Data warehouses (DWs), multidimensional (MD) databases, and On-Line Analytical Processing (OLAP) applications provide companies with many years of historical information for the decision-making process. Owing to the relevance of the information managed by these systems, strong security and confidentiality measures should be specified from the early stages of a DW project, during MD modeling, and then enforced. In recent years, there have been some proposals for accomplishing MD modeling at the conceptual level. Nevertheless, none of them considers security measures as an important element of its model, and they therefore do not allow us to specify confidentiality constraints to be enforced by the applications that will use these MD models. In this paper, we present an Access Control and Audit (ACA) model for conceptual MD modeling. Then, we extend the Unified Modeling Language (UML) with this ACA model, representing the security information (gathered in the ACA model) in the conceptual MD model, thereby allowing us to obtain secure MD models. Moreover, we use the Object Security Constraint Language (OSCL) to specify our ACA model constraints, avoiding in this way an arbitrary use of them. Furthermore, we align our approach with the Model-Driven Architecture, Model-Driven Security, and the Model-Driven Data Warehouse, offering a proposal highly compatible with recent technologies.

4.
Multidimensional (MD) modeling, which is the foundation of data warehouses (DWs), MD databases, and On-Line Analytical Processing (OLAP) applications, is based on several properties that differ from those of traditional database modeling. In the past few years, there have been some proposals, each providing its own formal and graphical notation, for representing the main MD properties at the conceptual level. Unfortunately, none of them has been accepted as a standard for conceptual MD modeling.

In this paper, we present an extension of the Unified Modeling Language (UML) using a UML profile. This profile is defined by a set of stereotypes, constraints, and tagged values that elegantly represent the main MD properties at the conceptual level. We make use of the Object Constraint Language (OCL) to specify the constraints attached to the defined stereotypes, thereby avoiding an arbitrary use of these stereotypes. We have based our proposal on UML for two main reasons: (i) UML is a well-known standard modeling language familiar to most database designers, so designers can avoid learning a new notation, and (ii) UML can easily be extended and tailored to a specific domain with concrete peculiarities, such as multidimensional modeling for data warehouses. Moreover, our proposal is Model-Driven Architecture (MDA) compliant, and we use the Query/View/Transformation (QVT) approach for automatic generation of the implementation on a target platform. Throughout the paper, we describe how to easily accomplish the MD modeling of DWs at the conceptual level. Finally, we show how to use our extension in Rational Rose for MD modeling.


5.
A simple technique is presented for verifying that two abstract data type specifications are equivalent in the sense that they have isomorphic initial algebras. The method uses normal forms in an attempt to reduce the number of equations to be checked. It is applied to a simple example and some extensions; related problems are also discussed.

6.
The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure data, which facilitates easier data analysis for decision makers. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries that the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying, from the very beginning of the development, whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback, consisting of an extension to the Object Constraint Language (OCL) that includes a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as part of the specification of the data warehouse's conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented with the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.
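The kind of predefined OLAP operator discussed above can be illustrated with a minimal sketch of a roll-up, the operator that aggregates measures upward along a dimension hierarchy. The fact table, hierarchy, and function names below are invented for illustration, not taken from the article.

```python
# Minimal sketch of an OLAP roll-up: aggregate a fact table from the
# City level to the Country level of a hypothetical Location hierarchy.
from collections import defaultdict

# Fact rows: (city, sales). The city -> country map plays the role of
# one level-to-level mapping in a dimension hierarchy.
facts = [("Alicante", 120), ("Madrid", 200), ("Lyon", 80), ("Paris", 150)]
hierarchy = {"Alicante": "Spain", "Madrid": "Spain",
             "Lyon": "France", "Paris": "France"}

def roll_up(rows, level_map):
    """Sum measures after mapping each member to its parent level."""
    totals = defaultdict(int)
    for member, measure in rows:
        totals[level_map[member]] += measure
    return dict(totals)

print(roll_up(facts, hierarchy))  # {'Spain': 320, 'France': 230}
```

In the article's setting, such an operator would be written once in the conceptual model (as OCL) and then generated for a concrete OLAP platform by the MDA tool chain.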

7.
The authors propose an approach that provides a theoretical foundation for the use of object-oriented databases and object-relational databases in data warehouse, multidimensional database, and online analytical processing applications. This approach introduces a set of minimal constraints and extensions to the Unified Modeling Language for representing multidimensional modeling properties for these applications. Multidimensional modeling offers two benefits. First, the model closely parallels how data analysts think and, therefore, helps users understand data. Second, multidimensional modeling helps predict what end users want to do, thereby facilitating performance improvements. The authors are using their approach to create an automatic implementation of a multidimensional model. They also plan to integrate commercial online-analytical-processing tool facilities within their GOLD model CASE tool, a task that involves data warehouse prototyping and sample data generation issues.

8.
Data warehouses (DWs), multidimensional (MD) databases, and On-Line Analytical Processing applications are used as a very powerful mechanism for discovering crucial business information. Considering the extreme importance of the information managed by these kinds of applications, it is essential to specify security measures from the early stages of DW design in the MD modeling process, and to enforce them. In past years, some proposals for representing the main MD modeling properties at the conceptual level have been put forward. Nevertheless, none of these proposals considers security issues as an important element of its model, so they do not allow us to specify confidentiality constraints to be enforced by the applications that will use these MD models. In this paper, we discuss the specific confidentiality problems regarding DWs and present an extension of the Unified Modeling Language for specifying security constraints in conceptual MD modeling, thereby allowing us to design secure DWs. One key advantage of our approach is that we accomplish the conceptual modeling of secure DWs independently of the target platform where the DW is to be implemented, allowing the corresponding DWs to be implemented on any secure commercial database management system. Finally, we present a case study to show how a conceptual model designed with our approach can be directly implemented on top of Oracle 10g.

9.
Several approaches have been proposed for representing uncertain data in a database. These approaches have typically extended the relational model by incorporating probability measures to capture the uncertainty associated with data items. However, previous research has not directly addressed the issue of normalization for reducing data redundancy and data anomalies in probabilistic databases. We examine this issue. To that end, we generalize the concept of functional dependency to stochastic dependency and use it to extend the scope of normal forms to probabilistic databases. Our approach is a consistent extension of conventional normalization theory and reduces to the latter.
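As a rough illustration of the classical notion being generalized, the sketch below checks whether a functional dependency X → Y holds in a relation; the article's stochastic dependency relaxes exactly this strict agreement to a probability measure. The relation and attribute names are invented for illustration.

```python
# Minimal check of a functional dependency X -> Y over a relation
# (a list of dicts). A stochastic dependency, as in the article,
# would replace the strict equality test with a probability measure.
def holds_fd(relation, x, y):
    """True iff every X-value determines a single Y-value."""
    seen = {}
    for row in relation:
        key, val = row[x], row[y]
        if key in seen and seen[key] != val:
            return False  # the same X value maps to two Y values
        seen[key] = val
    return True

r = [{"emp": "a", "dept": "d1"},
     {"emp": "b", "dept": "d1"},
     {"emp": "c", "dept": "d2"}]
print(holds_fd(r, "emp", "dept"))  # True: each employee has one dept
print(holds_fd(r, "dept", "emp"))  # False: d1 maps to both a and b
```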

10.
Active data warehouses: complementing OLAP with analysis rules
Conventional data warehouses are passive. All tasks related to analysing data and making decisions must be carried out manually by analysts. Today's data warehouse and OLAP systems offer little support for automating decision tasks that occur frequently and for which well-established decision procedures are available. Such functionality can be provided by extending the conventional data warehouse architecture with analysis rules, which mimic the work of an analyst during decision making. Analysis rules extend the basic event/condition/action (ECA) rule structure with mechanisms for analysing data multidimensionally and for making decisions. The resulting architecture is called an active data warehouse.
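The event/condition/action structure described above can be sketched in a few lines; the rule class, the toy "cube", and the threshold are hypothetical illustrations, not the article's design.

```python
# Hypothetical sketch of an ECA-style analysis rule: on a "load
# finished" event, analyse a cube slice and return a decision action.
class AnalysisRule:
    def __init__(self, event, condition, action):
        self.event, self.condition, self.action = event, condition, action

    def fire(self, event, cube):
        """Run the rule: matching event + true condition -> action."""
        if event == self.event and self.condition(cube):
            return self.action(cube)
        return None

# Toy "cube": weekly sales per product.
cube = {"widget": [90, 85, 70], "gadget": [120, 130, 140]}

rule = AnalysisRule(
    event="weekly_load_finished",
    condition=lambda c: min(c["widget"]) < 80,  # sales dipped below 80
    action=lambda c: "reorder widget",          # routine, automatable decision
)
print(rule.fire("weekly_load_finished", cube))  # reorder widget
```

The "analyse data multidimensionally" part of a real analysis rule would replace the `condition` lambda with OLAP-style navigation over the cube.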

11.
Views over databases have regained attention in the context of data warehouses, which are seen as materialized views. In this setting, efficient view maintenance is an important issue, for which the notion of self-maintainability has been identified as desirable. In this paper, we extend the concept of self-maintainability to (query and update) independence within a formal framework, where independence with respect to arbitrary given sets of queries and updates over the sources can be guaranteed. To this end we establish an intuitively appealing connection between warehouse independence and view complements. Moreover, we study special kinds of complements, namely monotonic complements, and show how to compute minimal ones in the presence of keys and foreign keys in the underlying databases. Taking advantage of these complements, an algorithmic approach is proposed for the specification of independent warehouses with respect to given sets of queries and updates. Received: 21 November 2000 / Accepted: 1 May 2001 / Published online: 6 September 2001

12.
A Bianchi type I cosmological model in (n + 1)-dimensional gravity with several forms is considered. When the electric non-composite brane ansatz is adopted, the Wheeler–DeWitt (WDW) equation for the model, written in a conformally covariant form, is analyzed. Under certain restrictions, asymptotic solutions to the WDW equation near the singularity are found, which reduce the problem to the so-called quantum billiard on the (n − 1)-dimensional Lobachevsky space H^(n−1). Two examples of quantum billiards are considered: a 2-dimensional quantum billiard for a 4D model with three 2-forms, and a 9-dimensional quantum billiard for an 11D model with 120 4-forms, which mimics the SM2-brane sector of D = 11 supergravity. For certain solutions, vanishing of the wave function at the singularity is proved.

13.
Protein–protein interaction data have high false-positive and false-negative rates, which directly causes large errors when computational methods predict protein complexes from them. To compensate for this inherent deficiency of the data, a new protein-complex prediction algorithm that combines multiple data sources is proposed. Matching analysis and GO functional enrichment analysis are used to evaluate the algorithm's performance. Test results show that the new algorithm substantially outperforms previous algorithms.

14.
An algorithm that fits a continuous function to sparse multidimensional data is presented. The algorithm uses a representation in terms of lower-dimensional component functions of coordinates defined in an automated way and also permits dimensionality reduction. Neural networks are used to construct the component functions.
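The idea of representing a multivariate function as a sum of lower-dimensional component functions can be shown with a first-order decomposition f(x, y) ≈ f0 + f1(x) + f2(y). The sketch below fits the components by simple grid averages rather than the neural networks the algorithm actually uses; all names are illustrative.

```python
# Sketch of a first-order decomposition f(x, y) ~ f0 + f1(x) + f2(y):
# a sum of low-dimensional component functions, here fitted by plain
# grid averages instead of the neural networks used by the algorithm.
def first_order_components(data):
    """data: dict mapping (x, y) grid points to function values."""
    xs = sorted({x for x, _ in data})
    ys = sorted({y for _, y in data})
    f0 = sum(data.values()) / len(data)                      # overall mean
    f1 = {x: sum(data[(x, y)] for y in ys) / len(ys) - f0 for x in xs}
    f2 = {y: sum(data[(x, y)] for x in xs) / len(xs) - f0 for y in ys}
    return f0, f1, f2

# An additive test function f(x, y) = 2x + 3y is reproduced exactly
# by a first-order decomposition.
data = {(x, y): 2 * x + 3 * y for x in range(4) for y in range(4)}
f0, f1, f2 = first_order_components(data)
print(f0 + f1[2] + f2[3], data[(2, 3)])  # 13.0 13
```

Higher-order terms (pairwise component functions, etc.) and the automated choice of coordinates are what let the full algorithm handle non-additive functions and reduce dimensionality.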

Program summary

Program title: RS_HDMR_NN
Catalogue identifier: AEEI_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEI_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 19 566
No. of bytes in distributed program, including test data, etc.: 327 856
Distribution format: tar.gz
Programming language: MatLab R2007b
Computer: any computer running MatLab
Operating system: Windows XP, Windows Vista, UNIX, Linux
Classification: 4.9
External routines: Neural Network Toolbox Version 5.1 (R2007b)
Nature of problem: Fitting a smooth, easily integrable and differentiable function to a very sparse (∼2–3 points per dimension) multidimensional (D ≥ 6) large (∼10^4–10^5 data) dataset.
Solution method: A multivariate function is represented as a sum of a small number of terms, each of which is a low-dimensional function of optimised coordinates. The optimal coordinates reduce both the dimensionality and the number of the terms. Neural networks (including exponential neurons) are used to obtain a general and robust method and a functional form which is easily differentiated and integrated (in the case of exponential neurons).
Running time: Depends strongly on the dataset to be modelled and the chosen structure of the approximating function; ranges from about a minute for ∼10^3 data in 3-D to about a day for ∼10^5 data in 15-D.

15.
We define binary equality implication constraints (BEICs) in relational databases and study the implication problem for these constraints; in particular, we provide a sound and complete set of inference rules for a common subset of BEICs. Two normal forms with respect to BEICs are defined and shown to be necessary and sufficient to prevent the different types of data redundancies that may be caused by these constraints.
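One plausible reading of such a constraint is "whenever attribute X takes value x, attribute Y must take value y"; the toy check below follows that reading. This is our illustration only, not the paper's formal definition, and the relation and names are invented.

```python
# Toy check of an equality implication constraint of the rough form
# (X = x) -> (Y = y). This is an illustrative reading, not the
# paper's formal BEIC definition; all names are hypothetical.
def equality_implication_holds(relation, x_attr, x_val, y_attr, y_val):
    """True iff every row with X = x also has Y = y."""
    return all(row[y_attr] == y_val
               for row in relation if row[x_attr] == x_val)

r = [{"grade": "A", "passed": True},
     {"grade": "A", "passed": True},
     {"grade": "F", "passed": False}]
print(equality_implication_holds(r, "grade", "A", "passed", True))  # True
```

Storing `passed` alongside `grade` under such a constraint is exactly the kind of redundancy the paper's normal forms are designed to rule out.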

16.
17.
Research on implementation techniques for large-scale data warehouses
Large-scale data warehouses are an effective way to store massive amounts of data, but their implementation raises many problems. Based on an analysis of these problems, this paper proposes strategies for implementing large-scale data warehouses and studies several key techniques: efficient computation of data cubes, incremental update maintenance, index optimization, failure recovery, cost models for schema design and query optimization, and the definition and management of metadata.

18.
This paper presents a general framework for tracking the time differences of arrival of multiple acoustic sources recorded by distributed microphone pairs. Tracking is based on a three-stage analysis. Complex-valued propagation models are extracted at different time instants and frequencies using either independent component analysis or the phase of the cross-power spectrum evaluated at each microphone pair. In both cases, approximated densities of the propagation time delays are derived through the generalized state coherence transform. A sequential Bayesian tracking scheme with integrated activity detection is finally implemented through disjoint particle filters based on a track-before-detect strategy. Experiments on both synthetic and real data recorded by two distributed microphone pairs show that the proposed framework can detect and track up to five sources simultaneously active in a reverberant environment.
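The quantity being tracked — the time difference of arrival between two microphones — can be estimated in its simplest form by picking the lag that maximises the cross-correlation of the two signals. The sketch below does exactly that for a single impulsive source; the article's cross-power-spectrum and particle-filter machinery is far more elaborate, and the signal names here are invented.

```python
# Toy time-difference-of-arrival estimate: choose the integer lag that
# maximises the cross-correlation between two microphone signals.
def estimate_tdoa(sig_a, sig_b, max_lag):
    def xcorr(lag):
        # Correlate sig_a against sig_b shifted by `lag` samples.
        return sum(sig_a[n] * sig_b[n - lag]
                   for n in range(len(sig_a))
                   if 0 <= n - lag < len(sig_b))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

pulse = [0.0] * 20
pulse[5] = 1.0        # impulse reaches mic A at sample 5
delayed = [0.0] * 20
delayed[8] = 1.0      # same impulse reaches mic B at sample 8
print(estimate_tdoa(pulse, delayed, 10))  # -3: mic A leads by 3 samples
```

With several simultaneous sources and reverberation, this single-peak picture breaks down, which is why the article builds densities over time delays and tracks them with particle filters instead.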

19.
Time-series analysis is a powerful technique for discovering patterns and trends in temporal data. However, the lack of a conceptual model for this data-mining technique forces analysts to deal with unstructured data. These data are represented at a low level of abstraction, and managing them is expensive. Most analysts face two main problems: (i) cleansing the huge amount of potentially analysable data and (ii) correctly defining the data-mining algorithms to be employed. Because analysts' interests are also hidden in this scenario, it is difficult not only to prepare the data but also to discover which data are the most promising. Since their appearance, data warehouses have therefore proved to be a powerful repository of historical data for data-mining purposes. Moreover, their foundational modelling paradigm, multidimensional modelling, is very similar to the problem domain. In this article, we propose a Unified Modelling Language (UML) extension, through UML profiles, for data mining. Specifically, the UML profile presented allows us to specify time-series analysis on top of the multidimensional models of data warehouses. Our extension provides analysts with an intuitive notation for time-series analysis that is independent of any specific data-mining tool or algorithm. In order to show its feasibility and ease of use, we apply it to the analysis of fish captures in Alicante. We believe that a coherent conceptual modelling framework for data mining assures a better and easier knowledge-discovery process on top of data warehouses.

20.
Active data warehouses belong to a new category of decision support systems, which automate decision making for routine decision tasks and semi-routine decision tasks. Just as active database systems extend conventional database systems with event–condition–action rules for integrity constraint enforcement or procedure execution, active data warehouses extend conventional data warehouses with analysis rules that mimic the work of an analyst during decision making. This paper demonstrates how analysis rules can be implemented on top of a passive relational data warehouse system by using commercially available database technology. Copyright © 2002 John Wiley & Sons, Ltd.

