1.

A UML profile for multidimensional modeling in data warehouses

Sergio Juan Il-Yeol 《Data & Knowledge Engineering》2006,59(3):725-769

Multidimensional (MD) modeling, which is the foundation of data warehouses (DWs), MD databases, and On-Line Analytical Processing (OLAP) applications, is based on several properties different from those in traditional database modeling. In the past few years, several proposals, each with its own formal and graphical notation, have been made for representing the main MD properties at the conceptual level. However, none of them has been accepted as a standard for conceptual MD modeling.

In this paper, we present an extension of the Unified Modeling Language (UML) using a UML profile. This profile is defined by a set of stereotypes, constraints and tagged values to elegantly represent the main MD properties at the conceptual level. We make use of the Object Constraint Language (OCL) to specify the constraints attached to the defined stereotypes, thereby avoiding an arbitrary use of these stereotypes. We have based our proposal on UML for two main reasons: (i) UML is a well-known standard modeling language familiar to most database designers, so designers can avoid learning a new notation, and (ii) UML can be easily extended so that it can be tailored for a specific domain with concrete peculiarities, such as multidimensional modeling for data warehouses. Moreover, our proposal is Model Driven Architecture (MDA) compliant and we use the Query View Transformation (QVT) approach for automatic generation of the implementation in a target platform. Throughout the paper, we describe how to easily accomplish the MD modeling of DWs at the conceptual level. Finally, we show how to use our extension in Rational Rose for MD modeling.

2.

Specification-based data reduction in dimensional data warehouses

Janne Skyt Christian S. Jensen Torben Bach Pedersen 《Information Systems》2008

Many data warehouses contain massive amounts of data, accumulated over long periods of time. In some cases, it is necessary or desirable to either delete “old” data or to maintain the data at an aggregate level. This may be due to privacy concerns, in which case the data are aggregated to levels that ensure anonymity. Another reason is the desire to maintain a balance between the uses of data that change as the data age and the size of the data, thus avoiding overly large data warehouses. This paper presents effective techniques for data reduction that enable the gradual aggregation of detailed data as the data age. With these techniques, data may be aggregated to higher levels as they age, enabling the maintenance of more compact, consolidated data and compliance with privacy requirements. Special care is taken to avoid semantic problems in the aggregation process. The paper also describes the querying of the resulting data warehouses and an implementation strategy based on current database technology.
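To make the idea of gradual aggregation concrete, here is a minimal Python sketch. It is not the paper's technique or specification language: the function name `reduce_old_facts`, the tuple layout, and the single day-to-month roll-up are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import date


def reduce_old_facts(facts, cutoff):
    """Roll up facts dated before `cutoff` from day level to month level.

    `facts` is a list of (day, product, amount) tuples. Newer facts are
    returned unchanged; older ones are summed per (year, month, product).
    """
    recent = []
    monthly = defaultdict(float)
    for day, product, amount in facts:
        if day < cutoff:
            monthly[(day.year, day.month, product)] += amount
        else:
            recent.append((day, product, amount))
    return recent, dict(monthly)


# Hypothetical sample data: three old rows and one recent row.
facts = [
    (date(2007, 1, 3), "A", 10.0),
    (date(2007, 1, 20), "A", 5.0),
    (date(2007, 2, 1), "B", 7.0),
    (date(2008, 3, 9), "A", 2.0),
]
recent, monthly = reduce_old_facts(facts, cutoff=date(2008, 1, 1))
```

After the call, the detailed 2007 rows exist only as monthly totals, while the 2008 row keeps its daily granularity.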

3.

Incremental maintenance of object-oriented data warehouses

Ching-Ming Chao 《Information Sciences》2004,160(1-4):91-110

Incremental maintenance of data warehouses has attracted a lot of research attention over the past few years. Nevertheless, most of the previous work is confined to the relational setting. Recently, object-oriented data warehouses have been regarded as a better means to integrate data from modern heterogeneous data sources. However, existing approaches to incremental maintenance of data warehouses do not directly apply to object-oriented data warehouses. In this paper, therefore, we propose an approach to incremental maintenance of object-oriented data warehouses. We focus on two primary issues specifically. First, we identify six categories of potential updates to an object-oriented view and propose an algorithm to find potential updates from the definition of the view. Second, we propose an incremental view maintenance algorithm for maintaining object-oriented data warehouses. We have implemented a prototype system for incremental maintenance of object-oriented data warehouses. Performance evaluation has been conducted, which indicates that our approach is correct and efficient.
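The general principle of incremental maintenance, applying source changes to a stored view instead of recomputing it, can be sketched as follows. This is a relational-style SUM/COUNT example written for illustration only; it does not reproduce the paper's object-oriented algorithm or its six update categories, and `apply_delta` and the view layout are hypothetical.

```python
def apply_delta(view, inserted=(), deleted=()):
    """Update a materialized SUM/COUNT view in place from source deltas.

    `view` maps group -> [sum, count]; each delta row is (group, value).
    """
    for group, value in inserted:
        entry = view.setdefault(group, [0, 0])
        entry[0] += value
        entry[1] += 1
    for group, value in deleted:
        entry = view[group]
        entry[0] -= value
        entry[1] -= 1
        if entry[1] == 0:  # no source rows are left for this group
            del view[group]
    return view


# Hypothetical view state, then one batch of source changes.
view = {"north": [30, 2], "south": [5, 1]}
apply_delta(view, inserted=[("north", 10), ("east", 4)], deleted=[("south", 5)])
```

Keeping the row count next to the sum is what lets a deletion remove a group once its last source row disappears.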

4.

Reconciling requirement-driven data warehouses with data sources via multidimensional normal forms

Jose-Norberto Juan Jens 《Data & Knowledge Engineering》2007,63(3):725-751

Successful data warehouse (DW) design needs to be based upon a requirement analysis phase in order to adequately represent the information needs of DW users. Moreover, since the DW integrates the information provided by data sources, it is also crucial to take these sources into account throughout the development process to obtain a consistent reconciliation of data sources and information needs. In this paper, we start by summarizing our approach to specify user requirements for data warehouses and to obtain a conceptual multidimensional model capturing these requirements. Then, we make use of the multidimensional normal forms to define a set of Query/View/Transformation (QVT) relations to assure that the conceptual multidimensional model obtained from user requirements agrees with the available data sources that will populate the DW. Thus, we propose a hybrid approach to develop DWs, i.e., we first obtain the conceptual multidimensional model of the DW from user requirements and then we verify and enforce its correctness against data sources by using a set of QVT relations based on multidimensional normal forms. Finally, we provide some snapshots of the CASE tool we have used to implement our QVT relations.

5.

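Fuzzy spatiotemporal data modeling based on UML class diagrams

Chen Xu, Yan Li, Ma Zongmin, Li Weijun 《Application Research of Computers (计算机应用研究)》2019,36(2)

Fuzziness is widespread in spatiotemporal applications, yet existing spatiotemporal data models lack the ability to describe and express the internal mechanisms and semantic relationships of fuzzy spatiotemporal objects. By studying the semantics of fuzzy spatiotemporal data, this paper gives a formal definition of a fuzzy spatiotemporal data model. On this basis, it extends the UML class diagram to propose a fuzzy spatiotemporal UML data model, and uses an example to illustrate the usability of the proposed model.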

6.

Adding semantic modules to improve goal-oriented analysis of data warehouses using I-star

《Journal of Systems and Software》2014

The success rate of data warehouse (DW) development is improved by performing a requirements elicitation stage in which the users’ needs are modeled. Currently, among the different proposals for modeling requirements, there is a special focus on goal-oriented models, and in particular on the i* framework. In order to adapt this framework for DW development, we previously developed a UML profile for DWs. However, like the general i* framework, that proposal lacks modularity. This has an especially negative impact on DW development, since DW requirement models tend to include a huge number of elements with crossed relationships between them. In turn, the readability of the models is decreased, harming their utility and increasing the error rate and development time. In this paper, we propose an extension of our i* profile for DWs considering the modularization of goals. We provide a set of guidelines in order to correctly apply our proposal. Furthermore, we have performed an experiment in order to assess the validity of our proposal. The benefits of our proposal are an increase in the modularity and scalability of the models which, in turn, increases the error correction capability and makes complex models easier to understand for DW developers and non-expert users.

7.

Handling slowly changing dimensions in data warehouses

《Journal of Systems and Software》2014

Analysis of historical data in data warehouses contributes significantly toward future decision-making. A number of design factors, including slowly changing dimensions (SCDs), affect the quality of such analysis. In SCDs, attribute values may change over time and must be tracked. They should maintain consistency and correctness of data, and show good query performance. We identify that SCDs can have three types of validity periods: disjoint, overlapping, and same validity periods. We then show that the third type cannot be handled through the temporal star schema for temporal data warehouses (TDWs). We further show that a hybrid/Type6 scheme and temporal star schema may be used to handle this shortcoming. We demonstrate that the use of a surrogate key in the hybrid scheme efficiently identifies data, avoids most time comparisons, and improves query performance. Finally, we compare the TDWs and a surrogate key-based temporal data warehouse (SKTDW) using query formulation, query performance, and data warehouse size as parameters. The results of our experiments for 23 queries of five different types show that SKTDW outperforms TDW for all types of queries, with average and maximum performance improvements of 165% and 1071%, respectively. The results of our experiments are statistically significant.
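The claim that a surrogate key avoids most time comparisons can be illustrated with a small Python sketch. It is an assumption-laden toy, not the SKTDW schema from the paper: the table layout, column names, and helper functions are hypothetical.

```python
from datetime import date

# Each version of a dimension member gets its own surrogate key (sk);
# the natural key (customer_id) repeats across versions.
customer_dim = [
    {"sk": 1, "customer_id": "C1", "city": "Oslo",
     "valid_from": date(2010, 1, 1), "valid_to": date(2012, 6, 30)},
    {"sk": 2, "customer_id": "C1", "city": "Bergen",
     "valid_from": date(2012, 7, 1), "valid_to": date(9999, 12, 31)},
]


def surrogate_key_for(dim, customer_id, on_day):
    """Resolve the version valid on `on_day`; done once, at load time."""
    for row in dim:
        if (row["customer_id"] == customer_id
                and row["valid_from"] <= on_day <= row["valid_to"]):
            return row["sk"]
    raise KeyError((customer_id, on_day))


# Facts store the surrogate key that was resolved when they were loaded.
sales = [
    {"customer_sk": surrogate_key_for(customer_dim, "C1", date(2011, 5, 2)),
     "amount": 100},
    {"customer_sk": surrogate_key_for(customer_dim, "C1", date(2013, 2, 9)),
     "amount": 40},
]


def sales_by_city(facts, dim):
    """Join on the surrogate key only; no date comparison at query time."""
    city_of = {row["sk"]: row["city"] for row in dim}
    totals = {}
    for fact in facts:
        city = city_of[fact["customer_sk"]]
        totals[city] = totals.get(city, 0) + fact["amount"]
    return totals


totals = sales_by_city(sales, customer_dim)
```

The validity-period check runs once per fact at load time, so the query-time join reduces to an equality lookup on `sk`.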

8.

Two approaches to the integration of heterogeneous data warehouses

Riccardo Torlone 《Distributed and Parallel Databases》2008,23(1):69-97

In this paper we address the problem of integrating independent and possibly heterogeneous data warehouses, a problem that has received little attention so far, but that arises very often in practice. We start by tackling the basic issue of matching heterogeneous dimensions and provide a number of general properties that a dimension matching should fulfill. We then propose two different approaches to the problem of integration that try to enforce matchings satisfying these properties. The first approach refers to a scenario of loosely coupled integration, in which we just need to identify the common information between data sources and perform join operations over the original sources. The goal of the second approach is the derivation of a materialized view built by merging the sources, and refers to a scenario of tightly coupled integration in which queries are performed against the view. We also illustrate the architecture and functionality of a practical system that we have developed to demonstrate the effectiveness of our integration strategies. A preliminary version of this paper appeared, under the title “Integrating Heterogeneous Multidimensional Databases” [9], in 17th Int. Conference on Scientific and Statistical Database Management, 2005.

9.

Data mining-based materialized view and index selection in data warehouses

Kamel Aouiche Jérôme Darmont 《Journal of Intelligent Information Systems》2009,33(1):65-93

Materialized views and indexes are physical structures for accelerating data access that are commonly used in data warehouses. However, these data structures generate some maintenance overhead. They also share the same storage space. Most existing studies about materialized view and index selection consider these structures separately. In this paper, we adopt the opposite stance and couple materialized view and index selection to take view–index interactions into account and achieve efficient storage space sharing. Candidate materialized views and indexes are selected through a data mining process. We also exploit cost models that evaluate the respective benefit of indexing and view materialization, and help select a relevant configuration of indexes and materialized views among the candidates. Experimental results show that our strategy performs better than an independent selection of materialized views and indexes.

10.

Modelling and querying geographical data warehouses

Joel da Silva Anjolina G. de Oliveira Robson N. Fidalgo Ana Carolina Salgado Valéria C. Times 《Information Systems》2010

A number of proposals for integrating geographical (Geographical Information Systems, GIS) and multidimensional (data warehouse, DW, and online analytical processing, OLAP) processing are found in the database literature. However, most of the current approaches do not take into account the use of a GDW (geographical data warehouse) metamodel or query language to make available the simultaneous specification of multidimensional and spatial operators. To address this, this paper discusses the UML class diagram of a GDW metamodel and proposes its formal specifications. We then present a formal metamodel for a geographical data cube and propose the Geographical Multidimensional Query Language (GeoMDQL) as well. GeoMDQL is based on well-known standards such as the MultiDimensional eXpressions (MDX) language and OGC simple features specification for SQL and has been specifically defined for spatial OLAP environments based on a GDW. We also present the GeoMDQL syntax and a discussion regarding the taxonomy of GeoMDQL query types. Additionally, aspects related to the GeoMDQL architecture implementation are described, along with a case study involving the Brazilian public healthcare system in order to illustrate the proposed query language.

11.

Multidimensional data modeling for location-based services (cited 5 times: 0 self-citations, 5 by others)

Christian S. Jensen Augustas Kligys Torben Bach Pedersen Igor Timko 《The VLDB Journal The International Journal on Very Large Data Bases》2004,13(1):1-21

With the recent and continuing advances in areas such as wireless communications and positioning technologies, mobile, location-based services are becoming possible. Such services deliver location-dependent content to their users. More specifically, these services may capture the movements and requests of their users in multidimensional databases, i.e., data warehouses, and content delivery may be based on the results of complex queries on these data warehouses. Such queries aggregate detailed data in order to find useful patterns, e.g., in the interaction of a particular user with the services. The application of multidimensional technology in this context poses a range of new challenges. The specific challenge addressed here concerns the provision of an appropriate multidimensional data model. In particular, the paper extends an existing multidimensional data model and algebraic query language to accommodate spatial values that exhibit partial containment relationships instead of the total containment relationships normally assumed in multidimensional data models. Partial containment introduces imprecision in aggregation paths. The paper proposes a method for evaluating the imprecision of such paths. The paper also offers transformations of dimension hierarchies with partial containment relationships to simple hierarchies, to which existing precomputation techniques are applicable.

Received: 28 September 2002, Accepted: 5 April 2003, Published online: 12 August 2003. Edited by: J. Veijalainen. Correspondence to: I. Timko

12.

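Object-relational model transformation in an extended UML modeling environment

Zhang Lin, Chen Caoyu 《Microcomputer Development (微机发展)》2005,15(2):38-40

Many applications are built with an object-oriented structure and need to store and retrieve data in a persistent store, namely a relational database. There is an impedance mismatch between object-oriented technology and relational databases, and using UML to transform between the two models can reduce it. UML's support for this transformation has shortcomings, however. The paper proposes introducing an extended object diagram into the UML modeling environment to describe the object-relational model transformation, which simplifies the transformation steps and allows a higher degree of automation.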

13.

Extending OCL for OLAP querying on conceptual multidimensional models of data warehouses

Jesús Pardillo Jose-Norberto Mazón 《Information Sciences》2010,180(5):584-5028

The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure the data and make analysis easier for decision makers. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries which the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying from the very beginning of the development whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback consisting of an extension to the object constraint language (OCL), which has been developed to include a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as a part of the specification of the data warehouse conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented in the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.

14.

Cryptographic techniques of strategic data splitting and secure information management

《Pervasive and Mobile Computing》2016

This publication presents techniques for classifying strategic information, namely financial figures which make it possible to determine the standing of an enterprise or an organisation. These techniques of classifying (hiding) strategic information are presented based on their application to problems of securely storing data of special significance, i.e. cryptographic information sharing protocols. The innovation lies in the use of cryptographic information sharing protocols in cognitive systems for data analysis. This class of systems is discussed based on systems for the semantic analysis of ratio data used to analyse liquidity indicators.

15.

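Developing an enterprise electronic settlement system using UML (cited 1 time: 0 self-citations, 1 by others)

Chen Jun 《Computer and Modernization (计算机与现代化)》2004,(5):38-40

Applying UML to the construction of an electronic settlement information system, through static and dynamic system analysis and a description of the system from multiple perspectives, can speed up development and make it easier to integrate existing information system resources.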

16.

Secure computation with horizontally partitioned data using adaptive regression splines

Joyee Ghosh Alan F. Karr 《Computational statistics & data analysis》2007,51(12):5813-5820

When several data owners possess data on different records but the same variables, known as horizontally partitioned data, the owners can improve statistical inferences by sharing their data with each other. Often, however, the owners are unwilling or unable to share because the data are confidential or proprietary. Secure computation protocols enable the owners to compute parameter estimates for some statistical models, including linear regressions, without sharing individual records’ data. A drawback to these techniques is that the model must be specified in advance of initiating the protocol, and the usual exploratory strategies for determining good-fitting models have limited usefulness since the individual records are not shared. In this paper, we present a protocol for secure adaptive regression splines that allows for flexible, semi-automatic regression modeling. This reduces the risk of model mis-specification inherent in secure computation settings. We illustrate the protocol with air pollution data.
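A linear regression can be fitted without exchanging individual records because the least-squares estimates depend only on sums that each owner can compute locally. The Python sketch below shows this for a single predictor. It is an illustration under stated assumptions: the plain addition in `pooled_fit` stands in for a secure summation protocol, which is not implemented here, and the example covers ordinary linear regression rather than the adaptive regression splines protocol of the paper. All names and data are hypothetical.

```python
def local_summary(records):
    """Sufficient statistics for y = a + b*x, computed by one data owner."""
    n = len(records)
    sx = sum(x for x, _ in records)
    sy = sum(y for _, y in records)
    sxx = sum(x * x for x, _ in records)
    sxy = sum(x * y for x, y in records)
    return (n, sx, sy, sxx, sxy)


def pooled_fit(summaries):
    """Least-squares fit from the element-wise sum of the owners' summaries.

    In a real deployment this sum would be computed by a secure summation
    protocol so that no owner sees another owner's summary.
    """
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*summaries))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Two owners hold different records on the same variables (x, y).
owner_a = [(0.0, 1.0), (1.0, 3.0)]
owner_b = [(2.0, 5.0), (3.0, 7.0)]
intercept, slope = pooled_fit([local_summary(owner_a), local_summary(owner_b)])
```

Only the five aggregate numbers per owner leave each site, and the pooled fit equals the one obtained from the combined records.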

17.

Converting XML DTDs to UML diagrams for conceptual data integration (cited 2 times: 0 self-citations, 2 by others)

Mikael R. Jensen Thomas H. Møller Torben Bach Pedersen 《Data & Knowledge Engineering》2003,44(3):313-346

Extensible Markup Language (XML) is fast becoming the new standard for data representation and exchange on the World Wide Web, e.g., in B2B e-commerce. Modern enterprises need to combine data from many sources in order to answer important business questions, creating a need for integration of web-based XML data. Previous web-based data integration efforts have focused almost exclusively on the logical level of data models, creating a need for techniques that focus on the conceptual level in order to communicate the structure and properties of the available data to users at a higher level of abstraction. The most widely used conceptual model at the moment is the Unified Modeling Language (UML).

This paper presents algorithms for automatically constructing UML diagrams from XML DTDs, enabling fast and easy graphical browsing of XML data sources on the web. The algorithms capture important semantic properties of the XML data such as precise cardinalities and aggregation (containment) relationships between the data elements. As a motivating application, it is shown how the generated diagrams can be used for the conceptual design of data warehouses based on web data, and an integration architecture is presented. The choice of data warehouses and On-Line Analytical Processing as the motivating application is another distinguishing feature of the presented approach.

18.

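Extending UML for aspect-oriented modeling (cited 3 times: 0 self-citations, 3 by others)

Zeng Lu, Zhang Lichen 《Microcomputer Development (微机发展)》2004,14(12):106-107,110

Aspect-oriented programming (AOP) lets users modularize and compose crosscutting concerns, in order to maximize code reuse and resolve code tangling. However, there is currently no suitable modeling language for AOP. The paper discusses an approach that extends UML for aspect-oriented modeling and illustrates its application with an example.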

19.

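Designing a UML specification scanner with BISON++

Zhang Baowei, Zhang Yikun, Zhao Ming 《Computer Applications (计算机应用)》2004,24(1):123-125

In software development, the various analysis and design specification documents are an important basis for software testing. The paper uses BISON++ to design a UML text scanner that automatically extracts analysis and design information useful for software testing from UML documents, improving the efficiency of test analysis and design.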

20.

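UML modeling of a power grid analyzer monitoring system

Xiong Hua, Liu Fengxin, Huo Liang 《Computer Engineering and Design (计算机工程与设计)》2004,25(9):1540-1542

Using a power grid analyzer monitoring system as the setting, the paper examines a UML-based object-oriented system modeling method, compares it with traditional system modeling methods, and shows the advantages of UML in capturing functional requirements and improving system visibility and maintainability.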