Data Warehouses (DWs), Multidimensional (MD) Databases, and On-Line Analytical Processing (OLAP) applications are used as a very powerful mechanism for discovering crucial business information. Considering the extreme importance of the information managed by these kinds of applications, it is essential to specify security measures from the early stages of DW design, in the MD modeling process, and to enforce them. In recent years, several proposals for representing the main MD modeling properties at the conceptual level have been put forward. Nevertheless, none of these proposals considers security as an important element of its model, so they do not allow us to specify confidentiality constraints to be enforced by the applications that will use these MD models. In this paper, we discuss the specific confidentiality problems of DWs and present an extension of the Unified Modeling Language for specifying security constraints in conceptual MD modeling, thereby allowing us to design secure DWs. One key advantage of our approach is that we accomplish the conceptual modeling of secure DWs independently of the target platform where the DW has to be implemented, allowing the implementation of the corresponding DWs on any secure commercial database management system. Finally, we present a case study to show how a conceptual model designed with our approach can be directly implemented on top of Oracle 10g.

Successful data warehouse (DW) design needs to be based upon a requirement analysis phase in order to adequately represent the information needs of DW users. Moreover, since the DW integrates the information provided by data sources, it is also crucial to take these sources into account throughout the development process to obtain a consistent reconciliation of data sources and information needs. In this paper, we start by summarizing our approach to specifying user requirements for data warehouses and to obtaining a conceptual multidimensional model capturing these requirements. Then, we make use of the multidimensional normal forms to define a set of Query/View/Transformation (QVT) relations to ensure that the conceptual multidimensional model obtained from user requirements agrees with the available data sources that will populate the DW. Thus, we propose a hybrid approach to developing DWs: we first obtain the conceptual multidimensional model of the DW from user requirements, and then we verify and enforce its correctness against data sources by using a set of QVT relations based on multidimensional normal forms. Finally, we provide some snapshots of the CASE tool we have used to implement our QVT relations.

The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure data, which makes data analysis easier for decision makers. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries which the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying from the very beginning of the development whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback consisting of an extension to the object constraint language (OCL), which has been developed to include a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as a part of the specification of the data warehouse conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented in the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.

Many data warehouses contain massive amounts of data, accumulated over long periods of time. In some cases, it is necessary or desirable to either delete “old” data or to maintain the data at an aggregate level. This may be due to privacy concerns, in which case the data are aggregated to levels that ensure anonymity. Another reason is the desire to maintain a balance between the uses of data that change as the data age and the size of the data, thus avoiding overly large data warehouses. This paper presents effective techniques for data reduction that enable the gradual aggregation of detailed data as the data age. With these techniques, data may be aggregated to higher levels as they age, enabling the maintenance of more compact, consolidated data and compliance with privacy requirements. Special care is taken to avoid semantic problems in the aggregation process. The paper also describes the querying of the resulting data warehouses and an implementation strategy based on current database technology.
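
The idea of aggregating detailed data as they age can be illustrated with a minimal sketch. It is not the technique from the paper: the fact layout `(day, product, amount)`, the function name `reduce_by_age`, and the 30-day threshold are all assumptions made for illustration. Facts newer than the threshold stay at day level, and older facts are rolled up to month level.

```python
from collections import defaultdict
from datetime import date


def reduce_by_age(facts, today, detail_days=30):
    """Keep recent facts at day level; roll older facts up to month level.

    `facts` is an iterable of (day, product, amount) tuples (hypothetical layout).
    """
    recent = []
    monthly = defaultdict(float)
    for day, product, amount in facts:
        if (today - day).days <= detail_days:
            recent.append((day, product, amount))  # still detailed
        else:
            # Older than the threshold: only the monthly total survives.
            monthly[(day.year, day.month, product)] += amount
    return recent, dict(monthly)


facts = [
    (date(2024, 1, 3), "tea", 10.0),
    (date(2024, 1, 20), "tea", 5.0),
    (date(2024, 3, 28), "tea", 7.0),
]
recent, monthly = reduce_by_age(facts, today=date(2024, 4, 1))
```

Here the two January facts collapse into one monthly row, while the late-March fact keeps its day-level detail.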

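To address the problem that system-level concerns are neglected when modeling during software development, an aspect-oriented UML modeling approach is proposed. Aspect-oriented technology and UML are introduced comprehensively, covering their main constituents and parameters, and the UML profile is extended on the basis of the AspectJ language to realize aspect-oriented UML modeling (AUML). The extension is an addition to the UML framework: it combines UML's object-oriented features with a semantic and structural specification of aspect-oriented crosscutting concerns. Finally, a library management system is used as an example, and the parameters of the aspect-oriented software development (AOSD) profile are summarized.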

Incremental maintenance of data warehouses has attracted a lot of research attention over the past few years. Nevertheless, most of the previous work is confined to the relational setting. Recently, object-oriented data warehouses have been regarded as a better means to integrate data from modern heterogeneous data sources. However, existing approaches to incremental maintenance of data warehouses do not directly apply to object-oriented data warehouses. In this paper, therefore, we propose an approach to incremental maintenance of object-oriented data warehouses. We focus on two primary issues specifically. First, we identify six categories of potential updates to an object-oriented view and propose an algorithm to find potential updates from the definition of the view. Second, we propose an incremental view maintenance algorithm for maintaining object-oriented data warehouses. We have implemented a prototype system for incremental maintenance of object-oriented data warehouses. Performance evaluation has been conducted, which indicates that our approach is correct and efficient.
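
As a generic illustration of incremental view maintenance (not the object-oriented algorithm proposed in the paper), the sketch below keeps a materialized `SUM ... GROUP BY` view up to date from batches of inserted and deleted source rows instead of recomputing it. The class name `SumView` and the `(key, amount)` row shape are hypothetical.

```python
from collections import defaultdict


class SumView:
    """Materialized SUM(amount) GROUP BY key, maintained from deltas."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)  # rows per group, to detect empty groups

    def apply(self, inserted=(), deleted=()):
        """Fold a batch of source changes into the view without a full recompute."""
        for key, amount in inserted:
            self.total[key] += amount
            self.count[key] += 1
        for key, amount in deleted:
            self.total[key] -= amount
            self.count[key] -= 1
            if self.count[key] == 0:  # the group no longer has any source rows
                del self.total[key]
                del self.count[key]

    def rows(self):
        return dict(self.total)


view = SumView()
view.apply(inserted=[("north", 10.0), ("north", 5.0), ("south", 2.0)])
view.apply(inserted=[("south", 3.0)], deleted=[("north", 10.0)])
```

Tracking a row count per group is what lets the view drop a group once its last source row is deleted; a sum alone cannot distinguish an empty group from one that totals zero.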

This paper describes an approach for real-time modelling in UML, focusing on analysis and verification of time and scheduling-related properties. To this aim, a concrete UML profile, called the OMEGA-RT profile, is defined, dedicated to real-time modelling by identifying a set of relevant concepts for real-time modelling which can be considered as a refinement of the standard SPT profile. The profile is based on a rich concept of event representing an instant of state change, and allows the expression of duration constraints between occurrences of events. These constraints can be provided in the form of OCL-like expressions annotating the specification or by means of state machines, stereotyped as ‘observers’. A framework for modelling scheduling issues is obtained by adding a notion of resource and a notion of execution time. To prove the relevance of these choices, the profile has been implemented in a validation tool and applied to case studies. It has a formal semantics and is sufficiently general and expressive to define a semantic underpinning for other real-time profiles of UML, which in general define more restricted frameworks. In particular, most existing profiles handling real-time issues define a number of predefined attributes representing particular durations or constraints on them, and their semantic interpretation can be expressed in the OMEGA-RT profile. This work has been partially supported by the IST-2002-33522 OMEGA project. VERIMAG is an academic research laboratory associated with CNRS, Université Joseph Fourier and Institut National Polytechnique de Grenoble.

Multidimensional data modeling for location-based services (total citations: 5; self-citations: 0; citations by others: 5)
With the recent and continuing advances in areas such as wireless communications and positioning technologies, mobile, location-based services are becoming possible. Such services deliver location-dependent content to their users. More specifically, these services may capture the movements and requests of their users in multidimensional databases, i.e., data warehouses, and content delivery may be based on the results of complex queries on these data warehouses. Such queries aggregate detailed data in order to find useful patterns, e.g., in the interaction of a particular user with the services. The application of multidimensional technology in this context poses a range of new challenges. The specific challenge addressed here concerns the provision of an appropriate multidimensional data model. In particular, the paper extends an existing multidimensional data model and algebraic query language to accommodate spatial values that exhibit partial containment relationships instead of the total containment relationships normally assumed in multidimensional data models. Partial containment introduces imprecision in aggregation paths. The paper proposes a method for evaluating the imprecision of such paths. The paper also offers transformations of dimension hierarchies with partial containment relationships to simple hierarchies, to which existing precomputation techniques are applicable. Received: 28 September 2002. Accepted: 5 April 2003. Published online: 12 August 2003. Edited by: J. Veijalainen. Correspondence to: I. Timko.
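
To make the effect of partial containment on aggregation concrete, here is a small sketch. It is an assumption-laden illustration, not the evaluation method from the paper: each child value is assigned a containment degree in a parent (1.0 means totally contained), and the aggregate for the parent is reported as a lower and an upper bound. The names `bounded_sum`, `road_a`, and `district_1` are hypothetical.

```python
def bounded_sum(measures, containment, parent):
    """Return (lower, upper) bounds on the sum of child measures for `parent`.

    containment[(child, parent)] is a degree in [0, 1]; 1.0 means total containment.
    """
    lower = upper = 0.0
    for child, value in measures.items():
        degree = containment.get((child, parent), 0.0)
        if degree >= 1.0:
            lower += value  # certainly belongs to the parent
        if degree > 0.0:
            upper += value  # possibly belongs to the parent
    return lower, upper


measures = {"road_a": 4.0, "road_b": 6.0, "road_c": 1.0}
containment = {
    ("road_a", "district_1"): 1.0,  # totally contained
    ("road_b", "district_1"): 0.5,  # crosses the district border
    ("road_c", "district_2"): 1.0,
}
lower, upper = bounded_sum(measures, containment, "district_1")
```

The gap between the two bounds is one simple way to quantify how imprecise an aggregation path becomes once partially contained values are involved.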

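Fuzziness is widespread in spatiotemporal applications, yet existing spatiotemporal data models lack the ability to describe and express the internal mechanisms and semantic relationships of fuzzy spatiotemporal objects. By studying the semantics of fuzzy spatiotemporal data, this paper gives a formal definition of a fuzzy spatiotemporal data model, extends the UML class diagram on that basis to propose a fuzzy spatiotemporal UML data model, and uses an example to illustrate the usability of the proposed model.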

The success rate of data warehouse (DW) development is improved by performing a requirements elicitation stage in which the users’ needs are modeled. Currently, among the different proposals for modeling requirements, there is a special focus on goal-oriented models, and in particular on the i* framework. In order to adapt this framework for DW development, we previously developed a UML profile for DWs. However, like the general i* framework, the proposal lacks modularity. This has an especially negative impact on DW development, since DW requirement models tend to include a huge number of elements with crossed relationships between them. In turn, the readability of the models is decreased, harming their utility and increasing the error rate and development time. In this paper, we propose an extension of our i* profile for DWs considering the modularization of goals. We provide a set of guidelines in order to correctly apply our proposal. Furthermore, we have performed an experiment in order to assess the validity of our proposal. The benefits of our proposal are an increase in the modularity and scalability of the models which, in turn, increases the error correction capability, and makes complex models easier to understand for DW developers and non-expert users.

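Application of extended UML activity diagrams in workflow modeling (total citations: 1; self-citations: 1; citations by others: 0)
To address UML's inadequate description of data and information flow, which prevents it from fully expressing business workflows, a workflow process modeling method based on extended UML activity diagrams is proposed, using newly created activity diagrams. An application example shows that the extended UML activity diagram expresses workflow semantics more richly and describes more precisely what workflow modeling needs to express, thereby meeting the requirements of workflow process modeling.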

In this paper we address the problem of integrating independent and possibly heterogeneous data warehouses, a problem that has received little attention so far, but that arises very often in practice. We start by tackling the basic issue of matching heterogeneous dimensions and provide a number of general properties that a dimension matching should fulfill. We then propose two different approaches to the problem of integration that try to enforce matchings satisfying these properties. The first approach refers to a scenario of loosely coupled integration, in which we just need to identify the common information between data sources and perform join operations over the original sources. The goal of the second approach is the derivation of a materialized view built by merging the sources, and refers to a scenario of tightly coupled integration in which queries are performed against the view. We also illustrate the architecture and functionality of a practical system that we have developed to demonstrate the effectiveness of our integration strategies. A preliminary version of this paper appeared, under the title “Integrating Heterogeneous Multidimensional Databases” [9], in 17th Int. Conference on Scientific and Statistical Database Management, 2005.

Materialized views and indexes are physical structures for accelerating data access that are commonly used in data warehouses. However, these data structures generate some maintenance overhead. They also share the same storage space. Most existing studies about materialized view and index selection consider these structures separately. In this paper, we adopt the opposite stance and couple materialized view and index selection to take view–index interactions into account and achieve efficient storage space sharing. Candidate materialized views and indexes are selected through a data mining process. We also exploit cost models that evaluate the respective benefit of indexing and view materialization, and help select a relevant configuration of indexes and materialized views among the candidates. Experimental results show that our strategy performs better than an independent selection of materialized views and indexes.

Multidimensional data are exploited in many application areas such as scientific data analysis, business intelligence, and geographic information systems. One of the most frequent operations applied to such multidimensional data is the selection of a subspace of the given multidimensional space, which involves predicate evaluation on multiple dimensions. Existing main-memory data layouts optimized for evaluating predicates on the columnar data can be used to accelerate the subspace extraction by sequentially performing filter scans on each dimension one at a time. However, optimization opportunities emerge if we can consider all predicates together. In this paper, we propose DimensionSlice, a new main-memory data layout optimized for evaluating predicates on multiple dimensions. More specifically, the dimension values are sliced into portions and the portions with the same order of each dimension are arranged together. Multiple predicates are simultaneously evaluated with the sliced dimension values during the scan. In addition, by storing the different portions separately, unnecessary loads and computations of lower portions can be eliminated if the evaluation results are assured after examining the upper portions. For further acceleration of scans, the DimensionSlice layout is designed to easily leverage the SIMD capabilities that most mainstream processors are equipped with. Through experiments, we demonstrate the performance gains of the proposed method over the columnar main-memory layout that evaluates the partial predicates one dimension at a time. We also show that the proposed method outperforms the state-of-the-art multidimensional index structure when the selectivity is over a very low threshold.
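
The early-termination idea in this abstract can be sketched in a few lines. This is a scalar, row-at-a-time illustration and not the DimensionSlice layout itself, which also groups portions across dimensions and uses SIMD. It assumes 8-bit values split into a 4-bit upper and a 4-bit lower portion, and the names `build_layout` and `scan_less_than` are hypothetical. A conjunction of `value[d] < bounds[d]` predicates is evaluated on the upper portions first, and a lower portion is read only when the upper portion ties with the bound.

```python
LOW_BITS = 4
LOW_MASK = (1 << LOW_BITS) - 1


def build_layout(rows):
    """Store the upper and lower portions of every dimension value separately."""
    upper = [[v >> LOW_BITS for v in row] for row in rows]
    lower = [[v & LOW_MASK for v in row] for row in rows]
    return upper, lower


def scan_less_than(upper, lower, bounds):
    """Evaluate value[d] < bounds[d] for all d; count how often a lower portion is read."""
    matches, lower_loads = [], 0
    for rid, hi_row in enumerate(upper):
        ok = True
        for d, bound in enumerate(bounds):
            b_hi, b_lo = bound >> LOW_BITS, bound & LOW_MASK
            if hi_row[d] < b_hi:
                continue  # satisfied from the upper portion alone
            if hi_row[d] > b_hi:
                ok = False  # rejected from the upper portion alone
                break
            lower_loads += 1  # tie: the lower portion decides
            if lower[rid][d] >= b_lo:
                ok = False
                break
        if ok:
            matches.append(rid)
    return matches, lower_loads


rows = [(0x12, 0x80), (0x35, 0x20), (0x31, 0x4F), (0x90, 0x10)]
upper, lower = build_layout(rows)
matches, lower_loads = scan_less_than(upper, lower, bounds=(0x34, 0x50))
```

In this run only two of the eight stored lower portions are ever read, because most comparisons are already decided by the upper portion.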

Time-series analysis is a powerful technique to discover patterns and trends in temporal data. However, the lack of a conceptual model for this data-mining technique forces analysts to deal with unstructured data. These data are represented at a low level of abstraction and their management is expensive. Most analysts face two main problems: (i) the cleansing of the huge amount of potentially analysable data and (ii) the correct definition of the data-mining algorithms to be employed. Owing to the fact that analysts’ interests are also hidden in this scenario, it is not only difficult to prepare data, but also to discover which data are the most promising. Since their appearance, data warehouses have, therefore, proved to be a powerful repository of historical data for data-mining purposes. Moreover, their foundational modelling paradigm, namely multidimensional modelling, is very similar to the problem domain. In this article, we propose a unified modelling language (UML) extension through UML profiles for data-mining. Specifically, the UML profile presented allows us to specify time-series analysis on top of the multidimensional models of data warehouses. Our extension provides analysts with an intuitive notation for time-series analysis which is independent of any specific data-mining tool or algorithm. In order to show its feasibility and ease of use, we apply it to the analysis of fish-captures in Alicante. We believe that a coherent conceptual modelling framework for data-mining assures a better and easier knowledge-discovery process on top of data warehouses.

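Wang Zhen, Jiang Zheyuan. Journal of Computer Applications (《计算机应用》), 2017, 37(7): 2027-2033
To address the low openness, low extensibility, and high cost of traditional enterprise resource planning (ERP) systems in the current business environment, a modeling method for ERP systems based on the software-as-a-service (SaaS) model is proposed. First, UML's extension mechanism is used to extend the primitives, yielding a new primitive set, the UML profile. Second, an equivalent metamodel is built, and the Object Constraint Language (OCL) is used to keep the semantics unambiguous. Finally, the cloud ERP system is described through a model framework consisting of application diagrams, an operation dictionary, physical diagrams, and topology diagrams, thereby documenting the cloud ERP system. The method focuses on modular design and uses a unified visual metamodel at every stage. Following the modeling requirements, the proposed method was used on an enterprise architecture (EA) platform to build a SaaS-based cloud ERP model, which verifies the effectiveness of the method. Theoretical analysis and the modeling results show that the method ensures interoperability and consistency among models and improves the ERP system's capacity for growth.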

An important aspect in the specification of conceptual schemas is the definition of general constraints that cannot be expressed by the predefined constructs provided by conceptual modeling languages. This is generally achieved by using general-purpose languages like OCL. In this paper we propose a new approach that facilitates the definition of such general constraints in UML. More precisely, we define a profile that extends the set of predefined UML constraints by adding certain types of constraints that are commonly used in conceptual schemas. We also show how our proposal facilitates reasoning about the constraints and their automatic code generation, study the application of our ideas to the specification of two real-life applications, and present a prototype tool implementation.
Ernest Teniente

Analysis of historical data in data warehouses contributes significantly toward future decision-making. A number of design factors, including slowly changing dimensions (SCDs), affect the quality of such analysis. In SCDs, attribute values may change over time and must be tracked. They should maintain consistency and correctness of data, and show good query performance. We identify that SCDs can have three types of validity periods: disjoint, overlapping, and same validity periods. We then show that the third type cannot be handled through the temporal star schema for temporal data warehouses (TDWs). We further show that a hybrid/Type6 scheme and temporal star schema may be used to handle this shortcoming. We demonstrate that the use of a surrogate key in the hybrid scheme efficiently identifies data, avoids most time comparisons, and improves query performance. Finally, we compare the TDWs and a surrogate key-based temporal data warehouse (SKTDW) using query formulation, query performance, and data warehouse size as parameters. The results of our experiments for 23 queries of five different types show that SKTDW outperforms TDW for all types of queries, with average and maximum performance improvements of 165% and 1071%, respectively. The results of our experiments are statistically significant.
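
The role of a surrogate key in a slowly changing dimension can be shown with a simplified sketch. It is a Type 2 style versioning example rather than the full hybrid/Type6 scheme or the SKTDW design evaluated in the paper, and the class `CustomerDimension`, the column names, and the sample values are hypothetical. Each attribute change closes the current dimension row and opens a new one with a fresh surrogate key; fact rows store that key, so a historical query is a plain key lookup with no date-range comparison.

```python
from datetime import date


class CustomerDimension:
    """Versioned dimension: one row, with its own surrogate key, per attribute version."""

    def __init__(self):
        self.rows = []

    def upsert(self, customer_id, city, valid_from):
        """Return the surrogate key of the version valid from `valid_from`."""
        for row in self.rows:
            if row["customer_id"] == customer_id and row["valid_to"] is None:
                if row["city"] == city:
                    return row["sk"]  # nothing changed, reuse the current version
                row["valid_to"] = valid_from  # close the old version
        sk = len(self.rows) + 1  # next surrogate key
        self.rows.append({"sk": sk, "customer_id": customer_id, "city": city,
                          "valid_from": valid_from, "valid_to": None})
        return sk


dim = CustomerDimension()
facts = []  # (surrogate key, amount)
facts.append((dim.upsert("c1", "Lahore", date(2020, 1, 1)), 100.0))
facts.append((dim.upsert("c1", "Karachi", date(2021, 6, 1)), 40.0))

# Historical analysis: join on the surrogate key only, no time comparisons needed.
by_sk = {row["sk"]: row for row in dim.rows}
sales_by_city = {}
for sk, amount in facts:
    city = by_sk[sk]["city"]
    sales_by_city[city] = sales_by_city.get(city, 0.0) + amount
```

Because each fact already points at the dimension version that was valid when it was recorded, the sale made before the move is still attributed to the earlier city.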

Context: Domains where data have a complex structure requiring new approaches for knowledge discovery from data are on the increase. In such domains, the information related to each object under analysis may be composed of a very broad set of interrelated data instead of being represented by a simple attribute table. This further complicates their analysis. Objective: It is becoming more and more necessary to model data before analysis in order to assure that they are properly understood, stored and later processed. On this ground, we have proposed a UML extension that is able to represent any set of structurally complex hierarchically ordered data. Conceptually modelled data are human comprehensible and constitute the starting point for automating other data analysis tasks, such as comparing items or generating reference models. Method: The proposed notation has been applied to structurally complex data from the stabilometry field. Stabilometry is a medical discipline concerned with human balance. We have organized the model data through an implementation based on XML syntax. Results: We have applied data mining techniques to the resulting structured data for knowledge discovery. The sound results of modelling a domain with such complex and wide-ranging data confirm the utility of the approach. Conclusion: The conceptual modelling and the analysis of non-conventional data are important challenges. We have proposed a UML profile that has been tested on data from a medical domain, obtaining very satisfactory results. The notation is useful for understanding domain data and automating knowledge discovery tasks.

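Representation of domain knowledge and UML modeling (total citations: 4; self-citations: 4; citations by others: 0)
A domain model is a graphical representation of domain knowledge and an abstraction of its constituent parts. By discussing five kinds of domain models, this paper presents domain knowledge from different perspectives and at different levels; such knowledge is very useful for understanding the characteristics and behavior of the related systems within a domain. In addition, UML modeling of domain knowledge raises the level of software reuse for systems in the domain, so that the best opportunities for reuse can be obtained during object-oriented software development.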
