Similar Documents
20 similar documents found (search time: 62 ms)
1.
Due to the explosive increase of XML documents, it is imperative to manage XML data in an XML data warehouse. XML warehousing imposes challenges that are not found in relational data warehouses. In this paper, we first present a framework to build an XML data warehouse schema. To ensure scalability as data volume grows, we propose a number of partitioning techniques for multi-version XML data warehouses, including a document-based partitioning, a schema-based partitioning, and a cascaded (mixed) partitioning model. Finally, we formulate cost models to evaluate various types of queries for an XML data warehouse.
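The abstract names three partitioning models but gives no code. A minimal sketch of how such routing might look (the function names, the CRC-based document hashing, and the use of root-element names as schema identifiers are all our assumptions, not the paper's algorithms):

```python
import zlib

def partition_by_document(doc_ids, n_partitions):
    """Document-based partitioning: assign each whole document
    to a partition by a stable hash of its identifier."""
    parts = [[] for _ in range(n_partitions)]
    for doc_id in doc_ids:
        parts[zlib.crc32(doc_id.encode()) % n_partitions].append(doc_id)
    return parts

def partition_by_schema(doc_schemas):
    """Schema-based partitioning: group documents that conform to the
    same schema (identified here by root element name)."""
    parts = {}
    for doc_id, schema in doc_schemas.items():
        parts.setdefault(schema, []).append(doc_id)
    return parts

def cascaded_partition(doc_schemas, n_partitions):
    """Cascaded (mixed) partitioning: schema-based first,
    then document-based within each schema group."""
    return {schema: partition_by_document(ids, n_partitions)
            for schema, ids in partition_by_schema(doc_schemas).items()}
```

The cost models mentioned in the abstract would then price a query against whichever partitions it has to touch.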

2.
Successful data warehouse (DW) design needs to be based upon a requirement analysis phase in order to adequately represent the information needs of DW users. Moreover, since the DW integrates the information provided by data sources, it is also crucial to take these sources into account throughout the development process to obtain a consistent reconciliation of data sources and information needs. In this paper, we start by summarizing our approach to specify user requirements for data warehouses and to obtain a conceptual multidimensional model capturing these requirements. Then, we make use of the multidimensional normal forms to define a set of Query/View/Transformation (QVT) relations to ensure that the conceptual multidimensional model obtained from user requirements agrees with the available data sources that will populate the DW. Thus, we propose a hybrid approach to develop DWs, i.e., we first obtain the conceptual multidimensional model of the DW from user requirements and then verify and enforce its correctness against data sources by using a set of QVT relations based on multidimensional normal forms. Finally, we provide some snapshots of the CASE tool we have used to implement our QVT relations.
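As a toy illustration of the kind of source-consistency check the QVT relations automate (the function below and its name-level matching are our simplification; the paper's actual relations are grounded in multidimensional normal forms, which this does not implement):

```python
def unmatched_elements(conceptual_elements, source_columns):
    """Return the conceptual multidimensional elements (measures,
    dimension attributes) that no data-source column can populate."""
    available = {c.lower() for c in source_columns}
    return [e for e in conceptual_elements if e.lower() not in available]
```

A real check would also verify functional dependencies between dimension levels, not just name coverage.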

4.
Developing a data warehouse is an ongoing task where new requirements are constantly being added. A widely accepted approach for developing data warehouses is the hybrid approach, where requirements and data sources must be accommodated to a reconciled data warehouse model. During this process, relationships between conceptual elements specified by user requirements and those supplied by the data sources are lost, since no traceability mechanisms are included. As a result, the designer must spend additional time and effort updating the data warehouse whenever user requirements or data sources change. In this paper, we propose an approach to preserve traceability at the conceptual level for data warehouses. Our approach includes a set of traces and their formalization, in order to relate the multidimensional elements specified by user requirements with the concepts extracted from data sources. Therefore, we can easily identify how changes should be incorporated into the data warehouse, and derive it according to the new configuration. In order to minimize the effort required, we define a set of general Query/View/Transformation rules to automate the derivation of traces along with data warehouse elements. Finally, we describe a CASE tool that supports our approach and provide a detailed case study to show the applicability of the proposal.
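A minimal sketch of what conceptual-level traces might look like, assuming a simple record that links a requirement element to a source concept via the rule that derived the link (this data model is illustrative only, not the paper's formalization):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trace:
    requirement_element: str   # e.g. a measure or dimension attribute
    source_concept: str        # e.g. a table/column extracted from a source
    rule: str                  # identifier of the QVT rule that derived it

def impacted_elements(traces, changed_source_concept):
    """Find which requirement-level elements a source change touches,
    which is exactly what traceability buys during evolution."""
    return sorted({t.requirement_element for t in traces
                   if t.source_concept == changed_source_concept})
```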

5.
Context: Decision makers query enterprise information stored in Data Warehouses (DW) by using tools, such as On-Line Analytical Processing (OLAP) tools, which use specific views or cubes from the corporate DW or Data Marts, based on multidimensional modeling. Since the information managed is critical, security constraints have to be correctly established in order to avoid unauthorized accesses.
Objective: In previous work we defined a Model-Driven approach for developing a secure DW repository by following a relational approach. Nevertheless, it is also important to define security constraints in the metadata layer that connects the DW repository with the OLAP tools, that is, over the same multidimensional structures that final users manage. This paper defines a proposal to develop secure OLAP applications and incorporates it into our previous approach.
Method: Our proposal is composed of models and transformations. Our models have been defined using the extension capabilities of UML (conceptual model) and by extending the OLAP package of CWM with security (logical model). Transformations have been defined using a graphical notation and implemented in QVT and MOFScript. Finally, this proposal has been evaluated through case studies.
Results: A complete MDA architecture for developing secure OLAP applications. The main contributions of this paper are: improvement of a UML profile for conceptual modeling; definition of a logical metamodel for OLAP applications; and definition and implementation of transformations from conceptual to logical models, and from logical models to the secure implementation in a specific OLAP tool (SSAS).
Conclusion: Our proposal allows us to develop secure OLAP applications, providing a complete MDA architecture composed of several security models and automatic transformations towards the final secure implementation. Security aspects are identified early and fitted into a more robust solution, which provides better information assurance and saves maintenance time.

6.

Data warehouses (DW) are a key component of business intelligence and decision-making. In this paper, we present an approach that combines Grounded Theory and System Dynamics to develop causal loop diagrams/models for data warehouse quality and processes. We used the top 51 data warehousing academic papers to derive concepts and critical success factors. A simple data warehouse quality causal model was developed, together with loop analyses for Data Warehouse Project Initialization, Data Source Availability & Monitoring, and Data Model Quality and DBMS Quality. Visualizing the cause-effect loops and how data warehouse variables are interrelated provides a clear understanding of the DW process. Key findings are that data quality and data model quality matter more than DBMS quality for ensuring data warehouse quality, and that the number of data entry errors and the level of data complexity can be major detriments to DW quality.

7.
Decision support systems help the decision-making process through OLAP (On-Line Analytical Processing) and data warehouses. These systems allow the analysis of corporate data. As OLAP and data warehousing evolve, increasingly complex data is being used. XML (Extensible Markup Language) is a flexible text format for the interchange and representation of complex data. Finding an appropriate model for an XML data warehouse becomes complicated as more and more solutions appear. Hence, in this survey paper we present an overview of the different proposals that use XML within data warehousing technology. These proposals range from using XML data sources for regular warehouses to full XML warehousing solutions. Some works focus purely on document storage facilities, while others present adaptations of XML technology for OLAP. Even though research on the subject is growing, many issues remain unsolved.

8.
Extensible Markup Language (XML) is a common standard for data representation and exchange over the Web. Considering the increasing need for managing data on the Web, integration techniques are required to access heterogeneous XML sources. In this paper, we describe a unification method for heterogeneous XML schemata. The input to the unification method is a set of object-oriented canonical schemata that conceptually abstract the local Document Type Definitions of the involved sources. The unification process applies specific algorithms and rules to the concepts of the canonical schemata to generate a preliminary ontology. Further adjustments on this preliminary ontology generate a reference ontology that acts as a front-end for user queries to the XML sources.

9.
We consider the issue of warehouse evaluation for successful logistics and supply chain management. Suppose a company manages a chain of owned warehouses and now needs to acquire some new, profitable warehouses to add to its operation chain. A key business decision here is how to choose the most profitable warehouses from a number of potential ones. In reality, the challenge is that future profitability is unpredictable, so it is infeasible to rank potential warehouses directly. To address this problem, this paper proposes a new rule-based decision model with the following characteristics: (i) decision information is provided via interval-valued intuitionistic fuzzy values; (ii) multiple experts are involved as a group of decision makers; (iii) both subjective evaluations from experts and objective data on historical profitability are employed; and (iv) both certain and uncertain information are exploited. The core decision mechanism uses the uncertain information of owned warehouses to induce a collection of "if…then…" rules, and subsequently exploits these rules to predict preference orders over all potential warehouses. To this end, we develop and integrate multiple techniques for (a) aggregation of uncertain information; (b) construction of pairwise comparisons; (c) induction of certain and uncertain rules; and (d) exploitation of decision rules. We conclude with a numerical example illustrating the application of the proposed decision mechanism to supply-chain domain problems.
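For characteristic (i) and step (a), a hedged sketch of aggregating interval-valued intuitionistic fuzzy values across experts, using the textbook interval-valued intuitionistic fuzzy weighted averaging (IIFWA) operator; whether the paper uses exactly this operator is an assumption on our part:

```python
def iifwa(values, weights):
    """Aggregate interval-valued intuitionistic fuzzy values.
    values:  list of ([mu_lo, mu_hi], [nu_lo, nu_hi]), one per expert,
             where mu is the membership interval and nu the non-membership.
    weights: expert weights summing to 1.
    Membership bounds aggregate as 1 - prod((1 - mu_i)^w_i),
    non-membership bounds as prod(nu_i^w_i)."""
    p_mu_lo = p_mu_hi = p_nu_lo = p_nu_hi = 1.0
    for ([ml, mh], [nl, nh]), w in zip(values, weights):
        p_mu_lo *= (1 - ml) ** w
        p_mu_hi *= (1 - mh) ** w
        p_nu_lo *= nl ** w
        p_nu_hi *= nh ** w
    return ([1 - p_mu_lo, 1 - p_mu_hi], [p_nu_lo, p_nu_hi])
```

Note that aggregating identical opinions returns the same value, a basic sanity property of the operator.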

10.
Data warehousing technologies have become mature enough to efficiently store and process huge data sets, which has shifted the data warehousing challenge from increasing data processing capacity to enriching data resources in order to provide better decision-making assistance. Some organizations reportedly intend to incorporate Web data into data warehouse systems as a way of enriching data resources, because its near-limitless information makes the Internet the largest external database available to any organization. However, there is no systematic guideline to support such an effort. To fill this void, we introduce Web integration as a strategy for merging data warehouses and the Web, with an emphasis on effectively and efficiently acquiring Web data into data warehouses. We also point out that the critical step for Web integration is to acquire genuinely valuable business data from the Web. A framework for determining the business value of Web data is offered to facilitate Web integration efforts.

11.
Converting XML DTDs to UML diagrams for conceptual data integration
Extensible Markup Language (XML) is fast becoming the new standard for data representation and exchange on the World Wide Web, e.g., in B2B e-commerce. Modern enterprises need to combine data from many sources in order to answer important business questions, creating a need for integration of web-based XML data. Previous web-based data integration efforts have focused almost exclusively on the logical level of data models, creating a need for techniques that focus on the conceptual level in order to communicate the structure and properties of the available data to users at a higher level of abstraction. The most widely used conceptual model at the moment is the Unified Modeling Language (UML).

This paper presents algorithms for automatically constructing UML diagrams from XML DTDs, enabling fast and easy graphical browsing of XML data sources on the Web. The algorithms capture important semantic properties of the XML data such as precise cardinalities and aggregation (containment) relationships between the data elements. As a motivating application, it is shown how the generated diagrams can be used for the conceptual design of data warehouses based on Web data, and an integration architecture is presented. The choice of data warehouses and On-Line Analytical Processing as the motivating application is another distinguishing feature of the presented approach.
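One concrete piece of a DTD-to-UML mapping, the cardinality capture mentioned above, can be sketched as follows (the mapping table and helper are our illustration, not the paper's full algorithm):

```python
# DTD occurrence indicators on a child element map naturally onto
# UML multiplicities on the corresponding association end.
DTD_TO_UML_CARDINALITY = {
    "":  "1..1",   # exactly one
    "?": "0..1",   # optional
    "*": "0..*",   # zero or more
    "+": "1..*",   # one or more
}

def uml_multiplicity(child_spec):
    """Split a DTD child spec such as 'author+' into the element name
    and its UML multiplicity: 'author+' -> ('author', '1..*')."""
    if child_spec and child_spec[-1] in "?*+":
        return child_spec[:-1], DTD_TO_UML_CARDINALITY[child_spec[-1]]
    return child_spec, DTD_TO_UML_CARDINALITY[""]
```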


12.
Data warehouse architectures rely on extraction, transformation and loading (ETL) processes to create an updated, consistent and materialized view of a set of data sources. In this paper, we support these processes by proposing a tool that: (1) allows the semi-automatic definition of inter-attribute semantic mappings, by identifying the parts of the data source schemas that are related to the data warehouse schema, thus supporting the extraction process; and (2) groups semantically related attribute values, thus defining a transformation function for populating the data warehouse with homogeneous values. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system. The system has been tested in a real scenario concerning the creation of a data warehouse for enterprises working in the beverage and food logistics area. The results show that the coupled system effectively supports the extraction and transformation processes.
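A rough sketch of step (2), grouping semantically related attribute values before loading; the greedy string-similarity clustering and the threshold below are our stand-in, not RELEVANT's actual analysis technique:

```python
import difflib

def group_similar_values(values, threshold=0.8):
    """Greedily cluster strings whose similarity to a group's first
    member exceeds the threshold, so variant spellings of the same
    value land in one group and map to one homogeneous value."""
    groups = []
    for v in values:
        for g in groups:
            ratio = difflib.SequenceMatcher(None, v.lower(), g[0].lower()).ratio()
            if ratio >= threshold:
                g.append(v)
                break
        else:
            groups.append([v])
    return groups
```

Each resulting group would then be replaced by a single canonical value in the transformation function.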

13.
Computer Networks, 1999, 31(11-16): 1155-1169
An important application of XML is the interchange of electronic data (EDI) between multiple data sources on the Web. As XML data proliferates on the Web, applications will need to integrate and aggregate data from multiple sources, and to clean and transform data to facilitate exchange. Data extraction, conversion, transformation, and integration are all well-understood database problems, and their solutions rely on a query language. We present a query language for XML, called XML-QL, which we argue is suitable for performing the above tasks. XML-QL is a declarative, 'relational complete' query language and is simple enough that it can be optimized. XML-QL can extract data from existing XML documents and construct new XML documents.
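The extract-and-construct style that XML-QL is built around can be emulated in plain Python for illustration (this is not XML-QL syntax; the sample document and function below are invented for the example):

```python
import xml.etree.ElementTree as ET

SOURCE = ('<bib>'
          '<book><title>XML-QL</title><author>Deutsch</author></book>'
          '<book><title>Data on the Web</title><author>Abiteboul</author></book>'
          '</bib>')

def extract_and_construct(xml_text):
    """Bind every <title> in the source document and construct a new
    <results> document from the bindings, mirroring the extract/construct
    split of an XML-QL query."""
    root = ET.fromstring(xml_text)
    out = ET.Element("results")
    for title in root.iter("title"):
        ET.SubElement(out, "result").text = title.text
    return ET.tostring(out, encoding="unicode")
```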

14.

To make informed decisions, managers establish data warehouses that integrate multiple data sources. However, the outcomes of data warehouse-based decisions are not always satisfactory, due to low data quality. Although many studies have focused on data quality management, little effort has been made to explore effective data quality control strategies for the data warehouse. In this study, we propose a chance-constrained programming model that determines the optimal strategy for allocating control resources to mitigate the data quality problems of the data warehouse, and we develop a modified Artificial Bee Colony algorithm to solve the model. Our work contributes to the literature on evaluating how data quality problems propagate through the data integration process and on controlling data quality at the sources that make up the data warehouse. We use a data warehouse in a healthcare organization to illustrate the model and the effectiveness of the algorithm.
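A hedged sketch of the chance-constraint idea only (the paper's actual formulation and its modified Artificial Bee Colony solver are not reproduced; every parameter below is invented): an allocation of control effort is feasible if, with high probability, the residual error rate stays under a tolerance, estimated here by Monte Carlo:

```python
import random

def chance_constraint_ok(error_rates, control_effort, tol=0.05,
                         prob=0.9, trials=2000, seed=7):
    """error_rates:   baseline error probability per data source.
    control_effort:   fraction of errors removed per source by cleaning.
    Returns True if the average residual error rate stays <= tol
    in at least `prob` of the simulated trials."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        residual = sum(rng.random() * e * (1 - c)
                       for e, c in zip(error_rates, control_effort))
        if residual / len(error_rates) <= tol:
            hits += 1
    return hits / trials >= prob
```

A solver (such as the paper's modified Artificial Bee Colony) would search over `control_effort` vectors subject to a budget, keeping only those that pass this check.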


15.
A Web Service Composition Method Supporting Heterogeneous Data Integration
全立新  岳昆  刘惟一 《计算机应用》2007,27(6):1438-1441
Based on the "collaborator" data integration architecture, taking data queries in a network environment as the basic Web services, with relational databases and XML documents as typical representatives of heterogeneous data sources, and building on existing query processing and XML data binding techniques, a data integration model for the Web service environment is presented. By defining basic operations (services) on this model and describing the service composition process with a directed-graph structure, a Web service composition method supporting heterogeneous data integration and corresponding optimization strategies are proposed.
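The directed-graph view of service composition can be sketched as a topological ordering of services, so each service runs only after the services whose output it consumes (this construction and the service names are ours; the paper's operations and optimization strategies are not reproduced):

```python
from collections import deque

def composition_order(edges):
    """edges: (producer, consumer) pairs of service names.
    Returns a valid execution order (Kahn's topological sort)."""
    succs, indeg = {}, {}
    for a, b in edges:
        succs.setdefault(a, []).append(b)
        indeg[b] = indeg.get(b, 0) + 1
        indeg.setdefault(a, 0)
    ready = deque(sorted(n for n, d in indeg.items() if d == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in succs.get(n, []):
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return order
```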

16.
Application of Data Mining Techniques to Securities Customer Relationship Management
叶良 《计算机仿真》2009,26(12):270-273
This paper studies securities management problems. Customer relationship management (CRM) systems combine modern management science with modern information technology, and data mining is an important means of exploiting existing data resources effectively. Focusing on the concrete problems of applying data mining to securities CRM, a data warehouse of customer trading behaviour is built using data warehousing technology, and clustering is then applied to segment a securities company's customers on the basis of that warehouse. CRM based on data mining is an innovation over traditional enterprise management thinking, fully embodying both the science and the art of management, and plays an important role in business decision-making and customer relationship management.

17.
An Asynchronous Iterative Query Processing Method for Web Data Warehouses
何震瀛  李建中  高宏 《软件学报》2002,13(2):214-218
The explosive growth of information poses a great challenge to data warehouses, and improving query efficiency in a Web environment has become an important research problem in the data warehousing field. This paper studies the architecture and query methods of Web data warehouses. Based on an analysis of several implementation approaches, a hierarchical architecture for Web data warehouses is proposed, and on top of it an asynchronous iterative query method. The method makes full use of pipelined parallelism: during query processing, nodes at different levels run in pipeline fashion and process the query in parallel, improving query efficiency. Theoretical analysis shows that the method can effectively improve the query efficiency of Web data warehouses.
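The pipelined parallelism across warehouse levels can be illustrated, very loosely, with Python generators: each level starts consuming upstream results before the previous level has finished producing them (our sketch, not the paper's method, which runs real nodes in parallel):

```python
def scan(rows):
    """Level 1: a leaf node emits rows one at a time."""
    for r in rows:
        yield r

def filter_level(rows, pred):
    """Level 2: an intermediate node passes matching rows downstream
    as soon as they arrive, without waiting for the full input."""
    for r in rows:
        if pred(r):
            yield r

def aggregate(rows):
    """Level 3: the root node folds the stream into a single result."""
    total = 0
    for r in rows:
        total += r
    return total

def pipeline(rows):
    """Wire the three levels into one pipelined query: sum of evens."""
    return aggregate(filter_level(scan(rows), lambda x: x % 2 == 0))
```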

18.
The eXtensible Markup Language (XML) has reached wide acceptance as the relevant standard for representing and exchanging data on the Web. Unfortunately, XML covers the syntactic level but lacks semantics, and thus cannot be directly used for the Semantic Web. Finding a way to utilize XML data for the Semantic Web is currently a challenging research problem. Ontologies can formally represent shared domain knowledge and enable semantic interoperability. Therefore, in this paper, we investigate how to represent and reason about XML with ontologies. First, we give formalized representations of XML data sources, including Document Type Definitions (DTDs), XML Schemas, and XML documents. On this basis, we propose formal approaches for transforming these XML data sources into ontologies; we also discuss the correctness of the transformations and provide several transformation examples. Furthermore, following the proposed approaches, we implement a prototype tool that can automatically transform XML into ontologies. Finally, we apply the transformed ontologies to reasoning about XML, so that some XML reasoning problems can be checked by existing ontology reasoners.
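The flavour of such a transformation can be sketched as follows; the rules here (elements become classes, attributes become datatype properties, nesting becomes an object property) are a common simplification and only an assumption about the paper's formal approach:

```python
def dtd_to_ontology(elements):
    """elements: {element_name: {"attrs": [...], "children": [...]}},
    a toy stand-in for a parsed DTD. Returns a minimal ontology dict."""
    onto = {"classes": [], "datatype_properties": [], "object_properties": []}
    for name, spec in elements.items():
        onto["classes"].append(name)                       # element -> class
        for a in spec.get("attrs", []):                    # attribute -> datatype property
            onto["datatype_properties"].append((name, a))
        for c in spec.get("children", []):                 # nesting -> object property
            onto["object_properties"].append((name, "has_" + c, c))
    return onto
```

A real transformation would also carry over content-model constraints (ordering, occurrence) into ontology axioms, which this sketch ignores.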

19.
XML has already become the de facto standard for specifying and exchanging data on the Web. However, XML is by nature verbose, so XML documents are usually large, which hinders its practical usage by substantially increasing the costs of storing, processing, and exchanging data. To tackle this problem, many XML-specific compression systems, such as XMill, XGrind, XMLPPM, and Millau, have been proposed. However, these systems usually suffer from one of two inadequacies: they either sacrifice performance, in terms of compression ratio and execution time, in order to support a limited range of queries, or they require full decompression before queries can be processed over compressed documents. In this paper, we address these problems by exploiting the information provided by a Document Type Definition (DTD) associated with an XML document. We show that a DTD is able to facilitate better compression as well as generate more usable compressed data to support querying. We present the architecture of XCQ, a compression and querying tool for handling XML data. XCQ is based on a novel technique we have developed called DTD Tree and SAX Event Stream Parsing (DSP). The documents compressed by XCQ are stored in Partitioned Path-Based Grouping (PPG) data streams, which are equipped with a Block Statistics Signature (BSS) indexing scheme. The indexed PPG data streams support the processing of XML queries that involve selection and aggregation, without the need for full decompression. To study the compression performance of XCQ, we carry out comprehensive experiments over a set of XML benchmark datasets.
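The core idea behind path-based grouping can be illustrated with a small sketch that separates data values by their root-to-leaf path, so a query touching one path need not decompress the others (XCQ's PPG streams and BSS index are far richer; this is only our illustration):

```python
import xml.etree.ElementTree as ET

def path_based_groups(xml_text):
    """Return {root-to-leaf path: [text values]} for the document.
    Each group could then be compressed independently of the others."""
    groups = {}

    def walk(node, path):
        children = list(node)
        if not children and node.text and node.text.strip():
            groups.setdefault(path, []).append(node.text.strip())
        for c in children:
            walk(c, path + "/" + c.tag)

    root = ET.fromstring(xml_text)
    walk(root, root.tag)
    return groups
```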

20.
张基温,杨叶勇 《计算机工程》2004,30(21):76-77,119
This paper proposes building a data warehouse for nucleic-acid sequences using metadata and XML, so as to achieve genuine sharing of nucleic-acid sequence data. A model of the data warehouse is established, and a worked example is given using metadata and XML.


Copyright © 北京勤云科技发展有限公司 (京ICP备09084417号)