Similar Literature
 20 similar articles found (search time: 15 ms)
1.
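An OLAP Query Scheme for P2P Network Environments   Total citations: 1 (self-citations: 1, citations by others: 0)
In both traditional network environments and P2P environments, clients submit OLAP queries to an OLAP server and retrieve the results from it, so the server's load grows sharply as the number of clients increases. This paper designs a DQDC (Distributed Query Data Cube) algorithm based on P2P (peer-to-peer) technology that enables semantic-level sharing of Data Cube data among multiple nodes in a P2P network, thereby improving the overall decision-analysis performance of the system.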

2.
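Wei Li (魏莉), Yang Kehua (杨科华). 《计算机应用》 (Journal of Computer Applications), 2010, 30(7): 1956-1958
Exploiting the semantic associations among online analytical processing (OLAP) queries, this paper formally describes the aggregation relationship and the semantic decomposition relationship, and on that basis defines a complement relationship between a query and a query set. When an OLAP query set is executed, these relationships can be used to identify, as far as possible, the parts shared by the queries in the set, and parallel optimization measures can be applied from several angles at query time. Experiments show that the parallel optimization scheme improves the overall efficiency of the system.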

3.
Compressed Data Cube for Approximate OLAP Query Processing   Total citations: 4 (self-citations: 0, citations by others: 4)
Approximate query processing has emerged as an approach to dealing with the huge data volumes and complex queries in data warehouse environments. In this paper, we present a novel method that provides approximate answers to OLAP queries. Our method builds a compressed (approximate) data cube with a clustering technique and uses this compressed data cube to answer queries directly, which improves query performance. We also provide an algorithm for processing OLAP queries and confidence intervals for the query results. An extensive experimental study with the OLAP Council benchmark shows the effectiveness and scalability of our cluster-based approach compared to sampling.

4.
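New Developments in OLAP Technology   Total citations: 5 (self-citations: 0, citations by others: 5)
This paper discusses the latest developments in OLAP technology from three perspectives: the integration of OLAP with data mining and the Web, distributed OLAP, and the combination of OLAP with advanced database technologies.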

5.
OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method to rewrite a given OLAP query using various kinds of materialized views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the selection and aggregation granularities, which are derived from the lattice of dimension hierarchies. Conditions for usability of materialized views in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that can effectively utilize materialized views having different selection granularities, selection regions, and aggregation granularities together. We also propose an algorithm to find a set of materialized views that results in a rewritten query which can be executed efficiently. We show the effectiveness and performance of the algorithm experimentally.
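
The usability condition on aggregation granularities mentioned in the abstract above can be illustrated with a minimal sketch. It is not the paper's algorithm: it covers only aggregation granularity (not selection granularity or selection regions), and the hierarchies, view names, and function names are hypothetical. The assumption is that a view can serve a query only if, in every dimension, the view is aggregated at a level at least as fine as the query's level, so the query can be obtained by rolling the view up further.

```python
# Hypothetical dimension hierarchies, ordered from finest to coarsest level.
HIERARCHIES = {
    "time": ["day", "month", "year", "ALL"],
    "store": ["store", "city", "country", "ALL"],
}


def level_rank(dimension, level):
    """Position of a level in its hierarchy (0 is the finest)."""
    return HIERARCHIES[dimension].index(level)


def can_answer(view_gran, query_gran):
    """A view is usable only if it is at least as fine as the query
    in every dimension, so the query can be derived by further roll-up."""
    return all(
        level_rank(dim, view_gran[dim]) <= level_rank(dim, query_gran[dim])
        for dim in HIERARCHIES
    )


def usable_views(views, query_gran):
    """Names of the views whose aggregation granularity can serve the query."""
    return [name for name, gran in views.items() if can_answer(gran, query_gran)]


# Hypothetical materialized views, described by their aggregation granularity.
views = {
    "v_day_city": {"time": "day", "store": "city"},
    "v_month_store": {"time": "month", "store": "store"},
    "v_year_all": {"time": "year", "store": "ALL"},
}
query = {"time": "month", "store": "country"}
print(usable_views(views, query))
```

Here `v_year_all` is rejected because yearly totals cannot be split back into months, while the two finer views remain candidates for rewriting.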

6.
Dissemination of XML data on the internet could breach the privacy of data providers unless access to the disseminated XML data is carefully controlled. Recently, methods using encryption have been proposed for such access control. However, these methods have not addressed the performance of query processing. A query processor cannot identify the contents of encrypted XML data unless the data are decrypted. This limitation incurs the overhead of decrypting parts of the XML data that do not contribute to the query result. In this paper, we propose the notion of Query-Aware Decryption for efficient processing of queries against encrypted XML data. Query-Aware Decryption allows us to decrypt only those parts that contribute to the query result. For this purpose, we disseminate an encrypted XML index along with the encrypted XML data. This index, when decrypted, tells us where the query results are located in the encrypted XML data, thus preventing unnecessary decryption of other parts of the data. Since this index is much smaller than the encrypted XML data, the cost of decrypting it is negligible compared with the cost of unnecessarily decrypting the data itself. The experimental results show that our method improves query processing performance by up to six times compared with existing methods. Finally, we formally prove that dissemination of the encrypted XML index does not compromise security.

7.
Since the early 1970s, decision support systems (DSS) have evolved significantly. In this paper, the design and implementation of MSMiner, a development platform for DSS, is introduced. The system is built on a data warehouse and integrated with a number of data mining algorithms. It is well suited for on-line analytical processing (OLAP). The characteristics of MSMiner include support for multiple data sources and data mining strategies, additional organizational flexibility with regard to data and mining strategies, and strong extensibility of data mining tasks.

8.
Query reformulation for dynamic information integration   Total citations: 17 (self-citations: 0, citations by others: 17)
The standard approach to integrating heterogeneous information sources is to build a global schema that relates all of the information in the different sources, and to pose queries directly against it. The problem is that schema integration is usually difficult, and as soon as any of the information sources change or a new source is added, the process may have to be repeated. The SIMS system uses an alternative approach. A domain model of the application domain is created, establishing a fixed vocabulary for describing data sets in the domain. Using this language, each available information source is described. Queries to SIMS against the collection of available information sources are posed using terms from the domain model, and reformulation operators are employed to dynamically select an appropriate set of information sources and to determine how to integrate the available information to satisfy a query. This approach results in a system that is more flexible than existing ones, more easily scalable, and able to respond dynamically to newly available or unexpectedly missing information sources. This paper describes the query reformulation process in SIMS and the operators used in it. We provide precise definitions of the reformulation operators and explain the rationale behind choosing the specific ones SIMS uses. We have demonstrated the feasibility and effectiveness of this approach by applying SIMS in the domains of transportation planning and medical trauma care.

9.
The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure data, which makes data analysis easier for decision makers. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries which the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying from the very beginning of the development whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback consisting of an extension to the object constraint language (OCL), which has been developed to include a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as a part of the specification of the data warehouse conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented in the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.

10.
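Array OLAP Query Processing Techniques Based on Vector Computing   Total citations: 1 (self-citations: 0, citations by others: 1)
Multicore and many-core processors have become the mainstream configuration of new large-memory computing platforms with powerful parallel processing capability. Multicore processors follow optimization techniques centered on the size of the LLC (last-level cache), whereas many-core processors, such as Phi and GPU coprocessors, use smaller caches and rely on a larger number of hardware threads to hide memory access latency. As the number of processing cores grows, computing frameworks increasingly favor designs that target large numbers of cores, execute code efficiently, and scale well. This paper proposes Array OLAP, an in-memory analytical processing framework based on array storage and vector processing, which simplifies both the storage model and the query processing model of OLAP. In the Array OLAP framework, dimension tables are normalized into vector-based dimension filters, and the fact table is normalized into measure attributes with a multidimensional index. Through multidimensional index computation, a multidimensional query is reduced to a vector index scan over the fact table followed by aggregation according to the measure expression. The normalized vector lookup and vector index scan execute code efficiently, and the staged processing model adapts better to different computing platforms, since each stage can be assigned to the platform that suits it best. Array OLAP is also designed around the characteristics of data warehouse schemas: the vector processing model is simple and works efficiently given that data warehouse dimension tables are small and grow slowly. The paper describes the Array OLAP framework on different platforms and evaluates its performance with benchmarks. Compared with current in-memory analytical databases, Array OLAP outperforms mainstream in-memory analytical databases and can be migrated smoothly to new hardware platforms.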
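
The processing model described in the abstract above (dimension tables turned into filter vectors, and a fact table scanned through per-dimension indexes) might look roughly like the following sketch. It is an illustration under stated assumptions, not the Array OLAP implementation: the data, column names, and function names are hypothetical, and plain Python lists stand in for the arrays that a real engine would process with vectorized or hardware-parallel code.

```python
# Dimension tables as arrays: the array position is the surrogate key.
region_of_customer = ["ASIA", "EUROPE", "ASIA", "AMERICA"]
year_of_date = [1996, 1997, 1997]

# Fact table as parallel arrays: one index column per dimension plus a measure.
fact_customer = [0, 1, 2, 3, 2, 0]
fact_date = [0, 1, 1, 2, 2, 1]
fact_revenue = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]


def build_filter(column, predicate):
    """Turn a dimension predicate into a bit vector indexed by surrogate key."""
    return [predicate(value) for value in column]


def aggregate(filters_and_indexes, measure):
    """Scan the fact arrays once; a row qualifies if every dimension
    filter is set at the position given by that row's index value."""
    total = 0.0
    for row in range(len(measure)):
        if all(flt[idx[row]] for flt, idx in filters_and_indexes):
            total += measure[row]
    return total


# Stage 1: evaluate dimension predicates once, on the small dimension arrays.
customer_filter = build_filter(region_of_customer, lambda r: r == "ASIA")
date_filter = build_filter(year_of_date, lambda y: y == 1997)

# Stage 2: a single pass over the fact arrays with positional lookups, no joins.
revenue = aggregate(
    [(customer_filter, fact_customer), (date_filter, fact_date)],
    fact_revenue,
)
print(revenue)
```

Because the filters are built from the dimension arrays alone, the two stages are independent, which is one way to read the abstract's point about assigning each stage to the platform that suits it best.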

11.
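Research on Improving the Overall Performance of OLAP Systems   Total citations: 2 (self-citations: 0, citations by others: 2)
Improving the overall performance of an OLAP system involves optimization in several areas: the design of the relational database that serves as the OLAP data source, data transformation and cleansing, the structure of dimensions and cubes, MDX query statements, and the allocation of computing tasks. Using an engineering example, this paper presents several important methods for improving the overall performance of OLAP systems.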

12.
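At present, OLAP query strategies in P2P environments are all based on obtaining the query result set from clients; algorithms such as DSCD and DQDC mainly study how to obtain the result set from clients quickly. Because real-time updates to the clients' Data Cubes are inefficient, query results are easily distorted, which in turn reduces OLAP query efficiency. To improve the real-time query efficiency of OLAP in P2P networks, this paper proposes the RTOS (Real-time Semantic OLAP Search) algorithm. Experiments on both query speed and distortion rate show that the algorithm effectively improves the decision-analysis performance of OLAP in P2P environments.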

13.
To efficiently support automated interoperability between ontology-based information systems in distributed environments, the semantic heterogeneity problem has to be dealt with. To do so, traditional approaches have acquired and employed explicit mappings between the corresponding ontologies. Usually these mappings can only be obtained from human domain experts. However, it is too expensive and time-consuming to collect all possible mapping results on distributed information systems. More seriously, as the number of systems in a large-scale peer-to-peer (P2P) network increases, the efficiency of ontology mapping decreases exponentially. Therefore, in this paper, we propose a novel semantic P2P system, which is capable of (i) sharing and exchanging existing mappings among peers, and (ii) composing shared mappings to build a path between two systems. Given two arbitrary peers (i.e., source and destination), the proposed system can provide indirect ontology mappings to make them interoperable. In particular, we have focused on query-based communication for evaluating the proposed ontology mapping composition system. Once direct ontology mappings are collected from candidate peers, a given query can be (i) segmented into a set of sub-queries, and (ii) transformed into another query. With respect to precision, our experiments show an improvement of about 42.5% compared to the keyword-based query searching method.

14.
Schemaless databases, and document-oriented databases in particular, are preferred to relational ones for storing heterogeneous data with variable schemas and structural forms. However, the absence of a unique schema adds complexity to analytical applications, in which a single analysis often involves large sets of data with different schemas. In this paper we propose an original approach to OLAP on collections stored in document-oriented databases. The basic idea is to stop fighting against schema variety and welcome it as an inherent source of information wealth in schemaless sources. Our approach builds on four stages: schema extraction, schema integration, FD enrichment, and querying; these stages are discussed in detail in the paper. To make users aware of the impact of schema variety, we propose a set of indicators inspired by the definition of attribute density. Finally, we experimentally evaluate our approach in terms of efficiency and effectiveness.

15.
In this paper we demonstrate that it is possible to enrich query answering with a short data movie that gives insights to the original results of an OLAP query. Our method, implemented in an actual system, CineCubes, includes the following steps. The user submits a query over an underlying star schema. Taking this query as input, the system comes up with a set of queries complementing the information content of the original query, and executes them. For each of the query results, we execute a set of highlight extraction algorithms that identify interesting patterns and values in the data of the results. Then, the system visualizes the query results and accompanies this presentation with a text commenting on the result highlights. Moreover, via a text-to-speech conversion the system automatically produces audio for the constructed text. Each combination of visualization, text and audio practically constitutes a movie, which is wrapped as a PowerPoint presentation and returned to the user.

16.
Biao, Yuni. 《Data & Knowledge Engineering》, 2008, 67(3): 485-503
Managing uncertain information using probabilistic databases has drawn much attention recently in many fields. Generating efficient safe plans is the key to evaluating queries whose data complexities are PTIME. In this paper, we propose a new approach to generating efficient safe plans for queries. Our algorithm adopts effective preprocessing and multiway split techniques, so the generated safe plans avoid unnecessary probabilistic cartesian-products and have the minimum number of probabilistic projections. Further, we extend existing transformation rules to allow the safe plans generated by the Safe-Plan algorithm [N. Dalvi, D. Suciu, Efficient query evaluation on probabilistic database, The VLDB Journal 16 (4) (2007) 523–544] and those generated by the proposed algorithm to be transformed into each other. Experiments applying our approach to the TPC-H benchmark queries show that the safe plans generated by our algorithm are more efficient than those generated by the Safe-Plan algorithm.

17.
Specifying OLAP Cubes on XML Data   Total citations: 6 (self-citations: 0, citations by others: 6)
On-Line Analytical Processing (OLAP) enables analysts to gain insight about data through fast and interactive access to a variety of possible views on information, organized in a dimensional model. The demand for data integration is rapidly becoming larger as more and more information sources appear in modern enterprises. In the data warehousing approach, selected information is extracted in advance and stored in a repository, yielding good query performance. However, in many situations a logical (rather than physical) integration of data is preferable. Previous web-based data integration efforts have focused almost exclusively on the logical level of data models, creating a need for techniques focused on the conceptual level. Also, previous integration techniques for web-based data have not addressed the special needs of OLAP tools such as handling dimensions with hierarchies. Extensible Markup Language (XML) is fast becoming the new standard for data representation and exchange on the World Wide Web. The rapid emergence of XML data on the web, e.g., business-to-business (B2B) e-commerce, is making it necessary for OLAP and other data analysis tools to handle XML data as well as traditional data formats. Based on a real-world case study, this paper presents an approach to specification of OLAP DBs based on web data. Unlike previous work, this approach takes special OLAP issues such as dimension hierarchies and correct aggregation of data into account. Also, the approach works on the conceptual level, using Unified Modeling Language (UML) as a basis for so-called UML snowflake diagrams that precisely capture the multidimensional structure of the data. An integration architecture that allows the logical integration of XML and relational data sources for use by OLAP tools is also presented.

18.
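An Improved Genetic Algorithm for Data Cube Selection   Total citations: 1 (self-citations: 0, citations by others: 1)
Dong Hongbin (董红斌), Chen Jia (陈佳). 《计算机科学》 (Computer Science), 2010, 37(11): 152-155
The data cube selection problem is NP-complete. This paper studies the use of genetic algorithms for cube selection and proposes a genetic algorithm combined with a local search mechanism. The core idea is to first generate an initial solution with a preprocessing algorithm based on the maximum benefit per unit of space, and then improve that solution with the genetic algorithm combined with local search. Experimental results show that the algorithm outperforms both heuristic algorithms and the classical genetic algorithm in optimization performance.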
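
The seeding step mentioned in the abstract above, choosing cuboids by benefit per unit of space, can be sketched as a simple greedy loop. The abstract does not give the paper's preprocessing algorithm, so this is a generic greedy heuristic under stated assumptions: cuboid sizes are hypothetical, a query on a cuboid costs the size of its smallest materialized ancestor, the base cuboid is always materialized and does not count against the budget, and the genetic algorithm and local search stages are not shown.

```python
# Hypothetical cuboid sizes (in rows) for a three-dimensional cube.
SIZES = {
    frozenset("ABC"): 1000,
    frozenset("AB"): 500,
    frozenset("AC"): 800,
    frozenset("BC"): 300,
    frozenset("A"): 50,
    frozenset("B"): 100,
    frozenset("C"): 200,
    frozenset(): 1,
}
BASE = frozenset("ABC")


def query_cost(cuboid, materialized):
    """Cost of answering a cuboid: size of its smallest materialized ancestor."""
    return min(SIZES[m] for m in materialized if cuboid <= m)


def benefit(candidate, materialized):
    """Total cost reduction over all cuboids if candidate is materialized."""
    return sum(
        max(0, query_cost(c, materialized) - SIZES[candidate])
        for c in SIZES
        if c <= candidate
    )


def greedy_initial_solution(space_budget):
    """Repeatedly add the cuboid with the largest benefit per unit of space."""
    materialized = {BASE}
    used = 0
    while True:
        scored = [
            (benefit(c, materialized) / SIZES[c], c)
            for c in SIZES
            if c not in materialized and used + SIZES[c] <= space_budget
        ]
        scored = [(score, c) for score, c in scored if score > 0]
        if not scored:
            return materialized
        _, best = max(scored, key=lambda pair: pair[0])
        materialized.add(best)
        used += SIZES[best]


seed = greedy_initial_solution(space_budget=400)
print(sorted("".join(sorted(c)) for c in seed))
```

A selection produced this way could then be encoded as a bit string (one bit per cuboid) to serve as a starting individual for a genetic algorithm.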

19.
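Lin Xuan (林璇), Feng Jianwen (冯健文), Chen Qimai (陈启买). 《计算机工程与设计》 (Computer Engineering and Design), 2006, 27(21): 4142-4144, 4156
An online transaction processing (OLTP) system for the catering industry can standardize business processes and improve operational efficiency, but it offers little support for business decision-making, so a back-end decision support system (DSS) built on a catering enterprise's business data sets is needed. This paper proposes building a catering DSS with data warehouse, OLAP, and data mining technologies, and focuses on the design of the system architecture. Following a prototyping approach, it discusses the implementation of the system from the initial data warehouse model design, through data extraction, transformation, and loading and the design of multidimensional data sets, to the final OLAP analysis.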

20.
Traditionally, distributed query optimization techniques generate static query plans at compile time. However, the optimality of these plans depends on many parameters (such as the selectivities of operations and the transmission speeds and workloads of servers) that are not only difficult to estimate but also often unpredictable and fluctuating at runtime. As the query processor cannot dynamically adjust the plans at runtime, the system performance is often less than satisfactory. In this paper, we introduce a new highly adaptive distributed query processing architecture. Our architecture can quickly detect fluctuations in selectivities of operations, as well as transmission speeds and workloads of servers, and accordingly change the operation order of a distributed query plan during execution. We have implemented a prototype based on the Telegraph system [Telegraph project. Available from >]. Our experimental study shows that our mechanism can adapt itself to changes in the environment and hence approach an optimal plan during execution.
