Found 20 similar articles (search time: 0 ms)
1.
Alexandre A. B. Lima, Camille Furtado, Patrick Valduriez, Marta Mattoso 《Distributed and Parallel Databases》2009,25(1-2):97-123
We consider the problem of improving the performance of OLAP applications in a database cluster (DBC), which is a low cost and effective parallel solution for query processing. Current DBC solutions for OLAP query processing provide for intra-query parallelism only, at the cost of full replication of the database. In this paper, we propose more efficient distributed database design alternatives which combine physical/virtual partitioning with partial replication. We also propose a new load balancing strategy that takes advantage of an adaptive virtual partitioning to redistribute the load to the replicas. Our experimental validation is based on the implementation of our solution on the SmaQSS DBC middleware prototype. Our experimental results using the TPC-H benchmark and a 32-node cluster show very good speedup.
2.
Approximate query processing has emerged as an approach to dealing with the huge data volumes and complex queries of data warehouse environments. In this paper, we present a novel method that provides approximate answers to OLAP queries. Our method builds a compressed (approximate) data cube with a clustering technique and answers queries directly from this compressed cube, which improves query performance. We also provide the algorithm for the OLAP queries and the confidence intervals of query results. An extensive experimental study with the OLAP council benchmark shows the effectiveness and scalability of our cluster-based approach compared to sampling.
3.
OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method to rewrite a given OLAP query using various kinds of materialized views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the selection and aggregation granularities, which are derived from the lattice of dimension hierarchies. Conditions for usability of materialized views in rewriting a given query are specified by relationships between the components of their normal forms. We present a rewriting algorithm for OLAP queries that can effectively utilize materialized views having different selection granularities, selection regions, and aggregation granularities together. We also propose an algorithm to find a set of materialized views that results in a rewritten query which can be executed efficiently. We show the effectiveness and performance of the algorithm experimentally.
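As a rough illustration of the idea of answering a query from a materialized view with a finer aggregation granularity, the Python sketch below rolls a (month, city) view up to (year, country). It is not the rewriting algorithm of the paper; the view contents, the `city_to_country` hierarchy, and the function name `answer_from_view` are all hypothetical, and the roll-up is valid here only because SUM is a distributive aggregate.

```python
from collections import defaultdict

# Hypothetical dimension hierarchy: city -> country.
city_to_country = {"Paris": "FR", "Lyon": "FR", "Rome": "IT"}

# Hypothetical materialized view: SUM(sales) grouped by (month, city).
materialized_view = {
    ("2009-01", "Paris"): 10,
    ("2009-02", "Lyon"): 5,
    ("2009-01", "Rome"): 7,
    ("2010-03", "Paris"): 1,
}

def answer_from_view(view):
    """Answer SUM(sales) GROUP BY (year, country) from the finer-grained view."""
    totals = defaultdict(int)
    for (month, city), partial_sum in view.items():
        year = month[:4]                   # month -> year along the time hierarchy
        country = city_to_country[city]    # city -> country along the location hierarchy
        totals[(year, country)] += partial_sum
    return dict(totals)

rolled_up = answer_from_view(materialized_view)
```

The view is usable because each of its grouping levels is at or below the level the query asks for; a view grouped by year alone could not answer a per-month query this way.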
4.
Active data warehouses: complementing OLAP with analysis rules (total citations: 2, self-citations: 0, citations by others: 2)
Conventional data warehouses are passive. All tasks related to analysing data and making decisions must be carried out manually by analysts. Today's data warehouse and OLAP systems offer little support for automating decision tasks that occur frequently and for which well-established decision procedures are available. Such functionality can be provided by extending the conventional data warehouse architecture with analysis rules, which mimic the work of an analyst during decision making. Analysis rules extend the basic event/condition/action (ECA) rule structure with mechanisms to analyse data multidimensionally and to make decisions. The resulting architecture is called an active data warehouse.
5.
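With the rapid development of the Internet, the massive data generated by network monitoring poses a challenge to query processing. Exploiting the observation that the data divides clearly into a large volume of event data and a small, stable body of configuration data, this paper proposes a parallel query processing method built on single-node DBMSs. From the perspective of relational algebra, an arbitrary query is decomposed into subqueries over horizontal data partitions and a post-processing query that aggregates the intermediate results. Using the database links provided by the DBMS, a query processor can be constructed conveniently without modifying the DBMS. Tests with a real workload show that near-linear scale-up is achieved when the intermediate result set is not very large.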
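The decomposition into per-partition subqueries plus a post-processing query can be sketched in a few lines of Python. This is only an in-memory illustration under assumed data: the event tuples, the `config` table, and the function names `sub_query` and `post_process` are hypothetical, and the database links used in the paper are not modeled.

```python
from collections import Counter

def sub_query(partition, config):
    """Subquery on one horizontal partition: count events per region.

    The small, stable configuration data is assumed to be available to every partition.
    """
    counts = Counter()
    for device_id, _status in partition:
        counts[config[device_id]] += 1
    return counts

def post_process(partials):
    """Post-processing query: merge the intermediate results of all partitions."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

# Hypothetical configuration data: device -> region.
config = {"d1": "north", "d2": "south"}

# Hypothetical event data, split horizontally into two partitions.
partitions = [
    [("d1", "up"), ("d2", "down")],
    [("d1", "down"), ("d1", "up")],
]

result = post_process(sub_query(p, config) for p in partitions)
```

Each `sub_query` call touches only its own partition, so the calls could run on separate nodes; only the small per-region counters travel to the post-processing step, which matches the abstract's condition that intermediate results stay small.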
6.
7.
Inter-business collaborative contexts prefigure a distributed scenario where companies organize and coordinate themselves to develop common and shared opportunities, but traditional business intelligence systems do not provide support to this end. To fill this gap, in this paper we envision a peer-to-peer data warehousing architecture based on a network of heterogeneous peers, each exposing query answering functionalities aimed at sharing business information. To enhance the decision making process, an OLAP query expressed on a peer needs to be properly reformulated on the local multidimensional schemata of the other peers. To this end, we present a language for the definition of mappings between the multidimensional schemata of peers and we introduce a query reformulation framework that relies on the translation of mappings, queries, and multidimensional schemata onto the relational level. Then, we formalize a query reformulation algorithm and prove two properties: correctness and closure, that are essential in a peer-to-peer setting. Finally, we discuss the main implementation issues related to the reformulation setting proposed, with specific reference to the case in which the local multidimensional engines hosted by peers use the standard MDX language.
8.
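In this paper, we explore a method based on parallel processing techniques that can improve data warehouse queries. In addition, we design algorithms that partition tasks and data to implement parallel star joins.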
9.
3D-List: a data structure for efficient video query processing (total citations: 1, self-citations: 0, citations by others: 1)
A video query model based on the content of video and iconic indexing is proposed. We extend the notion of two-dimensional strings to three-dimensional strings (3D-Strings) for representing the spatial and temporal relationships among the symbols in both a video and a video query. The problem of video query processing is then transformed into a problem of three-dimensional pattern matching. To efficiently match the 3D-Strings, a data structure, called 3D-List, and its related algorithms are proposed. In this approach, the symbols of a video in the video database are retrieved from the video index and organized as a 3D-List according to the 3D-String of the video query. The related algorithms are then applied on the 3D-List to determine whether this video is an answer to the video query. Based on this approach, we have started a project called Vega. In this project, we have implemented a user-friendly interface for specifying video queries, a video index tool for constructing the video index, and a video query processor based on the notion of 3D-List. Some experiments are also performed to show the efficiency and effectiveness of the proposed algorithms.
10.
11.
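Exploiting the semantic associations present in online analytical processing (OLAP) queries, this paper formally describes aggregation relationships and semantic decomposition relationships, and on that basis defines a complement relationship between a query and a query set. When a set of OLAP queries is executed, these relationships can be used to identify as much as possible of the common parts of the queries in the set, and parallel optimization measures can be applied from several angles at query time. Experiments show that the overall efficiency of the system improves after the parallel optimization scheme is adopted.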
12.
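Multicore and many-core processors have become the mainstream configuration of new large-memory computing platforms with powerful parallel processing capability. Multicore processors follow optimization techniques centered on the size of the LLC (last level cache), whereas many-core processors, such as Phi and GPU coprocessors, are designed with smaller caches and use more hardware-level threads to hide memory access latency. As the number of processing cores grows, computing frameworks increasingly favor designs that target large numbers of cores, execute code efficiently, and scale well. This paper proposes Array OLAP, an in-memory analytical processing framework based on array storage and vector processing, which simplifies the OLAP storage model and query processing model. In the Array OLAP framework, dimension tables are normalized into vector-based dimension filters, and the fact table is normalized into measure attributes with a multidimensional index. Through multidimensional index computation, a multidimensional query is reduced to a vector index scan over the fact table followed by aggregation according to the measure expression. The normalized vector lookup and vector index scan execute code efficiently, and the staged processing model adapts better to different computing platforms, assigning each computation stage to the most suitable platform. Array OLAP is also designed around the characteristics of data warehouse schemas: the vector processing model is simple and works efficiently given that data warehouse dimension tables are small and grow slowly. The paper describes the Array OLAP framework on different platforms and evaluates its performance with benchmarks; compared with current in-memory analytical databases, Array OLAP outperforms mainstream in-memory analytical databases and can be migrated smoothly to new hardware platforms.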
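The combination of vector-based dimension filters and an index scan over the fact table can be pictured with the following Python sketch. It is a simplified reading of the abstract, not the Array OLAP implementation: the encoding (list position = dimension key, value = group id, -1 = filtered out), the sample data, and the function name `array_olap_sum` are assumptions made for illustration.

```python
from math import prod

# Hypothetical dimension filters: position = dimension key,
# value = group id in the result cube, -1 = row rejected by the predicate.
date_filter = [0, 0, 1, -1]    # 4 date keys mapped to 2 groups, key 3 filtered out
region_filter = [-1, 0, 1]     # 3 region keys mapped to 2 groups, key 0 filtered out

# Hypothetical fact table: (date_key, region_key, revenue).
fact_table = [
    (0, 1, 10.0),
    (1, 2, 5.0),
    (2, 1, 7.0),
    (3, 1, 100.0),   # rejected by the date filter
    (2, 0, 50.0),    # rejected by the region filter
    (0, 1, 2.5),
]

def array_olap_sum(fact, filters, group_counts):
    """Scan the fact table once, probing each dimension vector by key."""
    cells = [0.0] * prod(group_counts)   # flattened aggregate array
    for *keys, measure in fact:
        cell = 0
        for key, dim_filter, n_groups in zip(keys, filters, group_counts):
            group = dim_filter[key]      # vector lookup replaces a join
            if group < 0:
                break                    # row fails this dimension's predicate
            cell = cell * n_groups + group
        else:
            cells[cell] += measure       # aggregate only rows that passed every filter
    return cells

cube = array_olap_sum(fact_table, [date_filter, region_filter], (2, 2))
```

Because dimension tables are small, each filter fits in a flat array, and the scan reduces to array indexing plus an addition per row.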
13.
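Materialized view selection for multidimensional data is an NP-complete problem. This paper proposes a constraint-based multi-objective genetic algorithm that considers query cost and maintenance cost separately, solving complex materialized view selection problems more effectively. Experimental results show that the algorithm performs better, particularly in the distribution of the Pareto front obtained.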
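To make the notion of a Pareto front over the two objectives concrete, the Python sketch below filters a set of candidate solutions down to the non-dominated ones. It shows only the dominance test, not the genetic algorithm or its constraint handling; the candidate (query cost, maintenance cost) pairs and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every cost and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the candidates that no other candidate dominates (both costs are minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidates: (query cost, maintenance cost) of a set of materialized views.
candidates = [(10, 50), (20, 30), (25, 35), (40, 10), (40, 45)]

front = pareto_front(candidates)
```

Keeping the two costs as separate objectives, rather than summing them into one score, is what lets the front expose the trade-off between fast queries and cheap maintenance.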
14.
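Queries over large data warehouses often involve joins across multiple fact tables. Traditional query optimization methods try to minimize the size of intermediate relations when computing multi-relation joins, but they ignore the characteristics of data warehouses: the data is massive, the workload is read-mostly, and fact tables are generally indexed. As a result, they often fail to achieve the best results. Targeting these characteristics of data warehouse queries, this paper proposes a heuristic optimization method that uses indexes to speed up queries. Theoretical analysis and experiments show that the method clearly reduces both query processing cost and execution time, demonstrating its effectiveness.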
15.
Rodrigo Costa Mateus, Thiago Luís Lopes Siqueira, Valéria Cesário Times, Ricardo Rodrigues Ciferri, Cristina Dutra de Aguiar Ciferri 《Distributed and Parallel Databases》2016,34(3):425-461
Cloud computing systems handle large volumes of data by using almost unlimited computational resources, while spatial data warehouses (SDWs) are multidimensional databases that store huge volumes of both spatial data and conventional data. Cloud computing environments have been considered adequate to host voluminous databases, process analytical workloads and deliver database as a service, while spatial online analytical processing (spatial OLAP) queries issued over SDWs are intrinsically analytical. However, hosting an SDW in the cloud and processing spatial OLAP queries over such a database impose novel obstacles. In this article, we introduce novel concepts such as cloud SDW and spatial OLAP as a service, and afterwards detail the design of novel schemas for cloud SDW and spatial OLAP query processing over cloud SDW. Furthermore, we evaluate the performance of processing spatial OLAP queries in cloud SDWs using our own query processor aided by a cloud spatial index. Moreover, we describe the cloud spatial bitmap index to improve the performance of processing spatial OLAP queries in cloud SDWs, and assess it through an experimental evaluation. Results derived from our experiments revealed that this index was able to reduce the query response time by between 58.20% and 98.89%.
16.
17.
Mikal Ziane Ph.D., Mohamed Zaït Ph.D. student, Projet Rodin, Pascale Borla-Salamet Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1993,2(3):277-301
In this article, we describe our approach to the compile-time optimization and parallelization of queries for execution in DBS3 or EDS. DBS3 is a shared-memory parallel database system, while the EDS system has a distributed-memory architecture. Because DBS3 implements a parallel dataflow execution model, this approach applies to both architectures. Using randomized search strategies enables the exploration of a search space large enough to include zigzag trees, which are intermediate between left-deep and right-deep trees. Zigzag trees are shown to provide better response time than right-deep trees in case of limited memory. Performance measurements obtained using the DBS3 prototype show the advantages of zigzag trees under various conditions.
18.
《Expert systems with applications》2014,41(7):3237-3249
Given a D-dimensional data set P and a query point q, a reverse skyline query (RSQ) returns all the data objects in P whose dynamic skyline contains q. It is important for many real life applications such as business planning and environmental monitoring. Currently, the state-of-the-art algorithm for answering the RSQ is the reverse skyline using skyline approximations (RSSA) algorithm, which is based on the precomputed approximations of the skylines. Although RSSA has some desirable features, e.g., applicability to arbitrary data distributions and dimensions, it requires multiple accesses to the same nodes, incurring redundant I/O and CPU costs. In this paper, we propose several efficient algorithms for exact RSQ processing over multidimensional datasets. Our methods utilize a conventional data-partitioning index (e.g., R-tree) on the dataset P, and employ precomputation, reuse, and pruning techniques to boost the query performance. In addition, we extend our techniques to tackle a natural variant of the RSQ, i.e., constrained reverse skyline query (CRSQ), which retrieves the reverse skyline inside a specified constrained region. Extensive experimental evaluation using both real and synthetic datasets demonstrates that our proposed algorithms outperform RSSA by several orders of magnitude under all experimental settings.
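The RSQ definition can be checked on small inputs with a brute-force Python sketch. This is a quadratic baseline written directly from the definition, not RSSA and not the index-based algorithms of the paper; it assumes the usual notion of dynamic dominance (coordinate-wise absolute distance to the reference object), and the sample points and function names are hypothetical.

```python
def dyn_dominates(a, b, origin):
    """True if a dynamically dominates b with respect to origin.

    a must be at least as close to origin as b in every dimension
    and strictly closer in at least one.
    """
    dist_a = [abs(x - o) for x, o in zip(a, origin)]
    dist_b = [abs(x - o) for x, o in zip(b, origin)]
    return (all(x <= y for x, y in zip(dist_a, dist_b))
            and any(x < y for x, y in zip(dist_a, dist_b)))

def reverse_skyline(points, q):
    """Return every p in points whose dynamic skyline contains q."""
    result = []
    for i, p in enumerate(points):
        others = (r for j, r in enumerate(points) if j != i)
        # q is in the dynamic skyline of p if no other object dominates q w.r.t. p.
        if not any(dyn_dominates(r, q, p) for r in others):
            result.append(p)
    return result

# Hypothetical 2-dimensional data set P and query point q.
P = [(1, 1), (2, 2), (6, 6)]
q = (3, 3)

rsq = reverse_skyline(P, q)
```

Here (1, 1) is excluded because (2, 2) lies closer to it than q does in both dimensions. Every candidate is compared against every other object, which is the kind of repeated work that precomputation, reuse, and pruning over an index aim to avoid.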
19.
The development of data warehouses begins with the definition of multidimensional models at the conceptual level in order to structure data, which makes data analysis easier for decision makers. Current proposals for conceptual multidimensional modelling focus on the design of static data warehouse structures, but few approaches model the queries which the data warehouse should support by means of OLAP (on-line analytical processing) tools. OLAP queries are, therefore, only defined once the rest of the data warehouse has been implemented, which prevents designers from verifying from the very beginning of the development whether the decision maker will be able to obtain the required information from the data warehouse. This article presents a solution to this drawback consisting of an extension to the object constraint language (OCL), which has been developed to include a set of predefined OLAP operators. These operators can be used to define platform-independent OLAP queries as a part of the specification of the data warehouse conceptual multidimensional model. Furthermore, OLAP tools require the implementation of queries to assure performance optimisations based on pre-aggregation. It is interesting to note that the OLAP queries defined by our approach can be automatically implemented in the rest of the data warehouse, in a coherent and integrated manner. This implementation is supported by a code-generation architecture aligned with model-driven technologies, in particular the MDA (model-driven architecture) proposal. Finally, our proposal has been validated by means of a set of sample data sets from a well-known case study.
20.
Thiago Luís Lopes Siqueira, Cristina Dutra de Aguiar Ciferri, Valéria Cesário Times, Ricardo Rodrigues Ciferri 《GeoInformatica》2012,16(1):165-205
Spatial data warehouses (SDWs) allow for spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve data related to ad hoc spatial query windows according to spatial predicates, avoiding the high cost of joining large tables. Therefore, mechanisms to provide efficient query processing over SDWs are essential. In this paper, we propose two efficient indices for SDW: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicate for SDW and also support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to enable spatial objects clustering and a specialized buffer-pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e. star-join computation and materialized views) were investigated through performance tests, which issued roll-up operations extended with containment and intersection range queries. The performance results showed that improvements ranged from 68% up to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. Choosing between the SB-index and the HSB-index mainly depends on the query selectivity of spatial predicates. While low query selectivity benefits the HSB-index, the SB-index provides better performance for higher query selectivity.