首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到10条相似文献,搜索用时 62 毫秒
1.
The results of data cube will occupy huge amount of disk space when the base table is of a large number of attributes. A new type of data cube, compact data cube like condensed cube and quotient cube, was proposed to solve the problem. It compresses data cube dramatically. However, its query cost is so high that it cannot be used in most applications. This paper introduces the semi-closed cube to reduce the size of data cube and achieve almost the same query response time as the data cube does. Semi-closed cube is a generalization of condensed cube and quotient cube and is constructed from a quotient cube. When the query cost of quotient cube is higher than a given threshold, semi-closed cube selects some views and picks a fellow for each of them. All the tuples of those views are materialized except those closed by their fellows. To find a tuple of those views, users only need to scan the view and its fellow. Thus, their query performance is improved. Experiments were conducted using a real-world data set. The results show that semi-closed cube is an effective approach of data cube.  相似文献   

2.
Parallel data processing is a promising approach for efficiently computing data cube in relational databases, because most aggregate functions used in OLAP (On-Line Analytical Processing) are distributive functions. This paper studies the issues of handling data skew in parallel data cube computation. We present a fully dynamic partitioning approach that can effectively distribute workload among processing nodes without priori knowledge of data distribution. As supplement, a simple and effective dynamic load balancing mechanism is also incorporated into our algorithm, which further improves the overall performance. Our experimental results indicated that the proposed techniques are effective even when high data skew exists. The results of scale-up and speedup tests are also satisfactory.  相似文献   

3.
Star Cube--一种高效的数据立方体实现方法   总被引:1,自引:2,他引:1  
一个具有n个维的数据立方体有2^n个视图,视图越多,用于维护数据立方体的时间也就越长。通过将维分成划分维和非划分维,数据立方体可以转换成star cube.stal cube由一个综合表和那些仅包含划分维的视图组成。star cube使用前缀共享和元组共享技术不仅减少了所需的存储空间,还大大减少了计算和维护时间。在把一个分片限制在一个I/O单位的条件下,star cube的查询响应时间与数据立方体基本相同。实验结果也表明,star cube是一种在时空两方面均有效的数据立方体实现技术。  相似文献   

4.
Cube算子的计算在OLAP应用中起着极为重要的作用。本文分析了在高维Cube算子计算中传统流水线方法的不足之处,提出了通过有选择地实例化Cube中的部分节点以提高OLAP性能的解决方案,并给出了一个获取需要实例化节点的算法。  相似文献   

5.
MapReduce环境下的并行Dwarf立方构建   总被引:1,自引:0,他引:1  
针对数据密集型应用,提出了一种基于MapReduce框架的并行Dwarf数据立方构建算法.算法将传统Dwarf立方等价分割为多个独立的子Dwarf立方,采用MapReduce架构,实现了Dwarf立方的并行构建、查询和更新.实验证明,并行Dwarf算法一方面结合了MapReduce框架的并行性和高可扩展性,另一方面结合...  相似文献   

6.
研究了基于空间数据仓库的一种决策分析工具——空间在线分析处理(OLAP)的支撑技术。将普通数据立方体与空间数据立方体进行比较,提出空间数据立方体的维和度量的建模方法,解决了空间维与非空间维、空间度量与数值度量的集成建模问题。  相似文献   

7.
联机分析处理和数据挖掘是两种重要的数据分析方法。使用数据立方体作为数据存储结构,将两者集成起来,使得用户可以从不同角度、不同抽象层次分析数据。针对数据立方体的特点,本文提出了挖掘维间关联规则的算法,并编程实现了该算法,取得满意的结果。  相似文献   

8.
许睿  刘文才 《计算机工程与应用》2002,38(21):210-211,215
数据仓库及OLAP技术是当前数据库领域研究的热点,而数据模型又是数据仓库及OLAP核心基础。文章提出了一种应用于OLAP的数据模型,并用于实际应用中。这种数据模型在概念上表达了OLAP特性,支持OLAP操作,而且其数学代数简单明白地表达了OLAP查询。  相似文献   

9.
Parallelizing the Data Cube   总被引:1,自引:0,他引:1  
This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two different partitioning strategies, one for top-down and one for bottom-up cube algorithms. Both partitioning strategies assign subcubes to individual processors in such a way that the loads assigned to the processors are balanced. Our methods reduce inter processor communication overhead by partitioning the load in advance instead of computing each individual group-by in parallel. Our partitioning strategies create a small number of coarse tasks. This allows for sharing of prefixes and sort orders between different group-by computations. Our methods enable code reuse by permitting the use of existing sequential (external memory) data cube algorithms for the subcube computations on each processor. This supports the transfer of optimized sequential data cube code to a parallel setting.The bottom-up partitioning strategy balances the number of single attribute external memory sorts made by each processor. The top-down strategy partitions a weighted tree in which weights reflect algorithm specific cost measures like estimated group-by sizes. Both partitioning approaches can be implemented on any shared disk type parallel machine composed of p processors connected via an interconnection fabric and with access to a shared parallel disk array.We have implemented our parallel top-down data cube construction method in C++ with the MPI message passing library for communication and the LEDA library for the required graph algorithms. We tested our code on an eight processor cluster, using a variety of different data sets with a range of sizes, dimensions, density, and skew. Comparison tests were performed on a SunFire 6800. The tests show that our partitioning strategies generate a close to optimal load balance between processors. The actual run times observed show an optimal speedup of p.  相似文献   

10.
基于数据立方体的数据仓库安全控制   总被引:1,自引:0,他引:1       下载免费PDF全文
周海晴  陈启买  刘海 《计算机工程》2010,36(10):152-154
针对数据仓库与在线分析处理(OLAP)系统存在的数据仓库非法访问和敏感信息间接推理问题,在原有统计数据库安全体系架构的基础上,构建OLAP的3层安全控制体系架构,并结合该架构提出一种新的基于数据立方体的推理控制方法。该方法先预防m维推理,然后清除一维推理,简化了m维推理的检测过程。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号