共查询到18条相似文献,搜索用时 640 毫秒
1.
物化视图能够有效地提高空间数据仓库的查询效率,但由于空间操作的复杂性,传统数据仓库中物化视图的选择算法不能很好地应用于空间数据仓库。为了在存储空间约束下选择查询进行物化,并动态调整物化视图集,以适应用户查询的时变性和即席查询,提出了空间物化视图选择算法SMVS。实验结果表明该算法是有效可行的,不仅能够提高查询性能,而且解决了查询响应性能随用户查询分布变化而下降的问题。 相似文献
2.
查询速度是联机分析处理中的一个关键性能指标,人们通过事先生成所有可能的聚集来提高查询速度,然而这样的完全物化是以存储空间为代价的.针对数据立方体数据分布特点和结合压缩技术,本文介绍如何最大化节省存储空间来进行完全物化,然后在此基础上对查询进行了研究,以达到最小存储空间以及较好的查询速度的目的. 相似文献
3.
基于兴趣视图子集的流立方体计算方法 总被引:1,自引:0,他引:1
流立方体计算是流式数据多维分析的重要基础,然而流式数据的动态性、无限性、突发性等特征使其面临巨大的挑战.在实际应用中,用户的兴趣通常集中在部分视图上,基于这个特点提出了一种基于兴趣视图子集的计算方法,依据用户历史查询信息确定兴趣视图子集与兴趣路径,同时定义了Stream-Tree结构用于在主存中物化存储兴趣视图子集所包含的数据单元,在运行过程中依据多层次时间窗口约束不断更新和维护Stream-Tree中存储的数据单元,而对于稀疏数据单元仅保留高层次的聚集值.实验和分析表明,该方法能够在有限的主存空间中维持流立方体当前窗口内的数据单元,同时能够支持快速更新维护存储结构和响应用户查询. 相似文献
4.
n维的立方体将生成2n个聚集立方体.如何进行立方体计算,在存储空间和查询时间方面寻求平衡,成为多维分析应用中的关键问题.基于部分物化的策略,并结合水利普查数据特征,改进Minimal cubing方法,提出了层次维编码片段方法HDEF cubing.该方法利用编码长度较小的层次维编码及其前缀,快速检索出与查询关键字相匹配的层次维编码,减少了多表连接操作,从而提高查询效率.以水利普查数据为例,验证了改进的立方体计算方法能高效地对立方体进行存储和查询,适用于水利普查成果分析. 相似文献
5.
6.
7.
8.
缓存敏感的封闭冰山立方体计算 总被引:1,自引:0,他引:1
数据立方体计算通常会产生大量的输出结果,冰山立方体和封闭立方体是解决这个问题的比较流行的两种策略,二者可以结合使用.鉴于封闭冰山立方体(closed iceberg cube)的重要性和实用性,如何高效地计算封闭冰山立方体是一个值得研究的问题.提出一种缓存敏感(cache-conscious)的计算封闭冰山立方体的方法,在自底向上对数据进行聚集的同时,寻找覆盖聚集单元的封闭单元,将其输出,使用两种策略进行剪枝,去掉不必要的递归,同时使用Apriori剪枝技术,支持冰山立方体(iceberg cube)的计算.为了减少与内存相关的延迟,快速得到聚集结果,对多个维进行预排序,并将软件预取技术引入到数据扫描中.在模拟数据和真实数据上进行了详细而全面的实验研究,结果表明,封闭冰山立方体的计算方法是快速、有效的. 相似文献
9.
为了解决大容量物理存储条件下数据仓库的物化视图选择问题,提出一种面向查询集覆盖的物化视图选择算法.首先给出了一些概念和定义,然后从视图集的多维数据格中抽取和裁剪出候选视图集,并定义视图物化的效益模型,最后在存储容量的限制下逐步淘汰收益最小的应答查询的冗余视图,得到覆盖所有查询的最优物化视图集.实验结果表明,该算法在较大物理存储条件下的物化视图选择效率优于以往算法,且能够消除物化视图在应答查询时存在的时延“抖动”现象,应答用户查询的平均时间也大为缩短. 相似文献
10.
数据仓库中物化视图选择策略 总被引:2,自引:0,他引:2
为了提高决策支持和OLAP查询的响应效率,数据仓库多采用物化视图的思想.因此,物化视图的选择策略是数据仓库研究的重要问题之一.其目标是选出一组存储、维护代价与查询代价的总和为最小的物化视图.提出一个以MVPP(multi-view processing plan)为视图选择的搜索空间的物化视图选择新算法--VSMF(views selection base on multi-factor)算法.该算法在存储空间约束下同时实现多查询最优化和视图维护最优化. 相似文献
11.
Xing Li Howard J. Hamilton Kamran Karimi Liqiang Geng 《Journal of Intelligent Information Systems》2009,33(2):179-208
The computation of data cubes is one of the most expensive operations in on-line analytical processing (OLAP). To improve
efficiency, an iceberg cube represents only the cells whose aggregate values are above a given threshold (minimum support).
Top-down and bottom-up approaches are used to compute the iceberg cube for a data set, but both have performance limitations.
In this paper, a new algorithm, called Multi-Tree Cubing (MTC), is proposed for computing an iceberg cube. The Multi-Tree
Cubing algorithm is an integrated top-down and bottom-up approach. Overall control is handled in a top-down manner, so MTC
features shared computation. By processing the orderings in the opposite order from the Top-Down Computation algorithm, the
MTC algorithm is able to prune attributes. The Bottom Up Computation (BUC) algorithm and its variations also perform pruning
by relying on the processing of intermediate partitions. The MTC algorithm, however, prunes without processing such partitions.
The MTC algorithm is based on a specialized type of prefix tree data structure, called an Attribute–Partition tree (AP-tree),
consisting of attribute and partition nodes. The AP-tree facilitates fast, in-memory sorting and APRIORI-like pruning. We
report on five series of experiments, which confirm that MTC is consistently as fast or faster than BUC, while finding the
same iceberg cubes. 相似文献
12.
Computing Iceberg Cubes by Top-Down and Bottom-Up Integration: The StarCubing Approach 总被引:1,自引:0,他引:1
Dong Xin Jiawei Han Xiaolei Li Zheng Shao Wah B.W. 《Knowledge and Data Engineering, IEEE Transactions on》2007,19(1):111-126
Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down versus bottom-up. The former, represented by the multiway array cube (called the multiway) algorithm, aggregates simultaneously on multiple dimensions; however, it cannot take advantage of a priori pruning when computing iceberg cubes (cubes that contain only aggregate cells whose measure values satisfy a threshold, called the iceberg condition). The latter, represented by BUC, computes the iceberg cube bottom-up and facilitates a priori pruning. BUC explores fast sorting and partitioning techniques; however, it does not fully explore multidimensional simultaneous aggregation. In this paper, we present a new method, star-cubing, that integrates the strengths of the previous two algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-bys that do not satisfy the iceberg condition. Our performance study shows that star-cubing is highly efficient and outperforms the previous methods 相似文献
13.
提出利用Cube中的维层次聚集树(dimension hierarchy aggregate tree,简称DHA-Tree)来对聚集Cube进行增量更新维护,在维层次聚集Cube中进行数据插入和删除等数据更新时,充分利用维层次聚集树中的维层次前缀,由下向上用更新前后的差值对受到更新结点影响的所有祖先结点进行增量更新.在插入新维数据时,在不需要重新构建聚集Cube就可以对聚集Cube进行增量更新,从而减少了Cube的更新时间.对基于维层次聚集树的聚集Cube与传统Cube进行了算法性能分析和比较,结果表明本文所提出的聚集Cube的增量更新算法性能最佳. 相似文献
14.
Efficient Incremental Maintenance for Distributive and Non-Distributive Aggregate Functions 下载免费PDF全文
Data cube pre-computation is an important concept for supporting OLAP (Online Analytical Processing) and has been studied extensively. It is often not feasible to compute a complete data cube due to the huge storage requirement. Recently proposed quotient cube addressed this issue through a partitioning method that groups cube cells into equivalence partitions. Such an approach not only is useful for distributive aggregate functions such as SUM but also can be applied to the maintenance of holistic aggregate functions like MEDIAN which will require the storage of a set of tuples for each equivalence class. Unfortunately, as changes are made to the data sources, maintaining the quotient cube is non-trivial since the partitioning of the cube cells must also be updated. In this paper, the authors design incremental algorithms to update a quotient cube efficiently for both SUM and MEDIAN aggregate functions. For the aggregate function SUM, concepts are borrowed from the principle of Galois Lattice to develop CPU-efficient algorithms to update a quotient cube. For the aggregate function MEDIAN, the concept of a pseudo class is introduced to further reduce the size of the quotient cube, Coupled with a novel sliding window technique, an efficient algorithm is developed for maintaining a MEDIAN quotient cube that takes up reasonably small storage space. Performance study shows that the proposed algorithms are efficient and scalable over large databases. 相似文献
15.
《Knowledge and Data Engineering, IEEE Transactions on》2007,19(7):903-918
The iceberg cubing problem is to compute the multidimensional group-by partitions that satisfy given aggregation constraints. Pruning unproductive computation for iceberg cubing when nonantimonotone constraints are present is a great challenge because the aggregate functions do not increase or decrease monotonically along the subset relationship between partitions. In this paper, we propose a novel bound prune cubing (BP-Cubing) approach for iceberg cubing with nonantimonotone aggregation constraints. Given a cube over n dimensions, an aggregate for any group-by partition can be computed from aggregates for the most specific n--dimensional partitions (MSPs). The largest and smallest aggregate values computed this way become the bounds for all partitions in the cube. We provide efficient methods to compute tight bounds for base aggregate functions and, more interestingly, arithmetic expressions thereof, from bounds of aggregates over the MSPs. Our methods produce tighter bounds than those obtained by previous approaches. We present iceberg cubing algorithms that combine bounding with efficient aggregation strategies. Our experiments on real-world and artificial benchmark data sets demonstrate that BP-Cubing algorithms achieve more effective pruning and are several times faster than state-of-the-art iceberg cubing algorithms and that BP-Cubing achieves the best performance with the top-down cubing approach. 相似文献
16.
数据仓库系统中一种改进的维层次聚集Cube存储结构 总被引:3,自引:0,他引:3
提出利用Cube中的维层次(dimension hierarchy)聚集技术来创建高性能的维层次聚集Cube(dimension hierarchy aggregate cube,DHAC).充分利用DHAC已保存的维层次信息,对Cube中多维数据的查询和更新效率进行了优化,并且支持Cube的上探、下钻等语义操作.在DHAC中进行数据插入和删除等数据更新时,由下向上用更新前后的差值对受到更新结点影响的所有祖先结点进行增量更新.实现了在插入新维或维层次时不需要重新构建聚集Cube就可以实现Cube的模式更新.对维层次聚集Cube与传统Cube进行了算法性能分析和比较,理论分析和实验结果都表明,所提出的DHAC性能最佳. 相似文献
17.
Cui-PingLi Kum-HoeTung ShanWang 《计算机科学技术学报》2004,19(3):0-0
Data cube computation is a well-known expensive operation and has been studied extensively. It is often not feasible to compute a complete data cube due to the huge storage requirement. Recently proposed quotient cube addressed this fundamental issue through a partitioning method that groups cube cells into equivalent partitions. The effectiveness and efficiency of the quotient cube for cube compression and computation have been proved. However, as changes are made to the data sources, to maintain such a quotient cube is non-trivial since the equivalent classes in it must be split or merged. In this paper, incremental algorithms are designed to update existing quotient cube efficiently based on Galois lattice. Performance study shows that these algorithms are efficient and scalable for large databases. 相似文献
18.
为了提高冰山立方体的计算性能,本文提出一种基于位图索引改进的DPBUC_BI (Dynamic Pruning based BUC_BI)算法。该算法利用位图索引按列组织的特性重新定义BUC(Bottom-Up Computation)算法的分组操作,加快了数据的加载和查询;通过使用逻辑位运算实现聚合计算,提高了算法的计算性能。针对部分数据聚集现象增加动态剪枝策略,在保证算法正确性的情况下进一步提高了冰山立方体计算性能。最后将DPBUC_BI算法应用于机票结算数据的冰山立方体计算中,实验结果表明:该算法可以很好地提升计算性能,相对于经典BUC算法在时间性能上有一定的提高。 相似文献