Found 19 similar documents (search time: 109 ms)
1.
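Skyband queries are an important class of queries in decision support. For a database system to support Skyband queries efficiently, it must solve the Skyband cardinality estimation problem, that is, estimating the number of elements in a Skyband query result, because this estimate is needed to extend the cost model of the query optimizer so that Skyband queries can be optimized. Based on a generalized form of the inclusion-exclusion principle, the paper analyzes Skyband cardinality theoretically and presents an estimation algorithm with very low time and space cost. Experimental results show that the method estimates Skyband cardinality accurately.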
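To make concrete what "Skyband cardinality" counts, here is a minimal brute-force sketch. It is not the estimation algorithm from the abstract above (which relies on a generalized inclusion-exclusion principle and avoids computing the result); it only computes the exact quantity such an estimator approximates. It uses the standard definition that the k-skyband contains every point dominated by fewer than k other points. The function names and the "smaller is better" convention are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def k_skyband(points: List[Point], k: int) -> List[Point]:
    """Return the points dominated by fewer than k other points.
    Brute force, O(n^2) comparisons; for illustration only."""
    result = []
    for p in points:
        dominator_count = sum(1 for q in points if dominates(q, p))
        if dominator_count < k:
            result.append(p)
    return result


# Hypothetical 2-D data set.
points = [(1, 5), (2, 2), (3, 3), (4, 1), (5, 5)]
skyline = k_skyband(points, 1)      # the 1-skyband is the ordinary skyline
two_skyband = k_skyband(points, 2)  # also admits points with one dominator
cardinality = len(two_skyband)      # the quantity an optimizer would estimate
```

A query optimizer cannot afford this quadratic computation at planning time, which is why a cheap estimate of `cardinality` is needed.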
3.
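Research on an R-Tree-Based Cost Model for Spatial Queries  Total citations: 5, self-citations: 0, citations by others: 5
This paper examines R-Tree-based cost models for spatial queries. It analyzes the rectangle density model proposed by Y. Theodoridis et al. [2,3], uses those results to propose a probabilistic model for cost estimation, and verifies experimentally that the probabilistic model is significantly more accurate than the rectangle density model.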
4.
5.
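The goal of a parallel query optimizer is to shrink the huge plan search space and obtain an optimized query plan. To this end, the query optimizer of the parallel real-time database PRTD-BASE targets a shared-nothing (SN) architecture, takes communication overhead fully into account, and uses a two-phase optimization method: it first performs cost-based sequential optimization of the query tree according to a cost estimation model, and then parallelizes the sequentially optimized plan using heuristic rules. This exploits the parallelism of multiple processors and achieves fast query response times.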
6.
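Spatial query optimization is a key breakthrough point for spatial applications. Because existing relational optimization does not suit queries over spatial data, a spatial system must have its own cost model and optimizer. To this end, the paper presents FQPro, a system design for spatial query optimization. After a general discussion of the stages of spatial query optimization, it focuses on the cost model, predicate cost computation, and the cost computation of optimization plans, with a detailed introduction to the R-tree-based low-cost model. In addition, following relational optimizers, FQPro defines a set of predicate cost formulas and predicate selectivity formulas, and on that basis defines formulas and algorithms for computing the cost of query plans. The paper concludes that cost models and an extensible architecture are the direction of development for spatial query optimization systems.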
7.
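B+-Tree: A High-Dimensional Metric Space Index Based on Cluster Decomposition  Total citations: 2, self-citations: 0, citations by others: 2
To improve index performance, high-dimensional metric space indexes usually use clustering techniques such as K-Means to capture the distribution of the data. However, existing work sets the clustering parameters empirically and lacks a theoretical analysis of the relationship between clustering and query performance. The paper proposes a high-dimensional metric space B+-tree index based on cluster decomposition, which partitions the data more finely through cluster decomposition to reduce the data accessed by queries. It discusses the relationship between clustering and query cost and, through a query cost model, gives a theoretical method for computing quantities such as the number of cluster decompositions that minimizes query cost. Experiments show that the proposed index clearly outperforms metric space indexes such as iDistance, and that the estimated optimal number of cluster decompositions is close to the clustering parameter required by the actual optimal query.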
8.
10.
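Data management in fields such as spatial information processing and geographic information systems involves processing massive, high-dimensional spatial data objects. Traditional index structures suffer from excessive memory use and excessive I/O when handling such data. To address these problems, this paper improves the cost model for selection queries and gives a PQR-tree-based query and cost model to improve the performance of spatial data queries. It proposes a PQR-tree-based three-phase parallel query method that optimizes the task creation, task allocation, and task execution phases separately. It also proposes effective algorithms, applied in the task creation and task allocation phases, for the filter and refinement steps of spatial queries. Tests show that the algorithms reduce the time and space cost of spatial data processing on data sets with various distributions, and that the cost model under the parallel mechanism is also fairly accurate for prediction and evaluation.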
11.
12.
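In a mobile computing environment, choosing a suitable join query processing mode on the basis of accurate operation cost estimates can reduce the volume of data transferred and the energy consumed by mobile devices. The paper examines a new asymmetric characteristic of mobile device energy consumption in this environment and proposes an operation cost estimation method. It estimates the cost and compares the performance of join query processing modes in terms of both data transfer volume and energy consumption, and proposes four practical criteria to guide the choice of join query processing mode. Experimental results confirm the correctness of the estimation method and the criteria, which apply more broadly than existing estimation models and conclusions of the same kind.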
13.
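Database Query Cost Prediction Based on Recurrent Neural Networks  Total citations: 1, self-citations: 0, citations by others: 1
In database workload management and performance tuning, the cost prediction model is a key technique for improving efficiency. First, because of the complexity of database systems and contention for computing resources, it is hard to estimate the cost of individual operations precisely. Second, because of the structural complexity of query plans, existing work mostly uses coarse query-level information and rarely exploits operator-level information in the query plan to build cost models. In addition, most existing work does not actually predict query execution time but instead predicts a cost similar to that produced by the cost model in a query optimizer. To reduce the complexity of workload management, the paper proposes a fine-grained model based on recurrent neural networks to predict query cost, using the operator behavior in the query plan and the actual running times as the source for feature extraction. In particular, given the structural complexity of query plans, the paper adopts a special kind of recurrent neural network, long short-term memory (LSTM). Given a specific query plan, the model produces a predicted execution time before the plan is actually executed. This is more informative than the cost estimates (in arbitrary units) produced by the query optimizers of existing databases, and it also improves on query progress indicators, which can only make predictions after execution has started. The proposed approach to predicting query execution time can be used to address key problems in database workload management. In experiments, the model's accuracy exceeded 71%, which demonstrates the feasibility of the method to some extent.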
14.
With the rapid development of the Internet, the WWW (World Wide Web), mobile computing, and GPS (Global Positioning System) services, location-based services such as Web GIS (Geographical Information System) portals are becoming more and more popular. Spatial keyword queries over GIS spatial data are receiving more attention than ever from both the academic and industrial communities. In general, a spatial keyword query contains spatial location information and keywords, and locates a set of spatial objects that satisfy the location condition and the keyword query semantics. Researchers have proposed many solutions to various spatial keyword queries, such as the top-K keyword query, reverse kNN keyword query, moving object keyword query, and collective keyword query. In this paper, we propose a density-based spatial keyword query, which locates a set of spatial objects that not only satisfy the query's textual and distance conditions but also lie in a high-density area. We use the collective keyword query semantics to find, in a dense area, a group of spatial objects whose keywords collectively match the query keywords. To process the density-based spatial keyword query efficiently, we use an IR-tree index as the base data structure to index spatial objects and their text contents, and define a cost function over the IR-tree index nodes to approximate the density of areas. We design a heuristic algorithm that efficiently prunes regions according to both distance and region density while processing a query over the IR-tree index. Experimental results on datasets show that our method achieves the desired results with high performance.
15.
16.
17.
Adaptive processing of historical spatial range queries in peer-to-peer sensor networks  Total citations: 1, self-citations: 0, citations by others: 1
Alexandru Coman, Joerg Sander, Mario A. Nascimento. Distributed and Parallel Databases, 2007, 22(2-3): 133-163
We investigate the problem of processing historical queries on a sensor network. Since data is considered to have been already
collected at the sensor nodes, the main issue is exploring the spatial component of the query in order to minimize its cost
represented by the energy consumption. We assume queries can be issued at any network node, i.e., there is no central base
station and all nodes have only local knowledge of the network. On the one hand, a globally optimum query processing plan
is desirable but its construction is not possible due to the lack of global knowledge of the network. On the other hand, while
a simple network flooding is feasible, it is not a practical choice from a cost perspective. To address this problem we propose
a two-phase query processing strategy, where in the first phase a path from the query originator to the query region is found
and in the second phase the query is processed within the query region itself. This strategy is supported by analytical models
that are used to dynamically select the best processing strategy depending on the query specifics. Our extensive analytical
and experimental results show that our analytical models are accurate and that the two-phase strategy is better suited for
small to medium sized queries, being up to 10 times more cost effective than a typical network flooding. In addition, the
dynamic selection of a query processing technique proved itself capable of always delivering at least as good performance
as the most energy efficient strategy for all query sizes.
Research supported in part by NSERC Canada.
18.
To meet users' growing need to access pre-existing heterogeneous databases, multidatabase systems (MDBSs), which integrate multiple databases, have recently attracted many researchers. A key feature of an MDBS is local autonomy. For a query that retrieves data from multiple databases, global query optimization should be performed to achieve good system performance. Global query optimization in an MDBS raises a number of new challenges. A major one is that some local optimization information, such as local cost parameters, may not be available at the global level because of local autonomy. This makes it difficult to find a good decomposition of a global query during query optimization. To tackle this challenge, a new query sampling method is proposed in this paper. The idea is to group component queries into homogeneous classes, draw a sample of queries from each class, and use the observed costs of the sample queries to derive a cost formula for each class by multiple regression. The derived formulas can be used to estimate the cost of a query during query optimization. The relevant issues, such as query classification rules, sampling procedures, and cost model development and validation, are explored in this paper. To verify the feasibility of the method, experiments were conducted on three commercial database management systems supported in an MDBS. Experimental results demonstrate that the proposed method is quite promising for estimating local cost parameters in an MDBS.
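The regression step described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it fits one linear cost formula for a single query class by ordinary least squares, using only the standard library. The explanatory variables (operand cardinality and result cardinality), the sample values, and the function names are all assumptions chosen for the example; the abstract does not say which variables the authors use.

```python
from typing import List, Sequence, Tuple


def fit_cost_formula(samples: List[Tuple[Sequence[float], float]]) -> List[float]:
    """Fit cost = b0 + b1*x1 + ... + bn*xn by ordinary least squares.

    samples holds (features, observed_cost) pairs for ONE query class.
    Solves the normal equations (X^T X) b = X^T y by Gaussian elimination;
    assumes the features are not collinear."""
    rows = [[1.0] + [float(x) for x in feats] for feats, _ in samples]
    ys = [float(cost) for _, cost in samples]
    n = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    v = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        v[col], v[pivot] = v[pivot], v[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            v[r] -= factor * v[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (v[i] - tail) / a[i][i]
    return coeffs


def estimate_cost(coeffs: Sequence[float], features: Sequence[float]) -> float:
    """Apply the derived formula to a new query of the same class."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], features))


# Hypothetical observations: (operand cardinality, result cardinality) -> cost.
observed = [((10, 1), 10.0), ((20, 1), 15.0), ((10, 2), 13.0),
            ((30, 4), 29.0), ((40, 3), 31.0)]
coeffs = fit_cost_formula(observed)
predicted = estimate_cost(coeffs, (50, 5))
```

In the method the abstract describes, a fit like this would be repeated once per homogeneous query class, so that the global optimizer holds one derived formula per class for each local database.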
19.
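A Study of Histogram Methods for Estimating Query Result Size  Total citations: 11, self-citations: 0, citations by others: 11
Histograms are the most commonly used method for estimating query result size in many commercial database systems. From a practical standpoint, previously proposed histogram methods have limitations, chiefly that they cannot guarantee the accuracy of their estimates. This paper proposes two new histogram methods that are easy to use and guarantee that all estimates fall within a given error bound. It also examines how different data distributions affect histograms, and characterizes data distributions with a few important parameters to help generate more effective histograms.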
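For readers unfamiliar with histogram-based estimation, the sketch below shows the basic idea with a plain equi-width histogram. It is not either of the two methods proposed in the abstract above and offers no error guarantee; it illustrates the baseline technique whose accuracy limits the abstract refers to. The function names, the bucket count, and the assumption that all values lie in `[lo, hi]` are illustrative choices.

```python
from typing import List, Sequence, Tuple


def build_equi_width_histogram(values: Sequence[float], lo: float, hi: float,
                               buckets: int) -> Tuple[List[int], float]:
    """Count values per bucket; every bucket spans the same width.
    Assumes lo <= v <= hi for every value."""
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        # Clamp so that v == hi falls into the last bucket.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return counts, width


def estimate_range_count(counts: Sequence[int], lo: float, width: float,
                         a: float, b: float) -> float:
    """Estimate how many values fall in [a, b), assuming values are
    spread uniformly inside each bucket."""
    total = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        total += count * overlap / width
    return total


# Hypothetical column holding the integers 0..99.
column = list(range(100))
counts, width = build_equi_width_histogram(column, 0, 100, 10)
estimate = estimate_range_count(counts, 0, width, 25, 45)
actual = sum(1 for v in column if 25 <= v < 45)
```

On this uniform column the estimate matches the true count; on skewed data the within-bucket uniformity assumption fails, which is the kind of unbounded error that methods with guaranteed error bounds aim to remove.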