首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
查询是数据库系统的主要负载,其效率决定了数据库性能的好坏。一个查询存在多种执行计划,当前,查询优化器只能按照数据库系统的配置参数,静态地为查询选择一个较优的执行计划。并行查询间存在复杂多变的资源争用,很难通过配置参数准确反映,而且同一执行计划在不同情景下的效率并不一致。并行查询下执行计划的选择需考虑查询间的相互影响——查询交互。基于此,提出了一种在并行查询下度量查询受查询交互影响大小的标准QIs。针对并行查询下查询执行计划的选择,还提出了一种动态地为查询选择执行计划的方法TRating,该方法通过比较查询组合中按不同执行计划执行的查询受查询交互影响的大小,选择受查询交互影响较小的执行计划作为该查询的较优执行计划。实验结果表明,TRating方法为查询选择较优执行计划的准确率达61%,相比查询优化器提高了25%;而且在为查询选择次优执行计划时,其准确率也高达69%。  相似文献   

2.
Some significant progress related to multidimensional data analysis has been achieved in the past few years, including the design of fast algorithms for computing datacubes, selecting some precomputed group-bys to materialize, and designing efficient storage structures for multidimensional data. However, little work has been carried out on multidimensional query optimization issues. Particularly the response time (or evaluation cost) for answering several related dimensional queries simultaneously is crucial to the OLAP applications. Recently, Zhao et al. first exploited this problem by presenting three heuristic algorithms. In this paper we first consider in detail two cases of the problem in which all the queries are either hash-based star joins or index-based star joins only. In the case of the hash-based star join, we devise a polynomial approximation algorithm which delivers a plan whose evaluation cost is $ O(n^{\epsilon }$) times the optimal, where n is the number of queries and is a fixed constant with . We also present an exponential algorithm which delivers a plan with the optimal evaluation cost. In the case of the index-based star join, we present a heuristic algorithm which delivers a plan whose evaluation cost is n times the optimal, and an exponential algorithm which delivers a plan with the optimal evaluation cost. We then consider a general case in which both hash-based star-join and index-based star-join queries are included. For this case, we give a possible improvement on the work of Zhao et al., based on an analysis of their solutions. We also develop another heuristic and an exact algorithm for the problem. We finally conduct a performance study by implementing our algorithms. The experimental results demonstrate that the solutions delivered for the restricted cases are always within two times of the optimal, which confirms our theoretical upper bounds. Actually these experiments produce much better results than our theoretical estimates. To the best of our knowledge, this is the only development of polynomial algorithms for the first two cases which are able to deliver plans with deterministic performance guarantees in terms of the qualities of the plans generated. The previous approaches including that of [ZDNS98] may generate a feasible plan for the problem in these two cases, but they do not provide any performance guarantee, i.e., the plans generated by their algorithms can be arbitrarily far from the optimal one. Received: July 21, 1998 / Accepted: August 26, 1999  相似文献   

3.
列的连接策略优化是列存储数据查询中的重要问题。现有的列存储系统中,列的连接存在策略单一,缺少优化处理,无法满足复杂查询等缺陷。针对这些问题,提出一种连接策略选择方法。该方法首先定义简单规则过滤代价过大的查询计划,生成候选查询计划树。进而根据动态Huffman树原理提出动态优化树算法,对候选查询计划树中的查询执行顺序进行改进。根据列存储数据的特点,候选计划中每个连接节点的执行策略被归纳为两种:串行连接和并行连接。在此基础上构建代价估计模型,集中针对这两种连接策略进行代价估计和策略选择,从而以较小的时间复杂度获得优化的查询执行策略。  相似文献   

4.
针对无线传感器网络中多个Top-k查询问题,提出了一种Top-k多查询处理的算法,对接收到的多个Top-k查询请求进行预处理,预处理依据是约束条件,得出两类不同的查询集合:单约束条件的多查询和多约束条件的多查询。针对单约束条件的多查询提出了ETOP算法,该算法首先对排在时间序列最前面的Top-k查询请求进行基于网内处理,然后把查询结果存入基站缓存,并把结果的最小值设定为阈值传输到各个节点,再根据后续查询请求的查询范围进行相应的查询,从而快速地获得Top-k查询结果。实验表明:Top-k多查询方法在能够很好地实现查询的同时,减少了无线传感器网络中的传输消耗和能量消耗。  相似文献   

5.
基于最小生成树的数据流窗口连接优化算法   总被引:1,自引:1,他引:0  
与传统关系数据库不同,数据流管理系统主要处理并发的连续查询.由于查询可能随时增删,所以其主要关注适合查询增删的并发连续查询优化,而不是单条查询优化.提出适合频繁增删查询环境下的数据流窗口连接优化算法.对于新注册的查询以类似最小生成树算法写出数据流的探测序列,然后在不更改其他查询探测序列顺序的情况下尽量合并,减少重复计算.注册或删除查询并不影响其他的查询计划,不需要执行繁琐的查询计划迁移.理论分析和实验证明,该算法简单,优化性能在可接受的范围内,尤其适合查询更新频率较高的系统.  相似文献   

6.
查询是数据库系统的主要负载,查询的执行效率直接影响着系统的性能。目前,由于查询交互(query interaction,QI)复杂多变,查询优化器不能准确地评估查询进入系统产生的影响,很难为并行查询选择较优执行计划。将查询的平均响应时间、平均执行时间、平均I/O时间和平均缓冲区命中率作为QI的特征参数,表示QI;提出多维度查询交互度量(multi-dimensional measurement of query interaction,MMQI)模型和执行计划选择(execution plan selection,EPS)模型,采用深度神经网络,在度量QI的基础上,把QI作为主要因素,为并行查询选择较优执行计划。考虑到查询执行计划是由一系列关系运算组成的,以及QI具有时域特性,MMQI采用双向长短期记忆神经网络(bidirectional long-short term memory,Bi-LSTM)度量QI,从查询执行计划提取特征作为输入,将QI特征参数的改变作为输出,预测查询采用不同执行计划进入系统后QI特征参数的改变;EPS把预测到的查询特征参数的改变作为查询交互特征(feature of query interaction,FQI),与查询候选执行计划特征(features of candidate plan,FCP)融合,作为另一个Bi-LSTM的输入,为查询动态地选择较优执行计划。在PostgreSQL上的实验表明,MMQI-EPS比查询优化器选择较优执行计划的平均准确率提高38.6个百分点。  相似文献   

7.
Evaluating refined queries in top-k retrieval systems   总被引:2,自引:0,他引:2  
In many applications, users specify target values for certain attributes/features without requiring exact matches to these values in return. Instead, the result is typically a ranked list of "top k" objects that best match the specified feature values. User subjectivity is an important aspect of such queries, i.e., which objects are relevant to the user and which are not depends on the perception of the user. Due to the subjective nature of top-k queries, the answers returned by the system to an user query often do not satisfy the users need right away, either because the weights and the distance functions associated with the features do not accurately capture the users perception or because the specified target values do not fully capture her information need or both. In such cases, the user would like to refine the query and resubmit it in order to get back a better set of answers. While there has been a lot of research on query refinement models, there is no work that we are aware of on supporting refinement of top-k queries efficiently in a database system. Done naively, each "refined" query can be treated as a "starting" query and evaluated from scratch. We explore alternative approaches that significantly improve the cost of evaluating refined queries by exploiting the observation that the refined queries are not modified drastically from one iteration to another. Our experiments over a real-life multimedia data set show that the proposed techniques save more than 80 percent of the execution cost of refined queries over the naive approach and is more than an order of magnitude faster than a simple sequential scan.  相似文献   

8.
基于编辑距离的字符串近似查询算法一般是先给定阈值k,然后计算那些与查询串的编辑距离小于或等于k的结果。但是对于近似子串查询,结果中有很多是交叠的,并且是无意义的,于是提出了一种局部最优化匹配的概念,只计算那些符合阈值条件,并且是局部最优的结果,这样不仅避免了结果的交叠,而且极大节省了时间开销。给出了支持局部最优化匹配的近似子串查询的定义,相应提出了一种基于gram索引的局部最优化近似子串查询算法,分析了子串近似匹配过程中的规律,研究了基于局部最优化匹配的边界限定和过滤策略,给出了一种过滤优化的局部最优化近似子串查询算法,提高了查询效率。  相似文献   

9.
针对大数据环境下,非交互式差分隐私无法准确提供及处理大量范围查询的问题,提出一种基于最大信息系数与机器学习的隐私保护数据查询模型。对原始数据集采用最大信息系数选出相关性低的数据作为训练样本集,然后结合差分隐私的并行组合性质对其进行分块划分得到隐私保护的训练样本集,最后应用线性回归算法训练样本集得到差分隐私保护预测模型,该模型隐私保护的方式回答当前提交和大量未知的查询。实验结果表明,所提出的模型在提升发布数据效用性的同时,也提高了查询处理的效率。  相似文献   

10.
查询优化是提高数据库性能的关键技术,针对数据库查询优化效率低的难题,提出一种多子群萤火虫算法的数据库查询优化方法(MG-FA)。首先将数据库查询计划左深树看作一个萤火虫,然后将萤火虫群分为多个子群,各子群最优萤火虫通过信息交流找到数据库查询最优计划,最后进行数据库查询优化实例分析。结果表明,MG-FA是解决数据库查询优化的有效途径,能够获得理想的数据库查询计划,具有实际意义。  相似文献   

11.
Our study introduces a novel distributed query plan refinement phase in an enhanced architecture of distributed query processing engine(DQPE) . Query plan refinement generates potentially efficient distributed query plan by reusable aggregate query shipping(RAQS) approach. The approach improves response time at the cost of pre-processing time. If the overheads could not be compensated by query results reusage,RAQS is no more favorable. Therefore a global cost estimation model is employed to get proper operators:RR Agg,R Agg,or R Scan. For the purpose of reusing results of queries with aggregate function in distributed query processing,a multi-level hybrid view caching(HVC) scheme is introduced. The scheme retains the advantages of partial match and aggregate query results caching. By our solution,evaluations with distributed TPC-H queries show significant improvement on average response time.  相似文献   

12.
This study proposes a method of in-network aggregate query processing to reduce the number of messages incurred in a wireless sensor network. When aggregate queries are issued to the resource-constrained wireless sensor network, it is important to efficiently perform these queries. Given a set of multiple aggregate queries, the proposed approach shares intermediate results among queries to reduce the number of messages. When the sink receives multiple queries, it should be propagated these queries to a wireless sensor network via existing routing protocols. The sink could obtain the corresponding topology of queries and views each query as a query tree. With a set of query trees collected at the sink, it is necessary to determine a set of backbones that share intermediate results with other query trees (called non-backbones). First, it is necessary to formulate the objective cost function for backbones and non-backbones. Using this objective cost function, it is possible to derive a reduction graph that reveals possible cases of sharing intermediate results among query trees. Using the reduction graph, this study first proposes a heuristic algorithm BM (standing for Backbone Mapping). This study also develops algorithm OOB (standing for Obtaining Optimal Backbones) that exploits a branch-and-bound strategy to obtain the optimal solution efficiently. This study tests the performance of these algorithms on both synthesis and real datasets. Experimental results show that by sharing the intermediate results, the BM and OOB algorithms significantly reduce the total number of messages incurred by multiple aggregate queries, thereby extending the lifetime of sensor networks.  相似文献   

13.
研究了采用网络距离的道路网上移动对象连续多范围查询处理技术。设计了道路网、移动对象和查询数据在内存中存储的数据模型。基于该数据模型提出了两种道路网上的移动对象连续多范围查询处理算法。其中,增量式范围查询算法(incremental range query algorithm,IRQA)通过使用扩张树和影响列表结构减少查询的重新计算;组范围查询算法(group range query algorithm,GRQA)利用同一路径上多查询的结果具有相关性这一特点减少查询的重新计算。实验结果表明GRQA算法在查询分布比较集中时性能较优,IRQA算法在查询均匀分布时性能较优,此外,两种算法均优于重新计算所有查询结果的原始算法。  相似文献   

14.
New applications of information systems need to integrate a large number of heterogeneous databases over computer networks. Answering a query in these applications usually involves selecting relevant information sources and generating a query plan to combine the data automatically. As significant progress has been made in source selection and plan generation, the critical issue has been shifting to query optimization. This paper presents a semantic query optimization (SQO) approach to optimizing query plans of heterogeneous multidatabase systems. This approach provides global optimization for query plans as well as local optimization for subqueries that retrieve data from individual database sources. An important feature of our local optimization algorithm is that we prove necessary and sufficient conditions to eliminate an unnecessary join in a conjunctive query of arbitrary join topology. This feature allows our optimizer to utilize more expressive relational rules to provide a wider range of possible optimizations than previous work in SQO. The local optimization algorithm also features a new data structure called AND-OR implication graphs to facilitate the search for optimal queries. These features allow the global optimization to effectively use semantic knowledge to reduce the data transmission cost. We have implemented this approach in the PESTO (Plan Enhancement by SemanTic Optimization) query plan optimizer as a part of the SIMS information mediator. Experimental results demonstrate that PESTO can provide significant savings in query execution cost over query plan execution without optimization  相似文献   

15.
基于粒子群算法的数据库查询优化   总被引:1,自引:0,他引:1  
研究粒子群算法在数据库查询优化中的应用问题。为了解决大型数据库信息检索困难、查询效率低的问题,提出了一种基于粒子群算法优化数据库查询技术方案。算法提出了一种数据库查询执行计划代价模型,主要包括了查询多链接次序以及副本的选择问题,准确定义了数据库查询执行代价,采用提出的粒子群算法来优化并求解该执行代价问题,从而使得分组数目更少、数据定位更精确。实例验证结果表明,通过属性表现和违规行为任何教师都可以被准确定位,减少了分组,为数据库查询提供了优化。  相似文献   

16.
We consider a workload of aggregate queries and investigate the problem of selecting materialized views that (1) provide equivalent rewritings for all the queries, and (2) are optimal, in that the cost of evaluating the query workload is minimized. We consider conjunctive views and rewritings, with or without aggregation; in each rewriting, only one view contributes to computing the aggregated query output. We look at query rewriting using existing views and at view selection. In the query-rewriting problem, we give sufficient and necessary conditions for a rewriting to exist. For view selection, we prove complexity results. Finally, we give algorithms for obtaining rewritings and selecting views.  相似文献   

17.
基于Greenplum数据库的查询优化   总被引:1,自引:0,他引:1  
邹承明  谢义  吴佩 《计算机应用》2018,38(2):478-482
针对分布式数据库查询效率随着数据规模的增大而降低的问题,以Greenplum分布式数据库为研究对象,从优化查询路径的角度提出一个基于代价的最优查询计划生成方法。首先,该方法设计一种有效的代价模型来估算查询代价;然后,采用并行最大最小蚁群算法来搜索具有最小查询代价的连接顺序,即最优连接顺序;最后,根据Greenplum数据库对查询计划中不同操作的默认最优选择得到最优查询计划。采用该方法在自主生成的数据集与事务处理性能理事会测试基准(TPC-H)的标准数据集上进行了多组实验。实验结果表明,所提出的优化方法能有效地搜索出最优解,获得最优的查询计划,从而提升Greenplum数据库的查询效率。  相似文献   

18.
19.
20.
Web数据集成系统基于QC模型的物化视图选择   总被引:2,自引:0,他引:2  
在Web数据集成系统中,物化视图能够有效地减少网络传输代价,提高系统的查询效率.如何选择查询进行物化,使得选中的查询满足集成层的空间限制,同时获取最大物化收益,成为集成系统中一个迫切需要解决的问题.传统方法没有考虑到海量XML查询之间的包含关系,其选择的物化视图中可能包含冗余的信息.针对上述问题,提出了①Web数据集成系统中海量查询集合的QC(query containment)模型,该模型能够捕捉查询之间最常见的包含关系;②基于QC模型的物化视图选择算法,算法考虑了物化视图选择相关的主要因素,包括查询提交的频率、空间代价、查询重写能力和查询结果的完备性,提出了查询位图的物化视图组织方式,从而获取更加合理的物化视图选择方案.实验结果证明了该方法的有效性.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号