1.
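Skyband queries are an important class of queries in decision support. For a database system to support them efficiently, it must solve the Skyband cardinality estimation problem, that is, estimating the number of Skyband elements in the result of a Skyband query; this estimate is needed to extend the cost model of the query optimizer so that Skyband queries can be optimized. The paper analyzes Skyband cardinality theoretically using a generalized form of the inclusion-exclusion principle and gives an estimation algorithm with very low time and space cost. Experimental results show that the method estimates Skyband cardinality accurately.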
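To make the estimated quantity concrete, here is a minimal Python sketch that computes the exact cardinality of a k-skyband by brute force. It assumes the usual definition (a point belongs to the k-skyband if fewer than k other points dominate it) and that smaller values are better in every dimension; the abstract states neither. The names `dominates` and `k_skyband` are hypothetical. This is not the paper's inclusion-exclusion estimator, only the ground truth such an estimator would approximate.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def k_skyband(points: List[Point], k: int) -> List[Point]:
    """Return the points dominated by fewer than k other points (O(n^2) scan)."""
    result = []
    for p in points:
        dominators = sum(1 for q in points if dominates(q, p))
        if dominators < k:
            result.append(p)
    return result


points = [(1, 4), (2, 2), (3, 3), (4, 1), (4, 4)]
skyline = k_skyband(points, 1)   # k = 1 is the ordinary skyline
band2 = k_skyband(points, 2)     # also admits points with exactly one dominator
print(len(skyline), len(band2))  # exact cardinalities for this toy data set
```

The quadratic scan is only practical for small inputs, which is why an optimizer would want a cheap estimate of `len(k_skyband(...))` rather than the exact value.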
3.
4.
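Cost Estimation Algorithm for Value-Matching Queries over XML Data  Total citations: 6, self-citations: 0, citations by others: 6
Estimating the cost of value-matching predicates in XML queries is a typical multi-element predicate cost estimation problem. It differs from multi-element predicates in traditional relational databases because the distribution of a value in XML data is correlated not only with the distributions of other values but also with the structural information of the document, and complex XML structures can give rise to high-dimensional correlations among elements. To address this, the paper proposes a wavelet-based multidimensional histogram algorithm for query cost estimation over XML data, together with a method for identifying the mutually dependent tuples keyed on a given value element, a method for rewriting value-matching predicates as multi-element predicates, and a method for converting structural information into values. Experiments show that the proposed methods yield fairly accurate query cost estimates.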
5.
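Cardinality estimation and cost estimation guide the choice of execution plan, so their accuracy is critical to the query optimizer. Traditional database techniques for cost and cardinality estimation, however, cannot provide accurate estimates because they ignore correlations across multiple tables. Applying artificial intelligence to databases (artificial intelligence for databases, AI4DB) has recently attracted wide attention, and results show that learning-based estimators outperform traditional ones. Existing learning-based methods still have shortcomings: most estimate only cardinality and ignore cost; they handle only simple queries and cannot cope with complex ones such as multi-table or nested queries; and they have difficulty with string-typed values. To address these problems, the paper proposes a cardinality and cost estimation method based on a tree-gated recurrent unit (Tree-GRU) that estimates cardinality and cost simultaneously. It uses effective feature extraction and encoding that take both the query and the execution plan into account, and embeds the features into the Tree-GRU. For string-typed values, a neural network automatically learns the relationship between substrings and the whole string and produces string embeddings, making sparse strings easier for the estimator to handle. On data sets such as JOB and Synthetic, ...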
6.
7.
8.
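A Dynamic Load Balancing Algorithm for P2P Systems  Total citations: 1, self-citations: 0, citations by others: 1
In real P2P networks, nodes are heterogeneous in computing power, bandwidth, and other respects, so load imbalance is pronounced. The paper proposes a dynamic balancing algorithm based on a data replication/migration strategy. Using each node's capacity, its current load, and an estimate of the load transfer cost, the algorithm searches the whole system for a set of lightly loaded nodes with low transfer cost, then randomly picks a suitable node from that set as the target for load migration or data replication. Experimental results show that the algorithm balances the load distribution effectively and lowers the load migration rate.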
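The selection step described in this abstract can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the paper's algorithm: the `Node` fields, the utilization measure (load divided by capacity), and the two thresholds `max_utilization` and `max_cost` are all hypothetical, since the abstract does not say how "lightly loaded" or "low transfer cost" are defined.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    capacity: float       # assumed: abstract capacity units
    load: float           # assumed: current load in the same units
    transfer_cost: float  # assumed: estimated cost of moving load to this node

    @property
    def utilization(self) -> float:
        return self.load / self.capacity


def pick_target(nodes: List[Node], max_utilization: float,
                max_cost: float, rng: random.Random) -> Optional[Node]:
    """Filter to lightly loaded, cheap-to-reach nodes, then pick one at random."""
    candidates = [
        n for n in nodes
        if n.utilization <= max_utilization and n.transfer_cost <= max_cost
    ]
    # Random choice among candidates avoids sending all excess load to one node.
    return rng.choice(candidates) if candidates else None


nodes = [
    Node("a", capacity=10, load=9, transfer_cost=1.0),  # too busy
    Node("b", capacity=10, load=2, transfer_cost=8.0),  # too expensive
    Node("c", capacity=20, load=4, transfer_cost=2.0),  # eligible
    Node("d", capacity=5, load=1, transfer_cost=1.5),   # eligible
]
target = pick_target(nodes, max_utilization=0.5, max_cost=3.0, rng=random.Random(0))
print(target.name if target else "no suitable node")
```

Normalizing load by capacity is one simple way to account for the node heterogeneity the abstract mentions; a node with twice the capacity can absorb twice the load before it counts as busy.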
9.
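In peer-to-peer computing applications, replication is an effective way to raise the query hit rate, speed up queries, and maintain load balance, but it also raises storage and traffic costs. The paper studies how topology optimization can reduce redundant traffic and redundant replicas during replication in a structured P2P overlay. It first selects dominating-set nodes in the network as super nodes and designs a hierarchical P2P overlay that reflects node proximity, then develops a matching replication technique based on multiple hash functions to achieve low-cost replica lookup. The method spreads replicas across the network effectively, raises the query hit rate, and reduces redundant messages and required storage. A theoretical analysis of the performance metrics is given, and simulation confirms the advantages of the method.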
10.
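Load imbalance ranks first among the factors that affect the performance of large-scale MapReduce clusters, and Hive join queries trigger it very easily. The general remedy is to design a key partitioning algorithm that achieves load balance based on the frequency distribution of the keys in the intermediate key-value pairs. Existing work estimates the key frequency distribution by monitoring and sampling the map output, which incurs high communication overhead and significantly delays the start of the shuffle. For Hive join queries, the paper proposes a key frequency distribution estimation method based on ORC metadata and a corresponding load-balancing key partitioning method. The approach requires little computation, has low communication overhead, and does not affect the existing shuffle mechanism. Benchmarks demonstrate a large gain in the efficiency of key frequency distribution estimation and show that the corresponding key partitioning method improves Hive join query performance.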
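As an illustration of how a key frequency distribution can drive a balanced partition, the Python sketch below assigns keys to reducers greedily: keys are processed from most to least frequent, and each goes to the reducer with the smallest load so far. This greedy rule is a stand-in chosen for illustration; the abstract does not describe the paper's partitioning method, and the frequencies here are made up rather than derived from ORC metadata. The names `partition_keys` and `reducer_loads` are hypothetical.

```python
import heapq
from typing import Dict, List


def partition_keys(key_freq: Dict[str, int], num_reducers: int) -> Dict[str, int]:
    """Map each key to a reducer index using a largest-first greedy rule."""
    # Min-heap of (current load, reducer index).
    heap = [(0, r) for r in range(num_reducers)]
    heapq.heapify(heap)
    assignment: Dict[str, int] = {}
    # Sort by descending frequency; break ties by key for determinism.
    for key, freq in sorted(key_freq.items(), key=lambda kv: (-kv[1], kv[0])):
        load, reducer = heapq.heappop(heap)
        assignment[key] = reducer
        heapq.heappush(heap, (load + freq, reducer))
    return assignment


def reducer_loads(key_freq: Dict[str, int], assignment: Dict[str, int],
                  num_reducers: int) -> List[int]:
    """Total number of records each reducer receives under the assignment."""
    loads = [0] * num_reducers
    for key, reducer in assignment.items():
        loads[reducer] += key_freq[key]
    return loads


key_freq = {"a": 50, "b": 30, "c": 20, "d": 10, "e": 10}  # made-up frequencies
assignment = partition_keys(key_freq, num_reducers=2)
loads = reducer_loads(key_freq, assignment, num_reducers=2)
print(assignment, loads)
```

A plain hash partitioner ignores frequencies, so one hot key can overload a reducer; any frequency-aware rule of this kind needs the distribution before the shuffle starts, which is why a cheap estimation source matters.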
11.
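Spatial join is the most time-consuming and most important spatial query, and a multiway spatial join is a join query involving several spatial relations. The efficiency of sequential spatial join processing remains unsatisfactory, which makes parallelism an attractive direction for improving it. Parallel spatial join processing consists of three phases: task creation, task assignment, and parallel task execution. The paper proposes a new plane-sweep method for the task creation phase of multiway parallel processing, then proposes a dynamic task assignment strategy based on cost estimation, gives the cost model, and extends it to multiway parallel join query processing to achieve load balance.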
12.
13.
Shiyuan Wang Quang Hieu Vu Beng Chin Ooi Anthony K. H. Tung Lizhen Xu 《The VLDB Journal The International Journal on Very Large Data Bases》2009,18(1):345-362
This paper looks at the processing of skyline queries on peer-to-peer (P2P) networks. We propose Skyframe, a framework for
efficient skyline query processing in P2P systems, which addresses the challenges of quick response time, low network communication
cost and query load balancing among peers. Skyframe consists of two querying methods: one is optimized for network communication
while the other focuses on query response time. These methods are different in the way in which the query search space is
defined. In particular, the first method uses a high dominating point that has a large dominating region to prune the search
space to achieve a low cost in network communication. On the other hand, the second method relaxes the search space in order
to allow parallel query processing to speed up query response. Skyframe achieves query load balancing by both query load conscious
data space splitting/merging during the join/departure of nodes and dynamic load migration. We further show how to apply Skyframe
to both the P2P systems supporting multi-dimensional indexing and the P2P systems supporting single-dimensional indexing.
Finally, we have conducted extensive experiments on both real and synthetic data sets over two existing P2P systems: CAN (Ratnasamy
in A scalable content-addressable network. In: Proceedings of SIGCOMM Conference, pp. 161–172, 2001) and BATON (Jagadish et
al. in A balanced tree structure for peer-to-peer networks. In: Proceedings of VLDB Conference, pp. 661–672, 2005) to evaluate
the effectiveness and scalability of Skyframe.
14.
15.
Shen Haiying Xu Cheng-Zhong 《Parallel and Distributed Systems, IEEE Transactions on》2010,21(2):242-256
Consistent hashing-based DHT networks have an inherent load balancing problem. The problem becomes more severe in heterogeneous networks with nonuniform and time-varying popular files. Existing DHT load balancing algorithms mainly focus on the issues caused by node heterogeneity. To deal with skewed lookups, this paper presents an elastic routing table (ERT) mechanism for query load balancing, based on the observation that high-degree nodes tend to receive more traffic load. The mechanism allows each node to have a routing table of variable size corresponding to node capacities. The indegree and outdegree of the routing table can also be adjusted dynamically in response to changes in file popularity and network churn. Theoretical analysis proves that the routing table degree is bounded. The ERT mechanism facilitates locality-aware randomized query forwarding to further improve lookup efficiency. By relating query forwarding to a supermarket customer service model, we prove that a two-way randomized query forwarding policy should lead to an exponential improvement in query processing time over random walking. Simulation results demonstrate the effectiveness of the ERT mechanism and its related query forwarding policy for congestion and query load balancing. In comparison with existing "virtual-server"-based load balancing algorithms and other routing table control approaches, the ERT-based congestion control protocol yields significant improvement in query lookup efficiency.
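The two-way randomized forwarding idea can be illustrated with a small Python simulation. The sketch assumes the policy works like the classic "power of two choices" rule from the supermarket model: sample two candidate next hops at random and forward to the less loaded one. It ignores routing tables, locality, and churn entirely, and the function names are hypothetical, so it shows the balancing principle rather than the ERT protocol.

```python
import random
from typing import List


def forward_random(candidates: List[int], rng: random.Random) -> int:
    """Baseline: forward to one uniformly random candidate."""
    return rng.choice(candidates)


def forward_two_choice(loads: List[int], candidates: List[int],
                       rng: random.Random) -> int:
    """Sample two distinct candidates and forward to the less loaded one."""
    first, second = rng.sample(candidates, 2)
    return first if loads[first] <= loads[second] else second


def simulate(num_nodes: int, num_queries: int, two_choice: bool, seed: int) -> List[int]:
    """Return the per-node query counts after forwarding num_queries queries."""
    rng = random.Random(seed)
    loads = [0] * num_nodes
    candidates = list(range(num_nodes))
    for _ in range(num_queries):
        if two_choice:
            target = forward_two_choice(loads, candidates, rng)
        else:
            target = forward_random(candidates, rng)
        loads[target] += 1
    return loads


one = simulate(50, 5000, two_choice=False, seed=1)
two = simulate(50, 5000, two_choice=True, seed=1)
# Compare the busiest node under each policy; two choices typically give a flatter profile.
print(max(one), max(two))
```

Probing only two nodes keeps the per-query overhead constant while still steering traffic away from hot spots, which is the intuition behind the improvement claimed in the abstract.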
16.
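To address long task completion time, high task execution cost, and unbalanced system load in task scheduling, the paper proposes a cloud computing task scheduling method based on an orthogonal adaptive whale optimization algorithm (OAWOA). First, orthogonal experimental design (OED) is applied in the population initialization and global search phases to increase and maintain population diversity and to keep the algorithm from converging prematurely to a local optimum. Then, an adaptive exponentially decreasing factor and a bidirectional search mechanism further strengthen the global search capability. Finally, the fitness function is optimized so that the algorithm performs multi-objective optimization. In simulation experiments the proposed algorithm is compared with the whale optimization algorithm (WOA), particle swarm optimization (PSO), the bat algorithm (BA), and two other improved WOA variants. The results show that for task scales of both 50 and 500 the proposed algorithm converges better, and its total task execution time and total cost are lower than those of the other algorithms, while its degree of load balance is lower only than that of BA. The proposed algorithm thus shows clear advantages in reducing total task execution time and cost and in improving system load balance.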
17.
《Journal of Parallel and Distributed Computing》1995,25(1):42-57
Although load balancing incurs processing costs, and therefore can have a profound influence on the optimized execution plan of a query, none of the existing parallelizing query optimizers consider this factor. In this paper, we address this issue by introducing the cost of load balancing as a new factor for query optimization. Specifically, we implemented three new optimizers for multiway join queries that take the load balancing issue into consideration. To evaluate the efficiency of these schemes, we also implemented a simulator for the parallel execution of multiway joins. To increase confidence in the results, our simulation model was validated by comparing the simulation results to those produced by the actual implementation of the same algorithms running on a multicomputer system. This simulator was used in our study to compare the new techniques to a more conventional system in which load balancing is performed at runtime but is not a factor for query optimization. Our extensive simulation results confirm that the new methods, indeed, provide very significant savings. Most interestingly, the best scheme displays a performance which is essentially immune to the skew effect. Furthermore, we observed that these new optimizers can consistently achieve the same level of performance gain regardless of the CPU power, I/O, and communication capabilities of the computing system. This indicates that our approaches are generally useful for all hardware platforms.
18.
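The dynamic load balancing system based on remaining computing capacity is a dynamic load balancing system built on a new type of load vector. It uses a new load metric, remaining computing capacity, which accounts for both a node's resource usage and the node's own performance characteristics; it therefore reflects the processing capacity of the cluster and the load currently being processed better, and is more flexible and accurate than other commonly used load vectors. The system also combines task scheduling with process migration to balance the load more effectively while reducing the extra overhead that load balancing introduces.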
19.
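Given that wireless sensor networks have many nodes, short communication range, and limited energy, the paper proposes a query gain routing algorithm and a routing-based load balancing mechanism. The query gain routing algorithm maintains routing information in a query gain matrix and selects routing nodes according to the record of past successful queries. The routing-based load balancing mechanism records node energy information during query routing and shifts load so that energy consumption is balanced across the nodes on the query path. Simulation results show that the query gain routing algorithm raises the query success rate while lowering node energy consumption, and that the routing-based load balancing mechanism further reduces the energy consumption of the query gain routing algorithm.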
20.
In this paper, we propose an intelligent distributed query processing method that considers the characteristics of a distributed ontology environment. We suggest more general models of the distributed ontology query and of the semantic mapping among distributed ontologies than those in previous work. Our approach rewrites a distributed ontology query into multiple distributed ontology queries using the semantic mapping, and we obtain the integrated answer by executing these queries. Furthermore, we propose a distributed ontology query processing algorithm with several query optimization techniques: pruning rules to remove unnecessary queries, a cost model considering site load balancing and caching, and a heuristic strategy for scheduling plans to be executed at a local site. Finally, experimental results show that our optimization techniques are effective in reducing the response time.