期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

An evaluation of buffer management strategies for relational database systems

Hong -Tai Chou David J. DeWitt 《Algorithmica》1986,1(1-4):311-336

In this paper we present a new algorithm, DBMIN, for managing the buffer pool of a relational database management system. DBMIN is based on a new model of relational query behavior, thequery locality set model (QLSM). Like the hot set model, the QLSM has an advantage over the stochastic models due to its ability to predict future reference behavior. However, the QLSM avoids the potential problems of the hot set model by separating the modeling of reference behavior from any particular buffer management algorithm. After introducing the QLSM and describing the DBMIN algorithm, we present a performance evaluation methodology for evaluating buffer management algorithms in a multiuser environment. This methodology employed a hybrid model that combines features of both trace-driven and distribution-driven simulation models. Using this model, the performance of the DBMIN algorithm in a multiuser environment is compared with that of the hot set algorithm and four more traditional buffer replacement algorithms. 相似文献

2.

The design and evaluation of a buffer algorithm for database machines

F. F. Cai M. E. C. Hull D. A. Bell 《Concurrency and Computation》1998,10(15):1291-1316

This paper reports on the design, development and evaluation of a buffer algorithm named RESBAL, which exploits parallelism in order to provide high query execution performance in relational database systems. The algorithm aims to provide both predictive and efficient data buffering by exploiting the use of advance knowledge of query reference behaviour. Designed to offer a high level of flexibility, RESBAL employs a multiple buffering strategy both on page fetch level and page replacement level in order to improve buffer performance. The evaluation of RESBAL has been carried out in a parallel database system environment based on a transputer architecture. The results of this performance assessment allow comparison to be made between different buffer algorithms, and demonstrate the feasibility and effectiveness of the RESBAL algorithm. © 1998 John Wiley & Sons, Ltd. 相似文献

3.

Seeking the truth about ad hoc join costs

Laura M. Haas Michael J. Carey Miron Livny Amit Shukla 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(3):241-256

In this paper, we re-examine the results of prior work on methods for computing ad hoc joins. We develop a detailed cost model for predicting join algorithm performance, and we use the model to develop cost formulas for the major ad hoc join methods found in the relational database literature. We show that various pieces of “common wisdom” about join algorithm performance fail to hold up when analyzed carefully, and we use our detailed cost model to derive op timal buffer allocation schemes for each of the join methods examined here. We show that optimizing their buffer allocations can lead to large performance improvements, e.g., as much as a 400% improvement in some cases. We also validate our cost model's predictions by measuring an actual implementation of each join algorithm considered. The results of this work should be directly useful to implementors of relational query optimizers and query processing systems. Edited by M. Adiba. Received May 1993 / Accepted April 1996 相似文献

4.

An effective approach to vertical partitioning for physical designof relational databases

Cornell D.W. Yu P.S. 《IEEE transactions on pattern analysis and machine intelligence》1990,16(2):248-258

Vertical partitioning can be used to enhance the performance of relational database systems by reducing the number of disk accesses. The authors identify the key parameters for capturing the behavior of an access plan and propose a two-step methodology consisting of a query analysis step to estimate the parameters and a binary partitioning step which can be applied recursively. The partitioning uses an integer linear programming technique to minimize the number of disk accesses. Significant performance benefit would be achieved for join if the partitioned (inner) relation could fit into the memory buffer under the inner-outer loop join method, or if the partitioned relation could fit into the sort buffer under the sort-merge join method, but not the original relation. For cases where a segment scan or a cluster index scan is used, vertical partitioning of the relation with the algorithm described is still often found to lead to substantial performance improvement 相似文献

5.

Tight bounds for prefetching and buffer management algorithms forparallel I/O systems

Varman P.J. Verma R.M. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(12):1262-1275

The I/O performance of applications in multiple-disk systems can be improved by overlapping disk accesses. This requires the use of appropriate prefetching and buffer management algorithms that ensure the most useful blocks are accessed and retained in the buffer. In this paper, we answer several fundamental questions on prefetching and buffer management for distributed-buffer parallel I/O systems. First, we derive and prove the optimality of an algorithm, P-min, that minimizes the number of parallel I/Os. Second, we analyze P-con, an algorithm that always matches its replacement decisions with those of the well-known demand-paged MIN algorithm. We show that P-con can become fully sequential in the worst case. Third, we investigate the behavior of on-line algorithms for multiple-disk prefetching and buffer management. We define and analyze P-Iru, a parallel version of the traditional LRU buffer management algorithm. Unexpectedly, we find that the competitive ratio of P-Iru is independent of the number of disks. Finally, we present the practical performance of these algorithms on randomly generated reference strings. These results confirm the conclusions derived from the analysis on worst case inputs 相似文献

6.

Efficiently instantiating view-objects from remote relational databases

Byung Suk Lee Ph.D. Gio Wiederhold Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1994,3(3):289-323

View-objects are complex objects that are instantiated by delivering a query to a database and converting the query result into a nested structure. In relational databases, query results are conventionally retrieved as a single flat relation, which contains duplicate subtuples in its composite tuples. These duplicate subtuples increase the amount of data to be handled and thus degrade performance. In this article, we describe two new methods that retrieve a query result in structures other than a single flat relation. One method retrieves a set of relation fragments, and the other retrieves a single-nested relation. We first describe their algorithms and cost models, and then present the cost comparison results in a client-server architecture with a relational main memory database residing on a server. 相似文献

7.

The aditi deductive database system 总被引：2，自引：0，他引：2

Jayen Vaghanl BSc Kotagiri Ramamohanarao Ph.D. David B. Kemp Ph.D. Zoltan Somogyi Ph.D. Peter J. Stuckey Ph.D. Tim S. Leask BSc James Harland Ph.D. 《The VLDB Journal The International Journal on Very Large Data Bases》1994,3(2):245-288

Deductive databases generalize relational databases by providing support for recursive views and non-atomic data. Aditi is a deductive system based on the client-server model; it is inherently multi-user and capable of exploiting parallelism on shared-memory multiprocessors. The back-end uses relational technology for efficiency in the management of disk-based data and uses optimization algorithms especially developed for the bottom-up evaluation of logical queries involving recursion. The front-end interacts with the user in a logical language that has more expressive power than relational query languages. We present the structure of Aditi, discuss its components in some detail, and present performance figures. 相似文献

8.

XML查询结构连接顺序选择算法分析与优化

张艺濒谢金晶《微机发展》2007,17(1):82-84

如今对XML查询的优化是对XML的热点研究方向。其中的结构连接操作是XML数据库查询的主要操作。和关系数据库中的连接运算一样,结构连接顺序的选择是XML数据库查询优化的核心。文中主要通过对XML查询优化中各种选择连接顺序算法的研究,提出了一种优化的算法,在规模较大的XML查询中能够有效缩减搜索空间,提高效率。相似文献

9.

Optimal management of dynamic buffer caches

Andrew Choi Manfred Ruschitzka 《Performance Evaluation》1996,26(4):239-262

Optimal memory management strategies such as VMIN are generally considered unrealizable in view of the impracticality of obtaining a computation's reference string prior to execution. Addressing such strategies, this paper focuses on the dynamic management of variable-size buffer caches in the framework of the locality-set model, a memory management model that characterizes the reference behavior of a computation in terms of locality sets rather than reference strings. Several cost measures—the number of page faults, the space-time product, and one that combines them—are considered and conditions under which they are equivalent are derived. We define two novel strategies, PSETVMIN and SETVMIN, which manage buffer caches with and without prepaging, respectively, and prove that they minimize a cost measure that takes both page faults and the space-time product into account. The two strategies are of theoretical interest in view of their optimal behavior, but—more importantly—they are also realizable since only a limited amount of information about the reference behavior of a computation, the locality-set sequence, is required in advance. We demonstrate the use of these strategies for join processing in relational database management systems. The performance benefits of this technique are discussed and illustrated by simulation results. 相似文献

10.

An annotation management system for relational databases

Deepavali Bhagwat Laura Chiticariu Wang-Chiew Tan Gaurav Vijayvargiya 《The VLDB Journal The International Journal on Very Large Data Bases》2005,14(4):373-396

We present an annotation management system for relational databases. In this system, every piece of data in a relation is assumed to have zero or more annotations associated with it and annotations are propagated along, from the source to the output, as data is being transformed through a query. Such an annotation management system could be used for understanding the provenance (aka lineage) of data, who has seen or edited a piece of data or the quality of data, which are useful functionalities for applications that deal with integration of scientific and biological data. We present an extension, pSQL, of a fragment of SQL that has three different types of annotation propagation schemes, each useful for different purposes. The default scheme propagates annotations according to where data is copied from. The default-all scheme propagates annotations according to where data is copied from among all equivalent formulations of a given query. The custom scheme allows a user to specify how annotations should propagate. We present a storage scheme for the annotations and describe algorithms for translating a pSQL query under each propagation scheme into one or more SQL queries that would correctly retrieve the relevant annotations according to the specified propagation scheme. For the default-all scheme, we also show how we generate finitely many queries that can simulate the annotation propagation behavior of the set of all equivalent queries, which is possibly infinite. The algorithms are implemented and the feasibility of the system is demonstrated by a set of experiments that we have conducted. 相似文献

11.

关系数据库中支持语义的Top-K关键字搜索

王斌杨晓春王国仁《软件学报》2008,19(9):2362-2375

为了增强关系数据库中的关键字搜索查询结果,考虑了多表之间以及元组之间的语义关系,提出了一种语义评分函数.该语义评分函数不仅涵盖了当前的评分思想,并且加入新指标来衡量查询结果与查询关键字之间的相关性.基于该评分函数,提出两种以数据块为处理单位的Top-K搜索算法,分别为BA(blocking algorithm)算法和EBA(early-stopping blocking algorithm)算法.EBA在BA基础上引入了过滤域值,以便尽早终止算法的迭代次数.最后实验结果显示语义评分函数保证了相似文献

12.

MySQL缓冲区自适应管理仿真研究

王楠吴云《计算机应用研究》2023,40(4):1154-1159

由于MySQL使用配置参数的方式调节线性预读的阈值以及冷热LRU算法的冷热比例,导致缓冲区存在性能瓶颈。针对以上问题,提出一种缓冲区自适应管理的方法,该方法通过遗憾最小化的强化在线学习技术设计了自适应阈值调整算法以及自适应冷热缓存替换算法。首先,对MySQL中的预读算法以及冷热缓存替换算法进行深入研究,明确了预读阈值以及冷热比例大小对两种算法的具体影响;其次,通过FIFO历史队列以及增加辅助字段的方式,设计了一套参数评估流程,实时评估当前参数是偏大或偏小;最后,设计了一种参数调整模型,该模型利用MySQL原生的预读算法以及缓存替换算法的性能监控指标,实现对参数的合理调整。在FIU数据集上进行了900组仿真实验,实验表明,相较于MySQL原生的基准预读算法以及冷热缓存算法,自适应后的两种算法能够在基本不牺牲算法运行速度的基础上,有效减少8%的磁盘I/O以及增加24%的缓存命中率;相对于最新的缓存替换算法,自适应后的冷热缓存替换算法在保证缓存命中率的前提下,将速度提升至1.6倍。相似文献

13.

面向关系数据库的智能索引调优方法

邱涛王斌舒昭维赵智博宋子文钟延辉《软件学报》2020,31(3):634-647

数据库索引是关系数据库系统实现快速查询的有效方式之一.智能索引调优技术可以有效地对数据库实例进行索引调节,从而保持数据库高效的查询性能.现有的方法大多利用了数据库实例的查询日志,它们先从查询日志中得到候选索引,再利用人工设计的模型选择索引,从而调节索引.然而,从查询日志中产生出的候选索引可能并未实际存在于数据库实例中,因此导致这些方法不能有效地估计这类索引对于查询的优化效果.首先,设计并实现了一种面向关系数据库的智能索引调优系统;其次,提出了一种利用机器学习方法来构造索引的量化模型,根据该模型,可以准确地对索引的查询优化效果进行估计;接着设计了一种高效的最优索引选择算法,实现快速地从候选索引空间中选择满足给定大小约束的最优的索引组合;最后,通过实验测试不同场景下智能索引调优系统的调优性能.实验结果表明,所提出的技术可以在不同的场景下有效地对索引进行优化,从而实现数据库系统查询性能的提升. 相似文献

14.

一种鲁棒性的少样本学习方法

代磊超冯林杨玉亭尚兴林苏菡《小型微型计算机系统》2021,(2):340-347

少样本学习是目前机器学习研究领域的一个热点,它能在少量的标记样本中学习到较好的分类模型.但是,在噪声的不确定环境中,传统的少样本学习模型泛化能力弱.针对这一问题,提出一种鲁棒性的少样本学习方法RFSL(Robust Few-Shot Learning).首先,使用核密度估计(Kernel Density Estimat... 相似文献

15.

基于三级存储器的Join算法 总被引：2，自引：0，他引：2

李建中张冬冬张艳秋《软件学报》2003,14(5):947-954

研究了基于三级存储器的海量关系数据库的Join算法.目前,在所有磁带数据Join算法中,基于Hash思想的算法是最优的.但是,这些算法没有考虑从第三级存储器中读取数据时,磁带定位时间对算法性能的影响.磁带的磁头随机定位耗时大,是影响基于三级存储器的数据操作算法时间复杂性的关键因素.针对这个问题,提出了两种新的基于三级存储器的海量关系数据库连接算法,即Disk-Based-Hash-Join算法和Tertiary-Only-Hash-Join算法.这两种算法采用了磁盘缓冲技术和散列数据集中存储方法,降低了算法的磁带磁头随机定位时间复杂性,提高了基于三级存储器的连接算法的性能.理论分析和实验结果表明,提出的基于三级存储器连接算法的性能高于目前所有同类算法的性能,可以有效地应用于海量数据管理系统. 相似文献

16.

VDBMS: A testbed facility for research in video database benchmarking

Walid?Aref Ann?Christine?Catlin Email author Ahmed?Elmagarmid Jianping?Fan Moustafa?Hammad Ihab?Ilyas Mirette?Marzouk Sunil?Prabhakar Yi-Cheng?Tu Xingquan?Zhu 《Multimedia Systems》2004,9(6):575-585

相似文献

17.

一种适应GPU的混合OLAP查询处理模型

张宇张延松陈红王珊《软件学报》2016,27(5):1246-1265

通用GPU因其强大的并行计算能力成为新兴的高性能计算平台,并逐渐成为近年来学术界在高性能数据库实现技术领域的研究热点.但当前GPU数据库领域的研究沿袭的是ROLAP(relational OLAP)多维分析模型,研究主要集中在关系操作符在GPU平台上的算法实现和性能优化技术,以哈希连接的GPU并行算法研究为中心.GPU拥有数千个并行计算单元,但其逻辑控制单元较少,相对于CPU具有更强的并行计算能力,但逻辑控制和复杂内存管理能力较弱,因此并不适合需要复杂数据结构和复杂内存管理机制的内存数据库查询处理算法直接移植到GPU平台.提出了面向GPU向量计算特性的混合OLAP多维分析模型semi-MOLAP,将MOLAP(multidimensionalOLAP)模型的直接数组访问和计算特性与ROLAP模型的存储效率结合在一起,实现了一个基于完全数组结构的GPU semi-MOLAP多维分析模型,简化了GPU数据管理,降低了GPU semi-MOLAP算法复杂度,提高了GPU semi-MOLAP算法的代码执行率.同时,基于GPU和CPU计算的特点,将semi-MOLAP操作符拆分为CPU和GPU平台的协同计算,提高了CPU和GPU的利用率以及OLAP的查询整体性能. 相似文献

18.

主存OLAP系统中what-if查询处理策略

张延松肖艳芹王珊陈红《软件学报》2010,21(10):2494-2512

What-If分析能够提供比传统的OLAP(on-line analysis processing)分析更加有意义的决策支持信息.基于历史数据的应用场景假设分析需要更加有效的what-if数据视图生成机制的支持.在传统的delta表合并算法的基础上,提出了基于内存记录指针的deltaMap算法来提高what-if数据视图的合并性能.根据OLAP分析的应用特点,提出了pre-merge算法来处理支持分布式计算的聚集运算.根据不同的假设更新类型,对查询重写算法和△cube算法作了详细的性能测试并进行了全面的性能分析对比,在此基础上提出了what-if分析的代价模型,以应用场景模式、假设更新率、假设更新复杂度、查询结果集的基数作为参数,有效地描述系统what-if查询处理策略,为what-if分析的解决方案提供了一个可行的框架结构. 相似文献

19.

The Threshold Algorithm: From Middleware Systems to the Relational Engine

Bruno N. Hui Wang 《Knowledge and Data Engineering, IEEE Transactions on》2007,19(4):523-537

The answer to a top-k query is an ordered set of tuples, where the ordering is based on how closely each tuple matches the query. In the context of middleware systems, new algorithms to answer top-k queries have been recently proposed. Among these, the threshold algorithm (TA) is the most well-known instance due to its simplicity and memory requirements. TA is based on an early-termination condition and can evaluate top-k queries without examining all the tuples. This top-k query model is prevalent not only over middleware systems, but also over plain relational data. In this work, we analyze the challenges that must be addressed to adapt TA to a relational database system. We show that, depending on the available indices, many alternative TA strategies can be used to answer a given query. Choosing the best alternative requires a cost model that can be seamlessly integrated with that of current optimizers. In this work, we address these challenges and conduct an extensive experimental evaluation of the resulting techniques by characterizing which scenarios can take advantage of TA-like algorithms to answer top-k queries in relational database systems 相似文献

20.

障碍空间中基于Voronoi图的组反k最近邻查询研究

张丽平刘蕾郝晓红李松郝忠孝《计算机研究与发展》2017,54(4):861-871

为了解决已有研究成果无法有效处理障碍空间中的组反k最近邻查询问题,提出了障碍物环境中基于Voronoi图的OGRkNN查询方法,该方法获得的结果集是将一组查询点中任意一点作为障碍kNN的数据点集合,在实际应用中可以用来评估一组查询对象的影响力.依据障碍物集合是否发生变化提出了2种情况下的OGRkNN查询方法,一种是静态障碍物环境下的OGRkNN查询(简称STA_OGRkNN查询)方法,另一种是动态障碍物环境下的OGRkNN查询(简称DYN_OGRkNN查询)方法.其中STA_OGRkNN查询方法利用Voronoi图的邻接特性可以在剪枝阶段有效地过滤掉大量的非候选者,快速地缩小查询范围,提高整个算法的查询效率,在精炼阶段有效地提高了算法的准确性.进一步给出了3种情况下的DYN_OGRkNN查询方法,分别为障碍物动态增加情况下的OGRkNN查询算法、障碍物动态减少情况下的OGRkNN查询算法以及障碍物动态移动情况下的OGRkNN查询算法.理论研究和实验结果表明所提算法具有较高效率. 相似文献