1.

A parallel hash join algorithm for managing data skew

Wolf J.L. Yu P.S. Turek J. Dias D.M. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(12):1355-1371

Presents a parallel hash join algorithm that is based on the concept of hierarchical hashing, to address the problem of data skew. The proposed algorithm splits the usual hash phase into a hash phase and an explicit transfer phase, and adds an extra scheduling phase between these two. During the scheduling phase, a heuristic optimization algorithm, using the output of the hash phase, attempts to balance the load across the multiple processors in the subsequent join phase. The algorithm naturally identifies the hash partitions with the largest skew values and splits them as necessary, assigning each of them to an optimal number of processors. Assuming for concreteness a Zipf-like distribution of the values in the join column, a join phase which is CPU-bound, and a shared nothing environment, the algorithm is shown to achieve good join phase load balancing, and to be robust relative to the degree of data skew and the total number of processors. The overall speedup due to this algorithm is compared to some existing parallel hash join methods. The proposed method does considerably better in high skew situations.
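The scheduling idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' heuristic: it assumes per-partition join costs are already known from the hash phase, splits any partition whose cost exceeds the ideal per-processor share into equal pieces, and then places pieces greedily on the least-loaded processor (largest piece first). The names `schedule_partitions`, `costs`, and `num_procs` are hypothetical.

```python
import heapq
import math

def schedule_partitions(costs, num_procs):
    """Assign hash partitions to processors, splitting oversized ones.

    costs: dict mapping partition id -> estimated join cost (assumed known).
    Returns (assignment, loads): assignment[p] is a list of
    (partition id, cost share) pieces placed on processor p.
    """
    total = sum(costs.values())
    target = total / num_procs  # ideal load per processor
    pieces = []
    for pid, cost in costs.items():
        # A skewed partition is split into enough equal pieces to fit the target.
        k = max(1, math.ceil(cost / target)) if target > 0 else 1
        pieces.extend((cost / k, pid) for _ in range(k))
    # Largest pieces first, each to the currently least-loaded processor.
    pieces.sort(reverse=True)
    heap = [(0.0, p) for p in range(num_procs)]
    assignment = [[] for _ in range(num_procs)]
    for share, pid in pieces:
        load, p = heapq.heappop(heap)
        assignment[p].append((pid, share))
        heapq.heappush(heap, (load + share, p))
    loads = [sum(s for _, s in a) for a in assignment]
    return assignment, loads
```

With one heavily skewed partition, for example `schedule_partitions({"a": 80, "b": 10, "c": 10}, 4)`, the skewed partition is spread over all four processors instead of landing on a single one.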

2.

A cache-optimized parallel join algorithm

胡泽林 张云泉 《Computer Engineering and Design (计算机工程与设计)》2009,30(20)

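Nested-loop join operations incur a large number of cache misses, which severely degrades join query performance. A buffer-based, cache-oblivious parallel nested-loop join algorithm is proposed. Through cache-oblivious and buffering techniques, it improves the spatial and temporal locality of the join algorithm. Theoretical analysis and experimental results show that the cache-optimized serial join algorithm performs twice as well as the original, and that its parallel version achieves nearly linear speedup.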
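To make the buffering idea concrete, here is a minimal sketch of a blocked nested-loop join. It is a simpler, cache-aware stand-in (a fixed `block_size` is chosen by hand), not the cache-oblivious algorithm of the paper. The function name, the key-extractor parameters, and the default block size are assumptions for illustration.

```python
def blocked_nested_loop_join(outer, inner, key_outer, key_inner, block_size=1024):
    """Join two lists of rows, scanning the inner input once per outer block.

    Holding a block of outer rows in a buffer means each inner row is
    compared against many outer rows while both are likely still cached.
    """
    results = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]  # buffered chunk of the outer input
        for inner_row in inner:                   # one pass over inner per block
            k = key_inner(inner_row)
            for outer_row in block:
                if key_outer(outer_row) == k:
                    results.append((outer_row, inner_row))
    return results
```

The block size changes only the order in which pairs are produced, not the set of matching pairs.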

3.

A parallel sort merge join algorithm for managing data skew

Wolf J.L. Dias D.M. Yu P.S. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(1):70-86

A parallel sort-merge-join algorithm which uses a divide-and-conquer approach to address the data skew problem is proposed. The proposed algorithm adds an extra, low-cost scheduling phase to the usual sort, transfer, and join phases. During the scheduling phase, a parallelizable optimization algorithm, using the output of the sort phase, attempts to balance the load across the multiple processors in the subsequent join phase. The algorithm naturally identifies the largest skew elements, and assigns each of them to an optimal number of processors. Assuming a Zipf-like distribution of data skew, the algorithm is demonstrated to achieve very good load balancing for the join phase, and is shown to be very robust relative, among other things, to the degree of data skew and the total number of processors.

4.

A join algorithm for combining AND parallel solutions in AND/OR parallel systems

Balkrishna Ramkumar Laxmikant V. Kalé 《International journal of parallel programming》1992,21(1):67-107

When two or more literals in the body of a Prolog clause are solved in (AND) parallel, their solutions need to be joined to compute solutions for the clause. This is often a difficult problem in parallel Prolog systems that exploit OR and independent AND parallelism in Prolog programs. In several AND/OR parallel systems proposed recently, this problem is side-stepped at the cost of unexploited OR parallelism in the program, in part due to the complexity of the backtracking algorithm beneath AND parallel branches. In some cases, the data dependency graphs used by these systems cannot represent all the exploitable independent AND parallelism known at compile time. In this paper, we describe the compile time analysis for an optimized join algorithm for supporting independent AND parallelism in logic programs efficiently without leaving any OR parallelism unexploited. We then discuss how this analysis can be used to yield very efficient runtime behavior. We also discuss problems associated with a tree representation of the search space when arbitrarily complex data dependency graphs are permitted. We describe how these problems can be resolved by mapping the search space onto the data dependency graphs themselves. The algorithm has been implemented in a compiler for parallel Prolog based on the Reduce-OR process model. The algorithm is suitable for the implementation of AND/OR systems on both shared and nonshared memory machines. Performance on benchmark programs exhibiting AND and OR parallelism on one shared memory machine and one message passing machine is presented. This work was supported in part by NSF Grants CCR-87-00988 and CCR-89-02496. A shorter version of this paper appears in the Proceedings of NACLP 1990.

5.

An effective dynamic load-balancing join algorithm for parallel databases (cited by: 1; self-citations: 0; citations by others: 1)

关心 欧增桂 王玲 《Computer Engineering and Applications (计算机工程与应用)》2007,43(12):150-154

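In parallel databases based on a shared-nothing architecture, load balancing has always been an important factor in query processing performance. The join operation, which is used frequently in databases, can degrade overall database performance because of load skew arising from various factors and because of additional communication overhead. A dynamic load-balancing join algorithm based on the RCMD distribution method is proposed; it can dynamically adjust the load of each node while the join is executing. Theoretical analysis and experimental results show that the proposed algorithm balances the load effectively and improves the execution efficiency of the parallel database.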

6.

PaMeCo join: A parallel main memory compact hash join

《Information Systems》2016

This paper presents a memory-constrained hash join algorithm (PaMeCo Join) designed to operate with main-memory column-store database systems. Whilst RAM has become more affordable and the popularity of main-memory database systems continues to grow, we recognize that RAM is a finite resource and that database systems rarely have an excess of memory available to them. Therefore, we design PaMeCo to operate within an arbitrary memory limitation by processing the input relations by parts, and by using a compact hash table that represents the contained tuples in a compact format. Coupled with a radix-clustering system that lowers memory latencies, we find that PaMeCo can offer competitive performance levels to other contemporary hash join algorithms in an unconstrained environment, while being up to three times faster than a high-performing hash join when memory constraints are applied.

7.

A bitmap-index-based multi-table star join algorithm in a parallel framework

解晨光 刘明刚 《Computer Engineering and Design (计算机工程与设计)》2014,35(9)

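Analyzes the MapReduce distributed programming technique for big data platforms and the join algorithms used to implement data queries, and, for the SSB data model, proposes a multi-table star join optimization technique based on a distributed cache. Using predicate vectors, the data dependencies of intermediate joins with dimension tables are converted into bitmap-index filtering on the table, which reduces the large network overhead caused by those dependencies. Distributed caching makes full use of the memory of the processing nodes, optimizes network transfer, and reduces query cost.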
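The predicate-vector idea can be sketched on a single machine as follows. This is an illustration only, not the paper's MapReduce implementation: it assumes each dimension table is a dict keyed by small non-negative integer surrogate keys, and the names `build_predicate_vector`, `star_filter`, and the sample column names are hypothetical.

```python
def build_predicate_vector(dim_rows, predicate):
    """Return a bit vector indexed by surrogate key: 1 if the row passes.

    dim_rows: dict mapping a non-negative integer key -> dimension row.
    """
    size = max(dim_rows) + 1 if dim_rows else 0
    bits = bytearray(size)
    for key, row in dim_rows.items():
        if predicate(row):
            bits[key] = 1
    return bits

def star_filter(fact_rows, vectors):
    """Keep fact rows whose foreign keys all hit a set bit.

    vectors: dict mapping a foreign-key column name -> predicate vector.
    Every foreign key is assumed to be a valid index into its vector.
    """
    return [
        row for row in fact_rows
        if all(vec[row[fk]] for fk, vec in vectors.items())
    ]
```

Because each vector is a compact array of flags, it is small enough to ship to every worker, so a fact row can be filtered locally without looking up the dimension tables themselves.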

8.

Coterie join algorithm

Neilsen M.L. Mizuno M. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(5):582-590

Given a set of nodes in a distributed system, a coterie is a collection of subsets of the set of nodes such that any two subsets have a nonempty intersection and are not properly contained in one another. A subset of nodes in a coterie is called a quorum. An algorithm, called the join algorithm, which takes nonempty coteries as input, and returns a new, larger coterie called a composite coterie is introduced. It is proved that a composite coterie is nondominated if and only if the input coteries are nondominated. Using the algorithm, dominated or nondominated coteries may be easily constructed for a large number of nodes. An efficient method for determining whether a given set of nodes contains a quorum of a composite coterie is presented. As an example, tree coteries are generalized using the join algorithm, and it is proved that tree coteries are nondominated. It is shown that the join algorithm may be used to generate read and write quorums which may be used by a replica control protocol.
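The coterie definition above translates directly into a small check. The abstract does not spell out the join rule itself, so `join_coteries` below is an assumption: it replaces one node `x` of the first coterie by every quorum of a second coterie defined over a disjoint node set. Both function names are hypothetical.

```python
from itertools import combinations

def is_coterie(quorums):
    """Check the two properties from the definition: pairwise intersection
    and no quorum properly contained in another."""
    qs = [frozenset(q) for q in quorums]
    return all(q for q in qs) and all(
        (a & b) and not (a < b) and not (b < a)
        for a, b in combinations(qs, 2)
    )

def join_coteries(c1, x, c2):
    """Assumed composition rule: replace node x in each quorum of c1
    by every quorum of c2; quorums of c1 without x are kept as they are.
    The node sets of c1 and c2 are assumed to be disjoint."""
    out = set()
    for q1 in map(frozenset, c1):
        if x in q1:
            out.update((q1 - {x}) | frozenset(q2) for q2 in c2)
        else:
            out.add(q1)
    return out
```

For example, joining the majority coterie on nodes {1, 2, 3} at node 3 with the majority coterie on {4, 5, 6} gives a collection over five nodes that still passes `is_coterie`.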

9.

A parallel algorithm for approximate regularity

Laurence Boxer Russ Miller 《Information Processing Letters》2001,80(6):311-316

Spatial regularity amidst a seemingly chaotic image is often meaningful. Many papers in computational geometry are concerned with detecting some type of regularity via exact solutions to problems in geometric pattern recognition. However, real-world applications often have data that is approximate, and may rely on calculations that are approximate. Thus, it is useful to develop solutions that have an error tolerance.

A solution has recently been presented by Robins et al. [Inform. Process. Lett. 69 (1999) 189–195] to the problem of finding all maximal subsets of an input set in the Euclidean plane that are approximately equally-spaced and approximately collinear. This is a problem that arises in computer vision, military applications, and other areas. The algorithm of Robins et al. is different in several important respects from the optimal algorithm given by Kahng and Robins [Pattern Recognition Lett. 12 (1991) 757–764] for the exact version of the problem. The algorithm of Robins et al. seems inherently sequential and runs in O(n^(5/2)) time, where n is the size of the input set. In this paper, we give parallel solutions to this problem.

10.

A parallel algorithm for tiling problems (cited by: 2; self-citations: 0; citations by others: 2)

Takefuji Y. Lee Y.-C. 《Neural Networks, IEEE Transactions on》1990,1(1):143-145

A parallel algorithm for tiling with polyominoes is presented. The tiling problem is to pack polyominoes in a finite checkerboard. The algorithm using l × m × n processing elements requires O(1) time, where l is the number of different kinds of polyominoes on an m × n checkerboard. The algorithm can be used for placement of components or cells in a very large-scale integrated circuit (VLSI) chip, designing and compacting printed circuit boards, and solving a variety of two- or three-dimensional packing problems.

11.

A parallel algorithm for color constancy

《Journal of Parallel and Distributed Computing》2004,64(1):79-88

Objects retain their color in spite of changes in the wavelength and energy composition of the light they reflect. This phenomenon is called color constancy and plays an important role in computer vision research. We have devised a parallel algorithm for color constancy. The algorithm runs on a two-dimensional grid of processors each of which can exchange information with its four neighboring processors. Each processor calculates local average color. This information is then used to estimate the reflectances of the object. The algorithm was tested on several images of everyday objects. The algorithm also works for scenes where the illuminant changes smoothly over the image.

12.

A parallel algorithm for generating combinations

C.-J. Lin 《Computers & Mathematics with Applications》1989,17(12):1523-1533

A parallel algorithm for generating all combinations of m items out of n given items in lexicographic order is presented. The computational model is a linear systolic array consisting of m identical processing elements. It takes C(n, m) time-steps to generate all the C(n, m) combinations. Since every processing element is identical and executes the same procedure, it is suitable for VLSI implementation. The algorithm is proved correct by mathematical induction.
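For reference, the output the systolic array is meant to produce can be generated sequentially. The sketch below is a standard successor-based generator, not the parallel algorithm of the paper; the name `lex_combinations` and the choice of items 1..n are assumptions.

```python
from math import comb

def lex_combinations(n, m):
    """Yield all m-subsets of {1..n} as tuples in lexicographic order."""
    if m > n or m < 0:
        return
    c = list(range(1, m + 1))  # first combination: (1, 2, ..., m)
    while True:
        yield tuple(c)
        # Find the rightmost position that can still be incremented.
        i = m - 1
        while i >= 0 and c[i] == n - m + 1 + i:
            i -= 1
        if i < 0:
            return  # c was the last combination (n-m+1, ..., n)
        c[i] += 1
        # Reset everything to the right to the smallest valid values.
        for j in range(i + 1, m):
            c[j] = c[j - 1] + 1
```

Each step produces exactly one combination, so the generator yields `comb(n, m)` tuples in total, matching the C(n, m) count in the abstract.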

13.

Distributed load balancing for parallel main memory hash join

Tout W.R. Pramanik S. 《Parallel and Distributed Systems, IEEE Transactions on》1995,6(8):841-849

Parallel joins have been widely studied during the past decade and a number of efficient algorithms were presented. While it is known that the performance of these algorithms may suffer greatly in the presence of skewed input data, the work on load balancing schemes for parallel join has been limited. The main contribution of this paper is the development and analysis of a new distributed data structure and an effective load balancing scheme for parallel main memory hash join on NUMA architecture. Multiprocessors based on this architecture are scalable in both size of main memory and number of processors, and provide very high memory bandwidth. The load balancing scheme is based on random probing to avoid the hot spot problems caused by probing sequentially. We have modeled this load balancing scheme both analytically and experimentally. The experiments were run on a BBN TC2000 multiprocessor system.

14.

Utilizing page-level join index for optimization in parallel join execution

Chiang Lee Zue-An Chang 《Knowledge and Data Engineering, IEEE Transactions on》1995,7(6):900-914

This paper presents a methodology for the optimization of parallel join execution. Past research on parallel join methods mostly focused on the design of algorithms for partitioning (e.g. hash) relations and distributing data buckets as evenly as possible to the processors. Once data is distributed to the processors, it is assumed that all processors will complete their tasks at about the same time. We stress that this is true if no further information, such as a page-level join index, is available. Otherwise, the join execution can be further optimized, and the workload across the processors may still be unbalanced. We study the problems that may arise in a shared-nothing architecture and propose algorithms for them. A simulation study is also performed to understand the characteristics of the proposed method.

15.

The NUMA with clusters of processors for parallel join

Pramanik S. Tout W.R. 《Knowledge and Data Engineering, IEEE Transactions on》1997,9(4):653-660

A number of hybrid systems have been proposed to combine the advantages of shared nothing and shared everything concepts for computing relational join operations. Most of these proposed systems, however, presented a few analytical results and have produced limited or no implementations on actual multiprocessors. In this paper, we present a parallel join algorithm with load-balancing for a hybrid system that combines both shared-nothing and shared-everything architectures. We derive an analytical model for the join algorithm on this architecture and validate it using both hardware/software simulations and actual experimentations. We study the performance of the join on the hybrid system for a wide range of system parameter values. We conclude that the hybrid system outperforms both shared-nothing and shared-everything architectures.

16.

An optimized spatial join algorithm

邹永贵 徐海波 梁新发 杨富平 《Computer Engineering and Applications (计算机工程与应用)》2011,47(12):117-121

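The spatial join is one of the most important and most time-consuming operations in spatial databases. Building on the BFRJ algorithm, this work studies OBFRJ, a spatial join algorithm that optimizes the ordering of the intermediate join index. The algorithm traverses two R-trees synchronously in breadth-first order and sorts the generated intermediate join index along a space-filling curve, which reduces the number of page faults during the join at the next level. Experimental results show that the algorithm incurs fewer disk accesses and lower CPU cost than the DFRJ and BFRJ algorithms.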
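The abstract does not say which space-filling curve is used. As one possible illustration, the sketch below sorts join-index entries by a Z-order (Morton) key computed from a representative point of each entry. The curve choice, the function names, and the `center` callback are all assumptions.

```python
def morton_key(x, y, bits=16):
    """Interleave the low `bits` bits of non-negative integers x and y (Z-order)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return key

def sort_join_index(pairs, center):
    """Sort intermediate join-index entries by the Z-order of a representative point.

    center: hypothetical callback returning integer grid coordinates (x, y)
    for an entry, e.g. the quantized center of the two overlapping rectangles.
    """
    return sorted(pairs, key=lambda pair: morton_key(*center(pair)))
```

Entries that are close in space tend to be close in the sorted order, so consecutive lookups at the next tree level are more likely to touch pages that are already in memory.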

17.

A parallel polygon-clipping algorithm

Chandrasekhar Narayanaswami 《The Visual computer》1996,12(3):147-158

We describe a new parallel polygon-clipping algorithm based on a novel technique that allows a processor to compute output vertices independently of the results of the other processors. The basis for the method is a collision-free labeling scheme to compute the labels of the vertices of the output polygon. This labeling scheme depends only on the id of the vertex in the input polygon. This procedure allows us to defer the synchronization between processors to the final stages of the algorithm, reduces the amount of overhead due to fine-grain synchronization, and helps make the algorithm efficient.

18.

A parallel median algorithm

Richard Cole Chee K. Yap 《Information Processing Letters》1985,20(3):137-139

We give a deterministic algorithm for finding the kth smallest item in a set of n items, running in O((log log n)²) parallel time on O(n) processors in Valiant's comparison model.

19.

A parallel merging algorithm

R.H. Barlow D.J. Evans J. Shanehchi 《Information Processing Letters》1981,13(3):103-106

20.

A parallel algorithm for stochastic image segmentation

Don HS Fu KS 《IEEE transactions on pattern analysis and machine intelligence》1986,(5):594-603

A parallel algorithm for syntactic image segmentation is introduced. Stochastic tree grammar is used as a context-generating model. It is shown that when this context-generating process is in the equilibrium state, a matched filter can be designed and applied in parallel to the image. This process can be used for image segmentation in a syntactic pattern recognition system to enhance the performance of the succeeding recognition process.