Similar Documents
20 similar documents found (search time: 203 ms)
1.
This paper gives hypercube algorithms for some simple problems involving geometric properties of sets of points. The properties considered emphasize aspects of convexity and domination. Efficient algorithms are given for both fine- and medium-grain hypercube computers, including a discussion of implementation, running times and results on an Intel iPSC hypercube, as well as theoretical results. For both serial and parallel computers, sorting plays an important role in geometric algorithms for determining simple properties, often being the dominant component of the running time. Since the time required to sort data on a hypercube computer is still not fully understood, the running times of some of our algorithms for unsorted data are not completely determined. For both the fine- and medium-grain models, we show that faster expected-case running time algorithms are possible for point sets generated randomly. Our algorithms are developed for sets of planar points, with several of them extending to sets of points in spaces of higher dimension.

The research of E. Cohen, R. Miller, and E. M. Sarraf was partially supported by National Science Foundation Grant ASC-8705104. R. Miller was also partially supported by National Science Foundation Grants DCR-8608640 and IRI-8800514. Q. F. Stout's research was partially supported by National Science Foundation Grant DCR-85-07851, and an Incentives for Excellence Grant from the Digital Equipment Corporation.

2.
Although mesh-connected computers are used almost exclusively for low-level local image processing, they are also suitable for higher level image processing tasks. We illustrate this by presenting new optimal (in the O-notational sense) algorithms for computing several geometric properties of figures. For example, given a black/white picture stored one pixel per processing element in an n × n mesh-connected computer, we give Θ(n) time algorithms for determining the extreme points of the convex hull of each component, for deciding if the convex hull of each component contains pixels that are not members of the component, for deciding if two sets of processors are linearly separable, for deciding if each component is convex, for determining the distance to the nearest neighboring component of each component, for determining internal distances in each component, for counting and marking minimal internal paths in each component, for computing the external diameter of each component, for solving the largest empty circle problem, for determining internal diameters of components without holes, and for solving the all-points farthest point problem. Previous mesh-connected computer algorithms for these problems were either nonexistent or had worst case times of Θ(n²). Since any serial computer has a best case time of Θ(n²) when processing an n × n image, our algorithms show that the mesh-connected computer provides significantly better solutions to these problems.

3.
Efficient Collective Communications in Dual-Cube
The hypercube, or n-cube, has been widely used as the interconnection network in parallel computers. However, the major drawback of the hypercube is that the number of communication links per node grows with the total number of nodes in the system. This paper introduces a new interconnection network, namely the dual-cube, for large-scale parallel computers and describes algorithms for efficient collective communications in the dual-cube. The dual-cube network mitigates the problem of the growing number of links in a large-scale hypercube network while retaining the hypercube's topological properties. The design of efficient routing algorithms for collective communications is the key issue for any interconnection network. In this paper, we show that collective communications can be done in the dual-cube in almost the same communication time as in the hypercube.
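For a rough feel of the dual-cube's structure, here is a minimal sketch assuming the commonly described dual-cube addressing: a class bit plus two m-bit fields, with cube links inside one field per class and a single cross link between classes. This layout is my assumption for illustration, not necessarily this paper's notation.

```python
def dualcube_neighbors(addr: int, m: int) -> list[int]:
    """Neighbors of a node in a dual-cube DC(m) with 2**(2m+1) nodes.

    Assumed address layout: bit 2m is the class bit c; bits [m, 2m)
    are the field u; bits [0, m) are v. Class-0 nodes keep cube links
    inside the low field v, class-1 nodes inside the high field u;
    one cross link flips only the class bit.
    """
    c = (addr >> (2 * m)) & 1
    nbrs = []
    for i in range(m):
        # Cube edge: flip one bit of v (class 0) or of u (class 1).
        bit = i if c == 0 else m + i
        nbrs.append(addr ^ (1 << bit))
    # Cross edge between the two classes: flip the class bit.
    nbrs.append(addr ^ (1 << (2 * m)))
    return nbrs

print(dualcube_neighbors(0b0_01_10, 2))  # node (c=0, u=01, v=10) -> [7, 4, 22]
```

Under this layout each node has m + 1 links, while a flat hypercube on the same 2^(2m+1) nodes needs 2m + 1 links per node, which is the saving the abstract refers to.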

4.
This paper investigates which pivots may be processed simultaneously when solving a set of linear equations. It is shown that for dense sets of equations all the pivots must necessarily be processed one at a time; only if the set is sufficiently sparse may some pivots be processed simultaneously. We present parallel pivoting algorithms for MIMD computers with sufficiently many processors and a common memory. Moreover, we present algorithms for MIMD computers with an arbitrary, but fixed, number of processors. For both types of computers, algorithms embodying an ordering strategy are given.
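To make the dense-versus-sparse observation concrete, here is a small sketch using a compatibility criterion common in the parallel-pivoting literature (diagonal pivots i and j are independent when A[i][j] = A[j][i] = 0); the paper's own ordering strategy may differ.

```python
def compatible_pivots(A):
    """Greedily pick diagonal pivots that can be eliminated in parallel.

    Pivots i and j are compatible when A[i][j] == A[j][i] == 0, so
    neither elimination step touches the other's pivot row or column.
    In a dense matrix no two pivots are compatible, matching the
    paper's observation that dense pivots go one at a time.
    """
    n = len(A)
    chosen = []
    for i in range(n):
        if A[i][i] != 0 and all(A[i][j] == 0 and A[j][i] == 0 for j in chosen):
            chosen.append(i)
    return chosen

dense = [[1, 2], [3, 4]]
sparse = [[2, 0, 1], [0, 3, 0], [0, 0, 4]]
print(compatible_pivots(dense))   # [0]     -- dense: one pivot at a time
print(compatible_pivots(sparse))  # [0, 1]  -- sparse: two pivots in parallel
```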

5.
This article presents PFCM, a parallel algorithm for fuzzy clustering of large data sets. Being a generalization of FCM, the algorithm enables arbitrary numbers of data points, features and clusters to be handled cost-optimally by hypercube SIMD computers of arbitrary cube dimension, the only limitation being the size of the local memories of the processors. Speedup responds optimally to enlarging the hypercube. PFCM owes its flexibility to the technique employed in its derivation from the sequential fuzzy C-means algorithm FCM: the association of each of the three dimensions of the problem (numbers of data points, features and clusters) with a distinct subset of hypercube dimensions.
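For context, here is one iteration of the sequential fuzzy C-means algorithm that PFCM parallelizes, in the standard textbook form with fuzzifier m. This is generic FCM, not the paper's parallel decomposition.

```python
import numpy as np

def fcm_step(X, centers, m=2.0):
    """One fuzzy C-means iteration: update memberships, then centers.

    X: (n, d) data, centers: (c, d). Standard FCM updates:
        u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        v_i  = sum_k u_ik^m x_k / sum_k u_ik^m
    """
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # (n, c)
    d = np.fmax(d, 1e-12)  # guard against a point sitting exactly on a center
    u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
    w = u ** m
    centers = (w.T @ X) / w.sum(axis=0)[:, None]
    return u, centers

rng = np.random.default_rng(0)
X = rng.random((100, 3))
u, centers = fcm_step(X, X[rng.choice(100, size=4, replace=False)])
print(u.shape, centers.shape)  # (100, 4) (4, 3)
```

PFCM's contribution, per the abstract, is to distribute the three dimensions these array operations range over (data points, features, clusters) across distinct subsets of hypercube dimensions.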

6.
Squared error clustering algorithms for single-instruction multiple-data (SIMD) hypercubes are presented. The algorithms are shown to be asymptotically faster than previously known algorithms and require less memory per processing element (PE). For a clustering problem with N patterns, M features per pattern, and K clusters, the algorithms complete in O(K + log NM) steps on NM processor hypercubes. This is optimal up to a constant factor. These results are extended to the case in which NMK processors are available. Experimental results from a multiple-instruction, multiple-data (MIMD) medium-grain hypercube are also presented.

7.
An important aspect of database processing in parallel computer systems is the use of data parallel algorithms. Several parallel algorithms for the relational database join operation in a hypercube multicomputer system are given. The join algorithms are classified as cycling or global partitioning based on the tuple distribution method employed. The various algorithms are compared under a common framework, using time complexity analysis as well as an implementation on a 64-node NCUBE hypercube system. In general, the global partitioning algorithms demonstrate better speedup. However, the cycling algorithm can perform better than the global algorithms in specific situations, viz., when the difference in input relation cardinalities is large and the hypercube dimension is small. The usefulness of the data redistribution operation in improving the performance of the join algorithms, in the presence of uneven data partitions, is examined. The results indicate that redistribution significantly decreases the join algorithm execution times for unbalanced partitions.
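As a toy illustration of the global-partitioning strategy described above, here is a sequential sketch of a generic hash-partitioned join (my stand-in code, not the paper's NCUBE implementation; the relation and attribute names are invented): tuples of both relations are routed to the "processor" given by a hash of the join attribute, after which each processor joins only its local partitions.

```python
from collections import defaultdict

def partitioned_join(R, S, key_r, key_s, p):
    """Hash-partition both relations over p 'processors', then join locally.

    R, S: lists of dict tuples; key_r/key_s: join attribute names.
    Matching tuples always hash to the same partition, so no
    cross-partition communication is needed during the local joins.
    """
    parts_r, parts_s = defaultdict(list), defaultdict(list)
    for t in R:
        parts_r[hash(t[key_r]) % p].append(t)   # route tuple to its node
    for t in S:
        parts_s[hash(t[key_s]) % p].append(t)
    out = []
    for node in range(p):                        # each node joins locally
        index = defaultdict(list)
        for r in parts_r[node]:
            index[r[key_r]].append(r)
        for s in parts_s[node]:
            for r in index[s[key_s]]:
                out.append({**r, **s})
    return out

R = [{"id": i, "a": i * 10} for i in range(4)]
S = [{"id": i % 2, "b": i} for i in range(4)]
print(len(partitioned_join(R, S, "id", "id", p=4)))  # 4 matching pairs
```

Skew in the join attribute produces the uneven partitions the abstract mentions, which is what the redistribution step is there to repair.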

8.
Parallel Computing, 1988, 7(1): 1-10
One of the desirable aspects of a hypercube is that many other interconnection topologies are contained within it. Two commonly used topologies are the ring and the two-dimensional grid. The mapping of these topologies onto the hypercube is straightforward, but in the FPS T-Series hypercube, algorithms using the standard mapping based on binary reflective Gray codes will not perform well. This is because the standard mapping requires the use of communication links that, in some of the nodes, cannot be communicated on simultaneously. In such nodes, a very time consuming reset of the link configuration is necessary between every use of the conflicting links. For many algorithms, this results in a large overhead and degrades performance. In this paper it is shown how to configure the links in each node once to map a ring and grid onto the T-Series, thereby eliminating the overhead of resetting the links repeatedly during execution. The mappings are extended to a more general class of hypercube called modulus link-bounded hypercubes, and various properties of the mappings are presented.
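For reference, the standard binary reflected Gray code mapping that the abstract refers to, as a quick sketch (the T-Series link-configuration workaround itself is hardware-specific and not reproduced here): consecutive ring positions map to hypercube nodes that differ in exactly one bit, so each ring hop is a single link traversal.

```python
def gray(i: int) -> int:
    """Binary reflected Gray code: consecutive values differ in one bit."""
    return i ^ (i >> 1)

def ring_to_hypercube(n_dims: int) -> list[int]:
    """Map a ring of 2**n_dims nodes onto an n_dims-cube."""
    return [gray(i) for i in range(2 ** n_dims)]

def grid_to_hypercube(rows_dims: int, cols_dims: int, r: int, c: int) -> int:
    """Map grid point (r, c) onto a (rows_dims + cols_dims)-cube by
    Gray-coding each coordinate independently; grid neighbors again
    differ in exactly one address bit."""
    return (gray(r) << cols_dims) | gray(c)

ring = ring_to_hypercube(3)
print(ring)  # [0, 1, 3, 2, 6, 7, 5, 4]; each adjacent pair differs in one bit
assert all(bin(a ^ b).count("1") == 1 for a, b in zip(ring, ring[1:] + ring[:1]))
```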

9.
Implementation of Parallel Reservoir Simulation Software and Its Application on Domestic High-Performance Computers
This paper describes the application of fine-grained reservoir numerical simulation at the million-grid-point scale on domestically produced high-performance parallel computers and PC cluster systems. For several sets of real million-grid-point data from domestic oil fields, running results under a variety of domestic parallel-machine environments are presented, analyzed, and evaluated. On this basis, the key techniques encountered in the efficient implementation of parallel reservoir simulation software are discussed, along with the bottlenecks commonly encountered when parallelizing large software systems and approaches to improving them.

10.
The traditional worst-case analysis often fails to predict the actual behavior of the running time of geometric algorithms in practical situations. One reason is that worst-case scenarios are often very contrived and do not occur in practice. To avoid this, models are needed that describe the properties that realistic inputs have, so that the analysis can take these properties into account. We try to bring some structure to this emerging research direction. In particular, we present the following results:
• We show the relations between various models that have been proposed in the literature.
• For several of these models, we give algorithms to compute the model parameter(s) for a given (planar) scene; these algorithms can be used to verify whether a model is appropriate for typical scenes in some application area.
• As a case study, we give some experimental results on the appropriateness of some of the models for one particular type of scene often encountered in geographic information systems, namely certain triangulated irregular networks.

11.
Milos, Amiya, Jovisa. Pattern Recognition, 2008, 41(8): 2503-2511
Our goal is to design algorithms that give a linearity measure for planar point sets. There is no explicit discussion of linearity in the literature, although some existing shape measures may be adapted. We are interested in linearity measures which are invariant to rotation, scaling, and translation. These linearity measures should also be calculated very quickly and be resistant to protrusions in the data set. The measures of eccentricity and contour smoothness were adapted from the literature; the other five are triangle heights, triangle perimeters, rotation correlation, average orientations, and ellipse axis ratio. The algorithms are tested on 30 sample curves and the results are compared against the linear classifications of these curves by human subjects. It is found that humans and computers typically easily identify sets of points that are clearly linear, and sets of points that are clearly not linear; both have trouble with sets of points in the gray area in between. Although they appear to be conceptually very different approaches, we prove, theoretically and experimentally, that eccentricity and rotation correlation yield exactly the same linearity measurements. They, however, provide the results furthest from human measurements. The average orientations method provides the closest results to human perception, while the other algorithms proved to be very competitive.
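As a sketch of one of the adapted measures, here is a moment-based eccentricity score (my reading of the standard second-moment formulation; the paper's exact normalization may differ): nearly collinear points have one dominant principal axis, so the ratio of covariance eigenvalues measures linearity.

```python
import numpy as np

def linearity_eccentricity(points: np.ndarray) -> float:
    """Score in [0, 1] from second central moments: 1 for perfectly
    collinear points, near 0 for isotropic clouds. Invariant to
    rotation, scaling, and translation, as the abstract requires.
    Assumes the points are not all coincident.
    """
    cov = np.cov(points.T)                          # 2x2 covariance matrix
    lam_max, lam_min = sorted(np.linalg.eigvalsh(cov), reverse=True)
    return 1.0 - lam_min / lam_max

line = np.column_stack([np.arange(10.0), 2 * np.arange(10.0) + 1])
blob = np.random.default_rng(0).normal(size=(200, 2))
print(linearity_eccentricity(line))  # ~1.0 (collinear)
print(linearity_eccentricity(blob))  # close to 0 (isotropic)
```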

12.
13.
Parameterization provides degrees of freedom for constructing B-spline interpolation curves, but in previous studies these degrees of freedom have not been fully exploited. The quadratic B-spline curve interpolation method presented in this paper makes full use of the freedom of parameterization: it parameterizes directly from intuitive geometric constraints on the interpolating curve, such as the tangent directions at the data points and the relative heights of the curve segments, so that the constructed curve has the expected geometric properties not only at the two ends but also on every intermediate segment. Compared with previous parameterization methods, this method controls the shape of the interpolating curve more intuitively and effectively. Moreover, the constructed curve is local or approximately local: when the position of a data point is changed, the shape of the curve changes only locally, or changes very little or not at all outside a local region. Unlike previous interpolation methods, this method determines the parameter values, knot vector, and control points dynamically and recursively from the curve's geometric constraints during construction; the whole process requires no solving of systems of equations and is computationally simple. The paper also gives the corresponding algorithm and application examples. Experimental results show that the method is highly effective.
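For background, here is a quadratic B-spline curve evaluated by de Boor's algorithm, the standard machinery such a method builds on (the paper's actual contribution, the constraint-driven choice of parameters, knots, and control points, is not reproduced here; the example knot vector and control points are mine):

```python
import numpy as np

def de_boor(t: float, knots, ctrl, p: int = 2):
    """Evaluate a degree-p B-spline curve at parameter t (de Boor's
    algorithm). knots: nondecreasing knot vector; ctrl: (m, dim)
    control points with m = len(knots) - p - 1.
    """
    # Find the knot span k with knots[k] <= t < knots[k+1].
    k = np.searchsorted(knots, t, side="right") - 1
    k = min(max(k, p), len(ctrl) - 1)
    d = [np.asarray(ctrl[j], dtype=float) for j in range(k - p, k + 1)]
    for r in range(1, p + 1):
        for j in range(p, r - 1, -1):
            i = j + k - p
            alpha = (t - knots[i]) / (knots[i + p - r + 1] - knots[i])
            d[j] = (1.0 - alpha) * d[j - 1] + alpha * d[j]
    return d[p]

# Clamped quadratic B-spline on 4 control points: interpolates the ends.
knots = [0, 0, 0, 1, 2, 2, 2]
ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]
print(de_boor(0.0, knots, ctrl))  # [0. 0.]  (first control point)
print(de_boor(2.0, knots, ctrl))  # [4. 0.]  (last control point)
```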

14.
Comparison of Distance Measures for Planar Curves
The Hausdorff distance is a very natural and straightforward distance measure for comparing geometric shapes like curves or other compact sets. Unfortunately, it is not an appropriate distance measure in some cases. For this reason, the Fréchet distance has been investigated for measuring the resemblance of geometric shapes which avoids the drawbacks of the Hausdorff distance. Unfortunately, it is much harder to compute. Here we investigate under which conditions the two distance measures approximately coincide, i.e., the pathological cases for the Hausdorff distance cannot occur. We show that for closed convex curves both distance measures are the same. Furthermore, they are within a constant factor of each other for so-called κ-straight curves, i.e., curves where the arc length between any two points on the curve is at most a constant κ times their Euclidean distance. Therefore, algorithms for computing the Hausdorff distance can be used in these cases to get exact or approximate computations of the Fréchet distance, as well.
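For concreteness, here is the Hausdorff distance on finite point samples of two curves, per its standard definition (the Fréchet distance, which additionally respects ordering along the curves, is much harder and is not sketched here):

```python
import math

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two finite point sets:
    the max over each point of its distance to the nearest point of
    the other set. O(|A| * |B|) brute force, fine for small samples.
    """
    def directed(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))

# Two samples of nearby closed convex curves: the Hausdorff distance is
# small, and by the paper's result the Frechet distance agrees exactly
# for closed convex curves.
circle  = [(math.cos(2 * math.pi * t / 10), math.sin(2 * math.pi * t / 10))
           for t in range(10)]
circle2 = [(1.1 * x, 1.1 * y) for x, y in circle]
print(round(hausdorff(circle, circle2), 3))  # 0.1
```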

15.
Consideration is given to the problem of mapping systolic array algorithms into efficient algorithms for a fixed-size hypercube architecture. The authors describe in detail several optimal implementations of algorithms given for one-way one- and two-dimensional systolic arrays. Since interprocessor communication is many times slower than local computation in parallel computers built to date, the problem of efficient communication is specifically addressed for these mappings. In order to validate the technique experimentally, five systolic algorithms were mapped in various ways onto a 64-node NCUBE/7 MIMD hypercube machine. The algorithms are for the following problems: the shuffle scheduling problem, finite impulse response filtering, linear context-free language recognition, matrix multiplication, and computing the Boolean transitive closure. Experimental evidence indicates that good performance is obtained for the mappings.

16.
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can be parallelized effectively on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower-than-linear speedups are achieved with optimized (machine-dependent library) routines. This slower-than-linear speedup results because the computational cost is dominated by the FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52–56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a real-world simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application.

17.
Parallel computers are having a profound impact on computational science. Recently highly parallel machines have taken the lead as the fastest supercomputers, a trend that is likely to accelerate in the future. We describe some of these new computers, and issues involved in using them. We present elliptic PDE solutions currently running at 3.8 gigaflops, and an atmospheric dynamics model running at 1.7 gigaflops, on a 65 536-processor computer.

One intrinsic disadvantage of a parallel machine is the need to perform inter-processor communication. It is important to ensure that such communication time is maintained at a small fraction of computation time. We analyze standard multigrid algorithms in two and three dimensions from this point of view, indicating that performance efficiencies in excess of 95% are attainable under suitable conditions on moderately parallel machines. We also demonstrate that such performance is not attainable for multigrid on massively parallel computers, as indicated by an example of poor multigrid efficiency on 65 536 processors. The fundamental difficulty is the inability to keep 65 536 processors busy when operating on very coarse grids.
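A back-of-the-envelope illustration of why coarse grids hurt (my arithmetic, consistent with the abstract's argument, assuming one finest-grid point per processor on a 256 × 256 grid): after a few coarsenings the grid has fewer points than processors, so almost all processors idle.

```python
# Fraction of 65,536 processors kept busy per multigrid level, with a
# 256 x 256 finest grid and one grid point per processor at level 0.
P = 65_536
for level in range(9):
    n = 256 >> level                    # grid is n x n at this level
    busy = min(n * n, P)
    print(f"level {level}: {n:>3} x {n:<3} grid -> {100 * busy / P:6.2f}% busy")
```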

Most algorithms used for implementing applications on parallel machines have been derived directly from algorithms designed for serial machines. The previously mentioned multigrid example indicates that such ‘parallelized’ algorithms may not always be optimal. Parallel machines open the possibility of finding totally new approaches to solving standard tasks: intrinsically parallel algorithms. In particular, we present a class of superconvergent multiple scale methods that were motivated directly by massively parallel machines. These methods differ from standard multigrid methods in an intrinsic way, and allow all processors to be used at all times, even when processing on the coarsest grid levels. Their serial versions are not sensible algorithms. The idea that parallel hardware (the Connection Machine in this case) can lead to the discovery of new mathematical algorithms was surprising to us.


18.
Properties and performance of folded hypercubes
A new hypercube-type structure, the folded hypercube (FHC), which is basically a standard hypercube with some extra links established between its nodes, is proposed and analyzed. The hardware overhead is almost 1/n, n being the dimensionality of the hypercube, which is negligible for large n. For this new design, optimal routing algorithms are developed and proven to be remarkably more efficient than those of the conventional n-cube. For one-to-one communication, each node can reach any other node in the network in at most ⌈n/2⌉ hops (each hop corresponds to the traversal of a single link), as opposed to n hops in the standard hypercube. One-to-all communication (broadcasting) can also be performed in only ⌈n/2⌉ steps, yielding a 50% improvement in broadcasting time over that of the standard hypercube. All routing algorithms are simple and easy to implement. Correctness proofs for the algorithms are given. For the proposed architecture, communication parameters such as average distance, message traffic density, and communication time delay are derived. In addition, some fault tolerance capabilities of this architecture are quantified and compared to those of the standard cube. It is shown that this structure offers substantial improvement over existing hypercube-type networks in terms of the above-mentioned network parameters.
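A sketch of why ⌈n/2⌉ hops suffice, following the standard argument (my code, not the paper's routing algorithm): the FHC adds to each node a complementary link to the node with all address bits flipped, so when source and destination differ in more than n/2 bits, taking the complementary link first leaves fewer than n/2 bits to fix, one per hop.

```python
def fhc_route(src: int, dst: int, n: int) -> list[int]:
    """Greedy one-to-one route in a folded n-cube: use the extra
    complement link when the Hamming distance exceeds n/2, then flip
    the remaining differing bits one hop at a time. The resulting
    path length is at most ceil(n / 2).
    """
    mask = (1 << n) - 1
    path, cur = [src], src
    if bin((cur ^ dst) & mask).count("1") > n // 2:
        cur ^= mask                     # complementary link: flip all n bits
        path.append(cur)
    diff = (cur ^ dst) & mask
    while diff:
        bit = diff & -diff              # lowest differing bit
        cur ^= bit                      # ordinary hypercube link
        path.append(cur)
        diff ^= bit
    return path

# In a plain 4-cube, 0000 -> 1111 takes 4 hops; in the folded 4-cube, 1 hop.
print(fhc_route(0b0000, 0b1111, 4))    # [0, 15]
print(fhc_route(0b0000, 0b0011, 4))    # [0, 1, 3]: ordinary links suffice
```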

19.
In sequential systems, hash tables yield almost constant time performance for single element accesses. However, in massively parallel systems, we need to consider a large number of parallel accesses. Consequently, the potential queueing delay as well as the communication overhead can alter the relative performance of hash algorithms. Thus, it is necessary to reevaluate the performance of conventional hash algorithms and investigate new algorithms that exploit the parallelism without suffering from excessive communication overheads or queueing delays. In this paper, we first study the performance of data parallel hash algorithms with conventional collision resolution strategies. For SIMD/SPMD hypercube systems, neither linear probing nor double hashing yields satisfactory performance. Thus, we develop a new collision resolution strategy, namely, hypercube hashing. Hypercube hashing combines the randomness provided in double hashing with the low communication cost inherited from linear probing to yield better performance. We also investigate efficient implementation of the chaining algorithm in data parallel systems and its performance. From the simulation results, hypercube hashing significantly outperforms the other open addressing strategies in all cases (under the assumption of random input key space). For high load factors, chaining performs better than hypercube hashing. However, with a low load factor, hypercube hashing significantly outperforms the chaining algorithm.
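For reference, here are the two conventional open-addressing probe sequences the paper reevaluates, in their standard textbook forms (the new hypercube hashing strategy itself is not specified in the abstract, so it is not sketched; the hash functions below are illustrative choices):

```python
from itertools import islice

def linear_probe(key: int, table_size: int):
    """Linear probing: consecutive slots, so one key's probes stay
    local (cheap communication on a parallel machine) but occupied
    keys clump together (primary clustering)."""
    h = key % table_size
    for i in range(table_size):
        yield (h + i) % table_size

def double_hash_probe(key: int, table_size: int):
    """Double hashing: a second hash sets the stride, spreading the
    probe sequence (less clustering, but scattered probes cost more
    communication on a parallel machine)."""
    h = key % table_size
    step = 1 + (key // table_size) % (table_size - 1)
    for i in range(table_size):
        yield (h + i * step) % table_size

print(list(islice(linear_probe(42, 16), 4)))       # [10, 11, 12, 13]
print(list(islice(double_hash_probe(42, 16), 4)))  # [10, 13, 0, 3]
```

The abstract's hypercube hashing aims at the middle ground: probe sequences with double-hashing-like randomness whose successive probes remain cheap to reach, as in linear probing.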

20.
The hypercube is one of the most widely used topologies because it provides a small diameter and embeds various interconnection networks. For very large systems, however, the number of links needed with the hypercube may become prohibitively large. In this paper, we propose a hierarchical interconnection network based on hypercubes, called the hierarchical hypercube network (HHN), for massively parallel computers. The HHN has a smaller number of links than the comparable hypercube; in particular, when we construct networks with 2^K nodes, the node degree of the HHN with the minimum node degree is O([formula]) while that of the hypercube is O(K). Despite its smaller node degree, many parallel algorithms can be executed on the HHN with the same time complexity as on the hypercube.
