首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
The paper presents a simple and effective method for the concurrent manipulation of linearly ordered data structures on hypercube systems. The method Is based on the existence of an augmented binomial search tree, called the pruned binomial tree, rooted at any arbitrary processor node of the hypercube such that; every edge of the tree corresponds to a direct link between a pair of hypercube nodes; and the tree spans any arbitrary sequence of n consecutive nodes containing the root, using a fanout of at most [log2 n] and a depth of at most [log2 n]+1. Search trees spanning nonoverlapping processor lists are formed using only local information, and can be used concurrently without contention problems. Thus, they can be used for performing operations such as broadcast and merge simultaneously on sets with nonuniform sizes. Extensions of the tree to k-ary n-cubes and faulty hypercubes are presented. Applications of this concurrent data structure to low- and intermediate-level image processing algorithms, and for dictionary operations involving multiple keys, are also outlined  相似文献   

2.
We give an optimal algorithm that broadcasts on an n-dimensional hypercube in O(n/ log2 (n+1)) routing steps with wormhole, e-cube routing and all-port communication. Previously, the best algorithm of P.K. McKinley and C. Trefftz (1993) requires [n/2] routing steps. We also give routing algorithms that achieve tight time bounds for n ⩽7  相似文献   

3.
This paper presents efficient algorithms that implement one-to-many, or multicast, communication in wormhole-routed torus networks. By exploiting the properties of the switching technology and the use of virtual channels, a minimum-time multicast algorithm is presented for n-dimensional torus networks that use deterministic, dimension-ordered routing of unicast messages. The algorithm can deliver a multicast message to m-1 destinations in [log2 m] message-passing steps, while avoiding contention among the constituent unicast messages. Performance results of a simulation study on torus networks with up to 4096 nodes are also given  相似文献   

4.
Based on the V.E. Mendia and D. Sarkar's algorithm (1992), we propose an optimal and nonredundant distributed broadcasting algorithm in star graphs. For an n-dimensional star graph, our algorithm takes O(n log2 n) time and guarantees that all nodes in the star graph receive the message exactly once. Moreover, broadcasting m packets in a pipeline fashion takes O(m log2 n+n log2 n) time due to the nonredundant property of our broadcasting algorithm  相似文献   

5.
This paper presents a family of algorithms for producing, from (υ0, υ1, ..., υn-1), all initial prefixes xi0&thetas;υ1 &thetas;···&thetas;υi (i=0, 1, ..., n-1) in parallel in interconnection networks such as the omega network and the hypercube, where &thetas; is an associative binary operator. Each algorithm can be embedded in the switches and interconnections of the network, and can be executed in O((log2 r+1) logr n) time steps provided that the network connecting n processors is constructed by using an r×r switch, and that parallelism within as well as among individual switches is exploited. The objective of these algorithms is to attain a communication pattern that fits the topology of the network. One type of network can be made equivalent to, or can be embedded in, another type of network, so a family of algorithms can be derived from one basic algorithm. In the basic algorithm, every processor pi upward multicasts υi to processors pk (k=i+1, i+2, ..., n - 1). En route to pi, υj (j=0, 1, ..., i - 1) are combined in the switches to produce the (i - 1)th initial prefix xi-1 that is received by pi, which can then compute the ith initial prefix xi=xi-1&thetas;υi  相似文献   

6.
Richard Hart 《Software》1980,10(5):405-417
A method for inventing algorithms and several examples are presented. The Julian conversion is useful for all date calculations and for switching to daylight saving time. The algorithm rotates the calendar by 2 months and performs a few straightforward calculations on the new month-day-year to produce the Julian date (and back again). The butterfly sort is based on the work on Woodrum. The algorithm repeatedly delivers two linked lists to a merge machine that produces a single linked list for further use. The length of the two lists going into the merge machine never differs by more than one. This accounts for the theoretical minimal number of comparisons5. The algorithm uses patterns within a simple binary counter to schedule the delivery of these lists. This accounts for the minimal number of steps between each comparison3.  相似文献   

7.
A practical interconnection network RP(k) and its routing algorithms   总被引:8,自引:0,他引:8  
Based on Petersen graph, a new interconnection network, the RP(k) network, is devel-oped and the properties of the RP(k) network are investigated. The diameter of the RP(k) network is [ k/2] + 2 and its degree is 5. We prove that the diameter of the RP(k) network is much smaller than that of the 2-D Torus network when the number of nodes in interconnection networks is less than or equal to 300. In order to analyze the communication performance in a group of nodes, we propose the concepts of the optimal node groups and the diameter of the optimal node groups. We also show that the diameter of the optimal node groups in the RP(k) network is less than that in the 2-D Torus net-work. Especially when the number of nodes in an optimal node group is between 6 and 100, the diam-eter of the optimal node groups in the RP(k) network is half of that in the 2-D Torus network. Further-more based on the RP(k) network we design a set of routing algorithms which are point-to-point rout-ing, permutation routing, one-to-al  相似文献   

8.
并行归并排序算法   总被引:3,自引:0,他引:3  
构造效率为O(1)的并行算法是一个引人注目的问题。[1]和[2]分别提出了并行度为O(logn)和O(n^1/2)的、效率为O(1)的并行排序算法。本文提出一种新的并行排序算法,其效率为O(1),而并行步数小于[1]和[2]的算法的并行步数。经过改进后,在保持效率为O(1)的情况下,可进一步将并行度扩大到O(n^1/2log n)。  相似文献   

9.
We consider the problems of routing and sorting on a de Bruijn network. First, we show that any deterministic oblivious routing scheme for permutation routing on a d-ary de Bruijn network with N=dn nodes, in the worst case, will take Ω(√N) steps under the single-port model. This improves the existing lower bounds provided d is not a constant. We also show that the lower bound is indeed a tight one. Second, we present a deterministic nonoblivious permutation routing algorithm which runs in O(d.n2) time on a d-ary de Bruijn network with N=dn nodes. This algorithm is currently the fastest known nonoblivious deterministic routing algorithm for de Bruijn networks of arbitrary degree. Finally, we present an efficient general sorting algorithm for the de Bruijn networks of arbitrary degree. This algorithm is the best sorting algorithm known so far. It runs in O((log d).d.n2) time for directed de Bruijn network with dn nodes, degree d, and diameter n. As a corollary, we show that on a binary de Bruijn network of Nnodes, our sorting scheme requires at most 2 log2 Nsteps  相似文献   

10.
We generalize directed loop networks to loop-symmetric networks in which there are N nodes and in which each node has in-degree and out-degree k, subject to the condition that 2k does not exceed N. We show that by proper selection of links one can obtain generalized loop networks with optimal or close to optimal diameter and connectivity. The optimized diameter is less than k[N1k/], where [x] indicates the ceiling of x. We also show that these networks are rather compact in that the diameter is not more than twice the average distance. Roughly 1/2(k-1)N1k/ nodes can be removed such that the network of remaining nodes is still strongly connected, if all remaining nodes have at least one incoming and one outgoing link left  相似文献   

11.
This paper concerns noncausal, nonlinear discrete singular systems of the formaX_{k+1} + Bx_{k} = m[x]_{k}on the infinite intervali = [0, 1, --.]for some causal, nonlinear time variable operatorsm. Existence theorems for bounded solutions are given. Methods for estimating the infinite time problem by a finite time problem are developed.  相似文献   

12.
The problem of merging k (k⩾2) sorted lists is considered. We give an optimal parallel algorithm which takes O((n log k/p)+log n) time using p processors on a parallel random access machine that allows concurrent reads and exclusive writes, where n is the total size of the input lists. This algorithm achieves O(log n) time using p=n log k/log n processors. Most of the previous log n research for this problem has been focused on the case when k=2. Very recently, parallel solutions for the case when k=2 have been reported. Our solution is the first logarithmic time optimal parallel algorithm for the problem when k⩾2. It can also be seen as a unified optimal parallel algorithm for sorting and merging. In order to support the algorithm, a new processor assignment strategy is also presented  相似文献   

13.
We generalize the well-known odd-even merge sorting algorithm, originally due to Batcher (1968), and show how this generalized algorithm can be applied to sorting on product networks. If G is an arbitrary factor graph with N nodes, its r-dimensional product contains Nr nodes. Our algorithm sorts Nr keys stored in the r-dimensional product of G in O(rrF(N)) time, where F(N) depends on G. We show that, for any factor graph G, F(N) is, at most, O(N), establishing an upper bound of O(r2 N) for the time complexity of sorting Nr keys on any product network. For product networks with bounded r(e.g. for grids), this leads to the asymptotic complexity of O(N) to sort Nr keys, which is optimal for several instances of product networks. There are factor graphs for which F(N)=O(log2 N), which leads to the asymptotic running time of O(log2 N) to sort Nr keys. For networks with bounded N (e.g. in the hypercube N=2, fixed), the asymptotic complexity becomes O(r2). We show how to apply the algorithm to several cases of well-known product networks, as well as others introduced recently. We compare the performance of our algorithm to well-known algorithms developed specifically for these networks, as well as others. The result of these comparisons led us to conjecture that the proposed algorithm is probably the best deterministic algorithm that can be found in terms of the low asymptotic complexity with a small constant  相似文献   

14.
针对Top-k dominating查询算法需要较高的时空消耗来构建属性组合索引,并且在相同属性值较多情况下的查询结果准确率低等问题,提出一种通过B+-trees和概率分布模型相结合的子空间支配查询算法--Ranking-k算法.首先,采用B+-trees为待查找数据各属性构建有序列表;然后,采取轮询调度算法读取skyline准则涉及到的有序列表,生成候选元组并获得k组终结元组;其次,根据生成的候选元组和终结元组,采用概率分布模型计算终结元组支配分数.迭代上述过程优化查询结果,直到满足条件为止.实验结果表明:Ranking-k与基本扫描算法(BSA)相比,查询效率提高了94.43%;与差分算法(DA)相比,查询效率提高了7.63%;与早剪枝Top-k支配(TDEP)算法、BSA和DA相比,查询结果更接近理论值.  相似文献   

15.
Sorting networks of fixed I/O size p have been used, thus far, for sorting a set of p elements. Somewhat surprisingly, the important problem of using such a sorting network for sorting arbitrarily large datasets has not been addressed in the literature. Our main contribution is to propose a simple sorting architecture whose main feature is the pipelined use of a sorting network of fixed I/O size p to sort an arbitrarily large data set of N elements. A noteworthy feature of our design is that no extra data memory space is required, other than what is used for storing the input. As it turns out, our architecture is feasible for VLSI implementation and its time performance is virtually independent of the cost and depth of the underlying sorting network. Specifically, we show that by using our design N elements can be sorted in Θ(N/p log N/p) time without memory access conflicts. Finally, we show how to use an AT2-optimal sorting network of fixed I/O size p to construct a similar architecture that sorts N elements in Θ(N/p log N/p log p) time  相似文献   

16.
并行计算机系统互连网络的拓扑性质对系统功能的实现起着重要的作用。为了衡量基于[k]元[n]方体网络构建的并行计算机系统的容错能力,研究了边故障模型下[k]元[n]方体网络中[k]元[(n-1)]方体子网络的可靠性。当[k(k≥3)]为奇数时,分别在固定划分模式和灵活划分模式下得出了[k]元[n]方体网络中不同数目的[k]元[(n-1)]方体子网络保持无故障状态的平均失效时间的计算公式,并通过仿真实验验证了理论结果的精确性。研究表明,当[k]为奇数的[k]元[n]方体网络中有边故障发生时,相比固定划分模式,在灵活划分模式下不同数目的[k]元[(n-1)]方体子网络保持无故障状态的平均失效时间更大。  相似文献   

17.
分治策略的思想是将一个规模较大的问题分解为多个形式相同的子问题来解决。搜索是指在一个排好序的数组中寻找与给定数值x相等的元素,传统的搜索算法是遍历,而二分搜索是一种基于分治策略的搜索算法。二分搜索是将数组每次分为相等的两部分,将待查元素x与数组中间的元素比较,若相等则搜索成功;否则将搜索范围缩小为原来的一半,之后以此类推,直到找到待查元素,与遍历相比,二分搜索复杂度明显降低。以二分搜索为基础,每次可以将数组分为更多部分,即k分搜索,探寻k为何值时k分搜索算法的时间复杂度最低,能够对搜索算法进一步优化。通过分析、归纳与证明,得出k分搜索的时间复杂度为O(klogkn),由于该函数是递增的,因此二分搜索是效率最高的搜索算法,复杂度为O(log2n);此外,当k=n时,k分搜索退化为遍历,复杂度退化为O(n)。  相似文献   

18.
We discuss the problem of packing hypercubes into an n-dimensional star graph S(n), which consists of embedding a disjoint union of hypercubes U into S(n) with load one. Hypercubes in U have from [n/2] to (n+1)·[log2 n]-2([lod2n]+1)+2 dimensions, i.e., they can be as large as any hypercube which can be embedded with dilation at most four into S(n). We show that U can be embedded into S(n) with optimal expansion, which contrasts with the growing expansion ratios of previously known techniques. We employ several performance metrics to show that, with our techniques, a star graph can efficiently execute heterogeneous workloads containing hypercube, mesh, and star graph algorithms. The characterization of our packings includes some important metrics which have not been addressed by previous research (namely, average dilation, average congestion, and congestion). Our packings consistently produce small average congestion and average dilation, which indicates that the induced communication slowdown is also small. We consider several combinations of node mapping functions and routing algorithms in S(n), and obtain their corresponding performance metrics using either mathematical analysis or computer simulation  相似文献   

19.
Fault-free Hamiltonian cycles in faulty arrangement graphs   总被引:1,自引:0,他引:1  
The arrangement graph An,k, which is a generalization of the star graph (n-k=1), presents more flexibility than the star graph in adjusting the major design parameters: number of nodes, degree, and diameter. Previously, the arrangement graph has proved Hamiltonian. In this paper, we further show that the arrangement graph remains Hamiltonian even if it is faulty. Let |Fe| and |Fv| denote the numbers of edge faults and vertex faults, respectively. We show that An,k is Hamiltonian when 1) (k=2 and n-k⩾4, or k⩾3 and n-k⩾4+[k/2]), and |Fe|⩽k(n-k)-2, or 2) k⩾2, n-k⩾2+[k/2], and |Fe|⩽k(n-k-3)-1, or 3) k⩾2, n-k⩾3, and |Fe |⩽k, or 4) n-k⩾3 and |Fv|⩽n-3, or 5) n-k⩾3 and |Fv|+|Fe|⩽k. Besides, for An,k with n-k=2, we construct a cycle of length at least 1) [n!/(n-k!)]-2 if |Fe|⩽k-1, or 2) [n!/(n-k)!]-|Fv |-2(k-1) if |Fv|⩽k-1, or 3) [n!/(n-k)!]-|Fv |-2(k-1) if |Fe|+|Fv|⩽k-1, where [n!/(n-k)!] is the number of nodes in An,k  相似文献   

20.
Applies the technique of parallel processing to concept learning. A parallel version-space learning algorithm based upon the principle of divide-and-conquer is proposed. Its time complexity is analyzed to be O(k log2n) with n processors, where n is the number of given training instances and k is a coefficient depending on the application domains. For a bounded number of processors in real situations, a modified parallel learning algorithm is then proposed. Experimental results are then performed on a real learning problem, showing that our parallel learning algorithm works, and being quite consistent with the results of theoretical analysis. We conclude that when the number of training instances is large, it is worth learning in parallel because of its faster execution  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号