首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
We consider the problem of allocating n tasks of a distributed program to m processors of a distributed system in order to minimize total communication and processing costs. If the intertask communication can be represented by a tree and if the communication costs are uniform, it is known that an optimal allocation can be determined in O(nm) time. A K-optimal solution set Ω = { 1,..., K} of a given task allocation problem is a set of allocations such that no allocation which is not contained in Ω is better than any i, i = 1,..., K. In this paper, an algorithm is presented which computes a K-optimal set for the considered task allocation problem in O(Knm).  相似文献   

2.
The star graph is an attractive underlying topology for distributed systems. Robustness of the star graph under link failure model is addressed. Specifically, the minimum number of faulty links, f(nk), that make every (n − k)-dimensional substar Snk faulty in an n-dimensional star network Sn, is studied. It is shown that f(n,1)=n+2. Furthermore, an upper bound is given for f(n, 2) with complexity of O(n3) which is an improvement over the straightforward upper bound of O(n4) derived in this paper.  相似文献   

3.
We study the static load balancing problem in a distributed computer system that consists of a set of heterogeneous computer systems interconnected by a star network with two-way traffic. We formulate the static load balancing problem as a nonlinear optimization problem which minimizes the mean response time. We prove that in the optimal solution the satellite nodes in the star network can be divided into four different types: the idle source nodes, the active source nodes, the neutral nodes, and the sink nodes. The necessary and sufficient conditions for optimal solution are studied. An efficient algorithm of complexity O(n) is proposed for the static load balancing of an n-satellite system. The effects of link communication time on optimal load balancing in a star network are also studied by parametric analysis. By employing the proposed algorithm, a significant system performance improvement over that without load balancing is illustrated in a numerical example. The numerical example also shows that the effects of the link communication time in a star network are large.  相似文献   

4.
We consider the problems of selection, routing, and sorting on ann-star graph (withn! nodes), an interconnection network which has been proven to possess many special properties. We identify a tree like subgraph (which we call a “(k, 1,k) chain network”) of the star graph which enables us to design efficient algorithms for the above mentioned problems. We present an algorithm that performs a sequence ofnprefix computations inO(n2) time. This algorithm is used as a subroutine in our other algorithms. We also show that sorting can be performed on then-star graph in timeO(n3) and that selection of a set of uniformly distributednkeys can be performed inO(n2) time with high probability. Finally, we also present a deterministic (nonoblivious) routing algorithm that realizes any permutation inO(n3) steps on then-star graph. There exists an algorithm in the literature that can perform a single prefix computation inO(nlgn) time. The best-known previous algorithm for sorting has a run time ofO(n3lgn) and is deterministic. To our knowledge, the problem of selection has not been considered before on the star graph.  相似文献   

5.
In distributed shared memory multiprocessor systems, parallel tasks communicate through sharing memory data. As the system size increases, such communication cost becomes the main factor that limits the overall parallelism and performance. In this paper, we propose a new solution to the problem through judiciously managing the relevant resource, namely, the shared data and the interconnection network (IN) through which the sharing is carried out. In this approach, communication cost is minimized by means of data migration/allocation which is based on analyzing general layered task graphs, sharing behavior of parallel tasks, and network topology. Our method is not applicable for read only variables. Further, for the time being, the usefulness of the method is limited to multiprocessors where no cache coherence mechanism is implemented. Four typical interconnection topologies for multiprocessors are considered, namely, shared-bus, hierarchical-bus, 2-D mesh, and fat-tree structures. Efficient data allocation algorithms for each of the four network topologies are developed that make decision on data allocation/migration at the compile time. The complexity of one algorithm isO(np) for shared-bus andO(n2p) for the remaining three in a system withnprocessors executing ap-layer task graph for one shared variable. We have also given an algorithm to determine optimal allocation/migration scheme for multiple shared variables. However, the cost of the algorithm become prohibitive when the number of shared variables is high. Therefore, a heuristic of low complexity is suggested. The heuristic is optimal for some topologies.  相似文献   

6.
We show an embedding of the star graph into a rectangular optical multichannel mesh ofddimensions such that the embedding has no bends; that is, neighbors in the star graph always differ in exactly one coordinate in the mesh, to facilitate one-hop optical communication. To embed ann-star, the mesh can have any number of dimensionsdbetween 1 andn− 1. The embedding has load 1 and an expansion of at mostnd − 1/d!. The size of the mesh will be at most We optimize the size of the host mesh using clique-partitioning to produce embeddings with expansions as low as unity. In two dimensions, for evenn, the mesh will be no larger thann×n(n− 2)!, and have an expansion of no more than 1 1/(n− 1). Further, we show how we can use a contraction method to efficiently embed the star graph into an optical mesh with near-unity aspect ratios. Contraction on a two-dimensional embedding will yield a mesh of size no larger thann×nfor evennwith a load of (n− 2)!.  相似文献   

7.
Consider a distributed system consisting of n computers connected by a number of identical broadcast channels. All computers may receive messages from all channels. We distinguish between two kinds of systems: systems in which the computers may send on any channel (dynamic allocation) and system where the send port of each computer is statically allocated to a particular channel. A distributed task (application) is executed on the distributed system. A task performs execution as well as communication between its subtasks. We compare the completion time of the communication for such a task using dynamic allocation and channels with the completion time using static allocation and channels. Some distributed tasks will benefit very much from allowing dynamic allocation, whereas others will work fine with static allocation. In this paper we define optimal upper and lower bounds on the gain (or loss) of using dynamic allocation and channels compared to static allocation and channels. Our results show that, for some tasks, the gain of permitting dynamic allocation is substantial, e.g. when , there are tasks which will complete 1.89 times faster using dynamic allocation compared to using the best possible static allocation, but there are no tasks with a higher such ratio. Received: 26 February 1998 / 26 July 1999  相似文献   

8.
An increasing number of scientific programs exhibit two forms of parallelism, often in a nested fashion. At the outer level, the application comprises coarse-grained task parallelism, with dependencies between tasks reflected by an acyclic graph. At the inner level, each node of the graph is a data-parallel operation on arrays. Designers of languages, compilers, and runtime systems are building mechanisms to support such applications by providing processor groups and array remapping capabilities. In this paper we explore how to supplement these mechanisms with policy. What properties of an application, its data size, and the parallel machine determine the maximum potential gains from using both kinds of parallelism? It turns out that large gains can be expected only for specific task graph structures. For such applications, what are practical and effective ways to allocate processors to the nodes of the task graph? In principle one could solve the NP-complete problem of finding the best possible allocation of arbitrary processor subsets to nodes in the task graph. Instead of this, our analysis and simulations show that a simpleswitchedscheduling paradigm, which alternates between pure task and pure data parallelism, provides nearly optimal performance for the task graphs considered here. Furthermore, our scheme is much simpler to implement, has less overhead than the optimal allocation, and would be attractive even if the optimal allocation was free to compute. To evaluate switching in real applications, we implemented a switching task scheduler in the parallel numerical library ScaLAPACK and used it in a nonsymmetric eigenvalue program. Even for fairly large input sizes, the efficiency improves by factors of 1.5 on the Intel Paragon and 2.5 on the IBM SP-2. The remapping and scheduling overhead is negligible, between 0.5 and 5%.  相似文献   

9.
为了提高分布式存储系统中故障节点的修复效率, 提出一种新的部分重复(fractional repetition, FR)码的构造算法. 该算法利用完全图的因子分解进行构造, 称为CGFBFR (complete graph factorization based FR)码. 该算法首先对完全图进行因子分解, 分解完成以后确定完全图的因子分解个数, 根据需要存储数据块的重复度来选择完全图的因子个数, 将完全图选中的因子所有顶点当做分布式存储系统中需要存储的数据块, 然后对选中因子图的边进行标记, 标记的边当做分布式数据节点进行存储. 最后根据选中的因子的顶点和边生成编码矩阵, 在分布式存储系统中按照编码矩阵中的数据对数据块分别进行存储. 实验仿真结果显示, 本文提出的一种新的部分重复码构造算法, 与分布式存储系统中的里所(reed-solomon, RS)码、简单再生码(simple regenerating codes, SRC)以及最新的循环可变部分重复(variable fractional repetition, VFR)码相比, 在系统修复故障节点时, 能够快速地修复故障节点, 有效降低了故障节点的修复带宽开销、修复局部性、修复复杂度, 而且构造过程简单, 同时可以灵活选择构造参数, 广泛适用于分布式存储系统中.  相似文献   

10.
Allocating hard real-time tasks: An NP-Hard problem made easy   总被引:4,自引:0,他引:4  
A distributed hard real time system can be composed from a number of communicating tasks. One of the difficulties with building such systems is the problem of where to place the tasks. In general there are P T ways of allocating T tasks to P processors, and the problem of finding an optimal feasible allocation (where all tasks meet physical and timing constraints) is known to be NP-Hard. This paper describes an approach to solving the task allocation problem using a technique known as simulated annealing. It also defines a distributed hard real-time architecture and presents new analysis which enables timing requirements to be guaranteed.This work was supported in part by British Aerospace (Commercial Aircraft) Ltd, and the UK Department of Trade and Industry.  相似文献   

11.
针对智能建筑室内环境下并行计算的动态任务调度问题,构建了基于分布式CPS思想的无线传感器网络(WSN)模型,并分别设计了基于可计算复杂性的任务分配策略和基于动态调度算法的任务调度策略。通过先将任务分配成若干个子任务,采用多带图灵机输入任务,由合适的计算节点进行计算,形成有向无环图,再按调度优先级排列任务,形成任务调度序列表,依序处理任务,从而达到了将任务分配、调度和执行相结合的目的。实验结果表明该策略可有效减少智能建筑室内环境分布式可计算WSN分布运行时任务之间的通讯时间和等待时间,同时提高了任务调度的成功率,最终优化系统的运行效率。  相似文献   

12.
肖荣 《计算机工程》2010,36(11):70-72
提出使用网表示可分配寄存器对象,通过对网的活跃性数据流分析,构造网的冲突图。与变量冲突图相比,将基于变量的节点分裂成基于网的节点,将同一变量的冲突关系分摊到多个网上,虽增加冲突图节点数量,但降低节点度数,使得用更少颜色对冲突图着色,即可减少所需寄存器的数量,生成更加高效的可执行代码,使存器分配更为灵活。  相似文献   

13.
云端计算可以充分聚合Internet网络服务器端和边缘终端节点的计算资源来获得更大的效益。但将计算任务部署到用户终端上执行却带来了安全隐患。分属于不同用户的海量终端节点之行为显然不可靠,计算安全性也难以保障。特别是作为任务执行者的用户终端节点可能篡改任务中的程序代码或数据,返回的是虚假的结果,或是窥探有私密性要求的代码和数据。提出一种新的基于内嵌验证码的加密函数的代码保护机制,它可同时满足计算完整性和私密性,能够有效验证返回结果的正确性,并保障计算代码不被窥知。为了进一步提高任务执行的成功率和缩短作业周转时间,将任务代码优先分发给信誉良好且执行成功率高的节点来执行。还提出了一种评估任务执行节点可信性的方法。具体描述了任务执行代码保护机制的实现流程,并对机制的性能进行了详细的分析与验证。  相似文献   

14.
In this paper, we provide a method to safely store a document in perhaps the most challenging settings, a highly decentralized replicated storage system where up to half of the storage servers may incur arbitrary failures, including alterations to data stored in them. Using an error correcting code (ECC), e.g., a Reed–Solomon code, one can take n pieces of a document, replace each piece with another piece of size larger by a factor of such that it is possible to recover the original set even when up to t of the larger pieces are altered. For t close to n/2 the space blowup factor of this scheme is close to n, and the overhead of an ECC such as the Reed–Solomon code degenerates to that of a trivial replication code. We show a technique to reduce this large space overhead for high values of t. Our scheme blows up each piece by a factor slightly larger than two using an erasure code which makes it possible to recover the original set using n/2−O(n/d) of the pieces, where d≈80 is a fixed constant. Then we attach to each piece O(d log n/log d) additional bits to make it possible to identify a large enough set of unmodified pieces, with negligible error probability, assuming that at least half the pieces are unmodified and with low complexity. For values of t close to n/2 we achieve a large asymptotic space reduction over the best possible space blowup of any ECC in deterministic setting. Our approach makes use of a d-regular expander graph to compute the bits required for the identification of n/2−O(n/d) good pieces.  相似文献   

15.
The star graph is viewed as an attractive alternative to the hypercube. In this paper, we investigate the Hamiltonicity of an n-dimensional star graph. We show that for any n-dimensional star graph (n≥4) with at most 3n−10 faulty edges in which each node is incident with at least two fault-free edges, there exists a fault-free Hamiltonian cycle. Our result improves on the previously best known result for the case where the number of tolerable faulty edges is bounded by 2n−7. We also demonstrate that our result is optimal with respect to the worst case scenario, where every other node of a cycle of length 6 is incident with exactly n−3 faulty noncycle edges.  相似文献   

16.
The bounds on f(n,k), the number of faulty nodes to make every (nk)-dimensional substar Snk in an n-dimensional star network Sn, have been derived. The exact value for f(n,k) is determined when n is prime and k=2, or when n−2?k?n. For 2<k<n−2, a general method is presented to derive a set of faulty nodes which damage all Snk's in Sn.  相似文献   

17.
We first define the basic notions of local and non-local tasks for distributed systems. Intuitively, a task is local if, in a system with no failures, each process can compute its output value locally by applying some local function on its own input value (so the output value of each process depends only on the process’ own input value, not on the input values of the other processes); a task is nonlocal otherwise. All the interesting distributed tasks, including all those that have been investigated in the literature (e.g., consensus, set agreement, renaming, atomic commit, etc.) are non-local. In this paper we consider non-local tasks and determine the minimum information about failures that is necessary to solve such tasks in message-passing distributed systems. As part of this work, we also introduces weak set agreement—a natural weakening of set agreement—and show that, in some precise sense, it is the weakest nonlocal task in message-passing systems.  相似文献   

18.
Constructions of woven graph codes based on constituent convolutional codes are studied, and examples of woven convolutional graph codes are presented. Existence of codes satisfying the Costello lower bound on the free distance within a random ensemble of woven graph codes based on s-partite, s-uniform hypergraphs is shown, where s depends only on the code rate. Simulation results for Viterbi decoding of woven graph codes are presented and discussed.  相似文献   

19.
We consider the problem of deciding if there is a feasible preemptive schedule for a set of n independent tasks with release times and deadlines on m identical processors. The general problem is known to be solvable in O(n 3) time. In this paper, we study special cases for which faster algorithms exist. We introduce the notion of monotone schedules and study their properties. These properties are then exploited to devise fast algorithms for two special cases—the nested task systems and the non-overlapping task systems. We give an O(n log mn) time algorithm and an O(n log n+mn) time algorithm for nested task systems and non-overlapping task systems, respectively. Our algorithms generate at most O(n) and O(mn) preemptions for nested task systems and nonoverlapping task systems, respectively.Research supported in part by the ONR grant N00014-87-K-0833.  相似文献   

20.
对智能规划中的常用工具——放松式规划图(relaxed planning graph,简称RPG)的图论性质进行了深入研究.将RPG中的命题层抽取出来,得到一个不包含任何动作的命题关系图(proposition relation graph,简称PRG),发现PRG仍具有RPG的主要规划性质.初步研究结果包括以下4个方面:初始命题集(initial proposition set,简称IPS)的闭出邻集(close out-neighborhoods,简称CON)是放松式规划可达命题集(relaxed reachable proposition set,简称R-RPS);初始状态命题到目标状态命题的最大距离是规划解长度的合理估计;无圈序指出了对应命题被实现的顺序要求;出度或入度为1的结点收缩对应规划中构造的宏动作.上述结果中,前两者说明PRG保留RPG的主要规划性质,后两者可用于建立目标议程或宏动作提取等领域.还提出与上述结论相关的3种算法:从RPG中得到PRG的算法(复杂性为O(mn2),其中,n为RPG的命题数,m为RPG的动作数);约简无圈序算法(复杂性为O(n+m),其中,n为PRG的结点数,m为PRG的边数);宏动作建议算法(复杂性为O(n2),n为PRG的结点数).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号