
1.

Communication-efficient parallel algorithms for distributed random-access machines

Charles E. Leiserson Bruce M. Maggs 《Algorithmica》1988,3(1):53-77

This paper introduces a model for parallel computation, called the distributed random-access machine (DRAM), in which the communication requirements of parallel algorithms can be evaluated. A DRAM is an abstraction of a parallel computer in which memory accesses are implemented by routing messages through a communication network. A DRAM explicitly models the congestion of messages across cuts of the network. We introduce the notion of a conservative algorithm as one whose communication requirements at each step can be bounded by the congestion of pointers of the input data structure across cuts of a DRAM. We give a simple lemma that shows how to shortcut pointers in a data structure so that remote processors can communicate without causing undue congestion. We give O(lg n)-step, linear-processor, linear-space, conservative algorithms for a variety of problems on n-node trees, such as computing treewalk numberings, finding the separator of a tree, and evaluating all subexpressions in an expression tree. We give O(lg² n)-step, linear-processor, linear-space, conservative algorithms for problems on graphs of size n, including finding a minimum-cost spanning forest, computing biconnected components, and constructing an Eulerian cycle. Most of these algorithms use as a subroutine a generalization of the prefix computation to trees. We show that any such treefix computation can be performed in O(lg n) steps using a conservative variant of Miller and Reif's tree-contraction technique.

This research was supported in part by the Defense Advanced Research Projects Agency under Contract N00014-80-C-0622 and by the Office of Naval Research under Contract N00014-86-K-0593. Charles Leiserson is supported in part by an NSF Presidential Young Investigator Award with matching funds provided by AT&T Bell Laboratories and Xerox Corporation. Bruce Maggs is supported in part by an NSF Fellowship.
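The abstract above mentions shortcutting pointers and O(lg n)-step prefix-style computations. As a rough illustration only, the sketch below shows classic pointer jumping for list ranking, simulated sequentially in Python. It is not the paper's conservative algorithm (plain pointer jumping does not bound congestion across network cuts); the function name and the convention that the tail points to itself are assumptions made for this example.

```python
def list_rank_by_pointer_jumping(succ):
    """Return (rank, rounds): rank[i] is node i's distance to the list tail.

    succ[i] is the successor of node i; the tail points to itself
    (a convention assumed for this sketch). Each loop iteration
    simulates one synchronous parallel step over all nodes.
    """
    n = len(succ)
    nxt = list(succ)
    # Invariant: rank[i] is the distance from node i to nxt[i].
    rank = [0 if nxt[i] == i else 1 for i in range(n)]
    rounds = 0
    # Stop once every pointer already refers to the tail.
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # Build new arrays from the old ones so that all nodes
        # appear to update simultaneously.
        new_rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        new_nxt = [nxt[nxt[i]] for i in range(n)]
        rank, nxt = new_rank, new_nxt
        rounds += 1
    return rank, rounds
```

Each round doubles the distance spanned by every pointer, which is why the number of rounds grows logarithmically with the list length.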

2.

Communication-efficient algorithms for parallel latent Dirichlet allocation

Jian-Feng Yan Jia Zeng Yang Gao Zhi-Qiang Liu 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2015,19(1):3-11


3.

Line-drawing algorithms for parallel machines

Pang A.T. 《Computer Graphics and Applications, IEEE》1990,10(5):54-59

The fact that conventional line-drawing algorithms, when applied directly on parallel machines, can lead to very inefficient codes is addressed. It is suggested that instead of modifying an existing algorithm for a parallel machine, a more efficient implementation can be produced by going back to the invariants in the definition. Popular line-drawing algorithms are compared with two alternatives: distance to a line (a point is on the line if it is sufficiently close to it) and intersection with a line (a point is on the line if it is an intersection point). For massively parallel single-instruction-multiple-data (SIMD) machines (with thousands of processors and up), the alternatives provide viable line-drawing algorithms. Because of the pixel-per-processor mapping, their performance is independent of the line length and orientation.
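The "distance to a line" alternative described above can be sketched as a per-pixel test, where every pixel decides independently whether it belongs to the line. The Python below is a sequential stand-in for the pixel-per-processor mapping; the function name, the `half_thickness` threshold, and the use of pixel centres at integer coordinates are assumptions for illustration, not details from the article.

```python
def draw_line_by_distance(width, height, p0, p1, half_thickness=0.5):
    """Return a height x width grid of booleans.

    A pixel is set when its coordinate lies within half_thickness of the
    segment p0-p1. Every pixel runs the same test with no dependence on
    its neighbours, which is what makes the approach SIMD-friendly.
    """
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    length_sq = dx * dx + dy * dy

    def on_line(x, y):
        if length_sq == 0:
            t = 0.0  # degenerate segment: a single point
        else:
            # Project the pixel onto the segment and clamp to its ends.
            t = ((x - x0) * dx + (y - y0) * dy) / length_sq
            t = max(0.0, min(1.0, t))
        cx, cy = x0 + t * dx, y0 + t * dy
        return (x - cx) ** 2 + (y - cy) ** 2 <= half_thickness ** 2

    return [[on_line(x, y) for x in range(width)] for y in range(height)]
```

The work per pixel is constant, so on a machine with one processor per pixel the running time would not depend on how long the line is or how it is oriented.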

4.

Communication-efficient distributed oblivious transfer

Amos Beimel Yeow Meng Chee Huaxiong Wang Liang Feng Zhang 《Journal of Computer and System Sciences》2012,78(4):1142-1157


5.

Semi-online algorithms on three machines with two service hierarchies

XIAO Man DING Lu ZHANG Yi 《计算机工程与科学》2020,42(12):2252-2258

This paper studies a semi-online hierarchical scheduling problem on three identical machines. In the problem, there is only one machine with hierarchy 1 and two machines with hierarchy 2, and the goal is to minimize the makespan. When the total size of low-hierarchy jobs is known, an online algorithm with a competitive ratio of 5/3 and a lower bound of 3/2 are given. When the total size of high-hierarchy jobs is known, an online algorithm with a competitive ratio of 9/5 and a lower bound of 3/2 are given. When the total size of each hierarchy is known, an online algorithm with a competitive ratio of 3/2 and a lower bound of 4/3 are given. When the total size of all jobs is known, a best possible online algorithm with a competitive ratio of 3/2 is given.

6.

A massively parallel architecture for distributed genetic algorithms

《Parallel Computing》2004,30(5-6):647-676


7.

Semi-online algorithms for hierarchical scheduling on three parallel machines

XIAO Man DING Lu ZHANG Yi 《计算机工程与科学》2021,42(12):2252


8.

Efficient parallel algorithms for image template matching on hypercube SIMD machines

Prasanna K.V.K. Krishnan V. 《IEEE transactions on pattern analysis and machine intelligence》1989,11(6):665-669

Efficient parallel algorithms developed on hypercube SIMD (single-instruction multiple data-stream) machines for image template matching are presented. Most of these parallel algorithms are asymptotically optimal in their time complexities. These results improve the known bounds in the literature.

9.

Randomized truthful algorithms for scheduling selfish tasks on parallel machines

Eric Angel 《Theoretical computer science》2012,414(1):1-8


10.

Dynamic slack allocation algorithms for energy minimization on parallel machines (total citations: 1; self-citations: 0; citations by others: 1)

Jaeyeon Kang Sanjay Ranka 《Journal of Parallel and Distributed Computing》2010

We explore novel algorithms for DVS (Dynamic Voltage Scaling) based energy minimization of DAG (Directed Acyclic Graph) based applications on parallel and distributed machines in dynamic environments. Static DVS algorithms for DAG execution use the estimated execution time. In practice, the estimated time is either an overestimate or an underestimate, so many tasks may complete earlier or later than expected during the actual execution. For overestimation, the extra available slack can be added to future tasks so that energy requirements can be reduced. For underestimation, the increased time may cause the application to miss the deadline, so slack can be reduced for future tasks to lower the possibility of missing the deadline. In this paper, we present novel dynamic scheduling algorithms for reallocating the slack for future tasks to reduce energy and/or satisfy deadline constraints. Experimental results show that our algorithms are comparable to static algorithms applied at runtime in terms of energy minimization and deadline satisfaction, but require considerably smaller computational overhead.
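To make the slack-reallocation idea above concrete, here is a deliberately simplified Python sketch. It assumes a single chain of tasks rather than a DAG, a frequency that can be set to any value in a range, and an energy model of cycles times frequency squared. All of these, along with the function and parameter names, are assumptions for illustration and are not the algorithms from the paper.

```python
def run_with_slack_reclamation(estimated, actual, deadline,
                               f_max=1.0, f_min=0.1):
    """Simulate a chain of tasks, re-spreading slack before each task.

    estimated[i] and actual[i] are cycle counts; running time is
    cycles / frequency. Energy per task is modelled as
    cycles * frequency**2 (an assumed model, not from the source).
    Returns (finish_time, energy, frequencies_used).
    """
    now, energy, freqs = 0.0, 0.0, []
    for i, real_cycles in enumerate(actual):
        remaining_estimate = sum(estimated[i:])
        time_left = deadline - now
        if time_left <= 0:
            f = f_max  # already late: run as fast as possible
        else:
            # Slowest frequency that still fits the remaining estimate.
            f = min(f_max, max(f_min, remaining_estimate / time_left))
        freqs.append(f)
        now += real_cycles / f
        energy += real_cycles * f * f
    return now, energy, freqs
```

When an early task uses fewer cycles than estimated, the following tasks are slowed down and save energy; when it uses more, the following tasks are sped up, up to `f_max`, to protect the deadline.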

11.

New Heuristic Distributed Parallel Algorithms for Searching and Planning (total citations: 2; self-citations: 1; citations by others: 1)


Shuai Dianxun 《计算机科学技术学报》1995,10(4):354-374

This paper proposes new heuristic distributed parallel algorithms for searching and planning, which are based on the concepts of wave concurrent propagations and competitive activation mechanisms. These algorithms are characterized by simplicity and clearness of control strategies for searching, and distinguished abilities in many aspects, such as high speed processing, wide suitability for searching AND/OR implicit graphs, and ease in hardware implementation.

12.

Execution time support for adaptive scientific algorithms on distributed memory machines

Harry Berryman Joel Saltz Jeffrey Scroggs 《Concurrency and Computation》1991,3(3):159-178

We consider optimizations that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI execution time primitives are designed to perform these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives an appearance of shared memory. Computations can be based on a global index set. Primitives are used to perform gather and scatter operations on distributed arrays. Communication patterns are derived at run time, and the appropriate send and receive messages are automatically generated.
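The gather and scatter operations over a global index set described above can be illustrated with a small sequential model. The Python sketch below assumes a simple block distribution and uses hypothetical function names (`block_owner`, `build_schedule`, `gather`, `scatter`); it does not reproduce the actual PARTI primitives or their signatures, which the abstract does not give.

```python
from collections import defaultdict


def block_owner(global_index, n, num_procs):
    """Map a global index to (owner, local_index) under a block distribution."""
    block = -(-n // num_procs)  # ceiling division
    return global_index // block, global_index % block


def build_schedule(global_indices, n, num_procs):
    """Group the requested global indices by owning process.

    This stands in for deriving the communication pattern at run time.
    """
    schedule = defaultdict(list)
    for position, g in enumerate(global_indices):
        owner, local = block_owner(g, n, num_procs)
        schedule[owner].append((position, local))
    return schedule


def gather(local_arrays, schedule, count):
    """Fetch requested elements; conceptually one message per owner."""
    out = [None] * count
    for owner, requests in schedule.items():
        for position, local in requests:
            out[position] = local_arrays[owner][local]
    return out


def scatter(local_arrays, schedule, values):
    """Write values back to their owners, reusing the same schedule."""
    for owner, requests in schedule.items():
        for position, local in requests:
            local_arrays[owner][local] = values[position]
```

Building the schedule once and reusing it for both directions reflects the idea that the pattern is computed at run time and the matching sends and receives follow from it.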

13.

Dynamic programming algorithms for scheduling parallel machines with family setup times

《Computers & Operations Research》2001,28(2):127-137

We address the problem of scheduling jobs with family setup times on identical parallel machines to minimize total weighted flowtime. We present two dynamic programming algorithms, a backward algorithm and a forward algorithm, and we identify characteristics of problems where each algorithm is best suited. We also derive two properties that improve the computational efficiency of the algorithms.

Scope and purpose: While most production schedulers must balance conflicting goals of high system efficiency and timely completion of individual jobs, consideration of this conflict is underdeveloped in the scheduling literature. This paper examines a model that incorporates a fundamental cause of the efficiency/timeliness conflict in practice. We propose solution methodologies and properties of an optimal solution for the purpose of exposing insights that may ultimately be useful in research on more complex models.

14.

Online algorithms for scheduling two parallel machines with a single server


Yiwei Jiang Feng Yu Ping Zhou Jueliang Hu 《International Transactions in Operational Research》2015,22(5):913-927

We consider an online scheduling problem on two identical parallel machines with a single server. Jobs arrive one by one and each job has to be loaded by the server before being processed on one of the machines, and unloaded immediately by the server after its processing. Both loading and unloading times are equal to one time unit. The goal is to minimize the makespan. For the variant of the problem involving both loading and unloading operations, we present an online algorithm with a competitive ratio of 5/3. For the variant with the loading operation only, we show that the competitive ratio of list scheduling is at least 8/5 and provide an improved online algorithm with a competitive ratio of 11/7. Finally, we discuss the lower bounds for these problems. We show that both variants have a lower bound of 3/2. Furthermore, we show that the lower bound of the first variant is at least 8/5 if the online algorithm satisfies a certain constraint.
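For the loading-only variant described above, the following Python sketch simulates one plausible reading of list scheduling: each arriving job goes to the machine that becomes idle first, and the single server loads one job at a time. The tie-breaking rule, the function name, and this interpretation of list scheduling are assumptions; the abstract does not spell out the exact rule, and this is not the improved 11/7-competitive algorithm.

```python
def list_schedule_with_server(processing_times, load_time=1):
    """Return the makespan of greedy list scheduling on two machines.

    Every job must first be loaded by the single shared server
    (load_time units, during which the target machine is occupied),
    and loading operations cannot overlap. Unloading is ignored.
    """
    server_free = 0
    machine_free = [0, 0]
    for p in processing_times:
        # Pick the machine that becomes idle first (ties go to machine 0).
        m = 0 if machine_free[0] <= machine_free[1] else 1
        # Loading needs both the server and the chosen machine.
        start_load = max(server_free, machine_free[m])
        server_free = start_load + load_time
        machine_free[m] = server_free + p
    return max(machine_free)
```

For example, with two unit-length jobs the second load has to wait for the server, so the second job cannot start processing before time 2.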

15.

Massively parallel implementation of a fast multipole method for distributed memory machines

《Journal of Parallel and Distributed Computing》2005,65(7):870-881


16.

Generating subproblems in branch and bound algorithms for parallel machines scheduling problem (total citations: 1; self-citations: 0; citations by others: 1)

Sang-Oh Shim 《Computers & Industrial Engineering》2009,57(3):1150-1153

A branch and bound (B&B) algorithm has been widely used in various discrete and combinatorial optimization fields. To obtain optimal solutions as quickly as possible for scheduling problems, three tools have been developed for the B&B algorithm: branching, bounding, and dominance rules. One of these tools, branching, is a method for generating subproblems and directly determines the size of the solution space to be searched in the B&B algorithm. Therefore, it is very important to devise an effective branching scheme for the problem.

In this note, a survey of branching schemes is performed for parallel machines scheduling (PMS) problems with n independent jobs and m machines, and new branching schemes that can be used for identical and unrelated PMS problems, respectively, are suggested. The suggested branching methods generate far fewer subproblems than methods developed earlier, and they are therefore expected to help reduce the CPU time required to obtain optimal solutions in the B&B algorithm.

17.

Building with ParadisEO reusable parallel and distributed evolutionary algorithms

《Parallel Computing》2004,30(5-6):677-697

Numerous parallel and distributed evolutionary algorithms (PDEAs) and their implementations have been proposed and are available on the Web. A robust way to ease the reuse of their code and design is the framework approach. In this paper, we present some existing frameworks for PDEAs and their development requirements, and propose a new C++ open source framework, named Parallel and distributed Evolving Objects (ParadisEO). ParadisEO is basically devoted to the reusable and flexible design of parallel and distributed metaheuristics, but we focus here only on PDEAs. Compared to other related frameworks, ParadisEO allows more reuse flexibility, and provides more implemented parallel and distributed models. Furthermore, these models can be exploited by the user in a transparent way, and deployed on shared-memory multiprocessors as well as on distributed-memory machines. The architecture has been tested on two real-world applications: radio network design and spectroscopic data mining. The experimental results demonstrate the efficiency and robustness of the different models.

18.

Communication-efficient distributed multi-task learning with matrix sparsity regularization

Zhou Qiang Chen Yu Pan Sinno Jialin 《Machine Learning》2020,109(3):569-601

This work focuses on distributed optimization for multi-task learning with matrix sparsity regularization. We propose a fast communication-efficient distributed optimization...

19.

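Communication-efficient parallel sorting method for integer sequences on multi-core clusters

KE Qi ZHONG Cheng CHEN Qingyuan LU Xiangyan 《计算机应用》2013,33(3):821-824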

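A data distribution model suited to sorting integer sequences is established, and a communication-efficient parallel sorting algorithm for integer sequences is designed for heterogeneous clusters composed of multi-core compute nodes. Based on the differing computing power, communication rate, and storage capacity of each node in the cluster, the proposed data distribution model dynamically computes the size of the data block assigned to each node so as to balance the load across nodes. The parallel sorting algorithm exploits the properties of integer sequences: the master node distributes data and receives results in two rounds, the slave nodes return sorted integer subsequences to the master node in bucket-packed form, and the master node uses a bucket-mapping method to combine the sorted subsequences directly into the final sorted sequence, reducing the data merge operations that would otherwise consume considerable communication time. Analysis and experimental results show that the proposed parallel sorting algorithm for integer sequences on multi-core clusters is efficient and scales well.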
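The two ideas in the abstract above, capability-proportional data blocks and bucket-based combination without a comparison merge, can be sketched in a few lines of Python. This is a sequential stand-in under stated assumptions: the per-node `weights` are a hypothetical summary of each node's capability, and per-value counting is used in place of the paper's bucket packing and bucket mapping, whose exact form the abstract does not give.

```python
def partition_sizes(n, weights):
    """Split n items in proportion to per-node weights.

    The weights stand in for measured compute, communication, and
    storage capability (how they are obtained is not modelled here).
    """
    total = sum(weights)
    sizes = [n * w // total for w in weights]
    sizes[-1] += n - sum(sizes)  # give the rounding remainder to the last node
    return sizes


def worker_bucket_counts(chunk, lo, hi):
    """A worker summarises its chunk as one count per value in [lo, hi]."""
    counts = [0] * (hi - lo + 1)
    for v in chunk:
        counts[v - lo] += 1
    return counts


def distributed_integer_sort(data, weights):
    """Sort integers by combining per-worker bucket counts at the master."""
    if not data:
        return []
    lo, hi = min(data), max(data)
    totals = [0] * (hi - lo + 1)
    start = 0
    for size in partition_sizes(len(data), weights):
        counts = worker_bucket_counts(data[start:start + size], lo, hi)
        start += size
        # The master adds counts bucket by bucket; no comparison merge.
        for i, c in enumerate(counts):
            totals[i] += c
    return [lo + i for i, c in enumerate(totals) for _ in range(c)]
```

Because each worker sends back compact bucket summaries, the master's combining step is a direct index mapping rather than a multi-way merge of full subsequences.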

20.

Vision algorithms for hypercube machines

《Journal of Parallel and Distributed Computing》1987,4(1):79-94

Several commercial hypercube parallel processors with the potential to deliver massive parallelism cost-effectively have been announced recently. They open the door to a wide variety of application areas that could benefit from parallelism. Computer vision is one of these application areas. This paper develops a general model for hypercube machines, and uses it to show how vision algorithms can be executed on hypercubes. In particular, the steps in the problem of thick-film inspection are used as a concrete example. The time needed to complete a typical inspection is used to demonstrate the performance of hypercube machines. Experimental results from a hypercube machine illustrate the potential use of such machines.
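As background for the hypercube model mentioned above, the Python sketch below shows the standard hypercube connectivity rule (two nodes are neighbours when their binary labels differ in exactly one bit) and a dimension-by-dimension exchange that leaves every node holding the global sum. It is a generic illustration simulated sequentially, with assumed function names, and is not the specific model or vision algorithm developed in the paper.

```python
def hypercube_neighbors(node, dim):
    """Neighbours of a node in a dim-dimensional hypercube:
    flip each of the dim address bits in turn."""
    return [node ^ (1 << k) for k in range(dim)]


def hypercube_allreduce_sum(values):
    """Simulate an all-reduce (sum) on a hypercube.

    values[i] is the value held by node i; the node count must be a
    power of two. In step k every node exchanges its partial sum with
    its neighbour across dimension k, so log2(n) steps suffice.
    """
    n = len(values)
    dim = n.bit_length() - 1
    if n == 0 or (1 << dim) != n:
        raise ValueError("number of nodes must be a power of two")
    current = list(values)
    for k in range(dim):
        # All nodes update from the previous step's values at once.
        current = [current[i] + current[i ^ (1 << k)] for i in range(n)]
    return current
```

For instance, on an 8-node hypercube three exchange steps are enough for every node to learn the total of all eight values.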