Performing tasks on synchronous restartable message-passing processors |
| |
Authors: | Bogdan S Chlebus Roberto De Prisco Alex A Shvartsman |
| |
Affiliation: | (1) Instytut Informatyki, Uniwersytet Warszawski, ul. Banacha 2, 02-097 Warszawa, Poland (e-mail: chlebus@mimuw.edu.pl) , PL;(2) Laboratory for Computer Science, Massachusetts Institute of Technology, 545 Technology Square, NE43-316 Cambridge, MA 02139, USA (e-mail: robdep@theory.lcs.mit.edu, alex@theory.lcs.mit.edu) , US;(3) Dipartimento di Informatica ed Applicazioni, University of Salerno, 84081 Baronissi (SA), Italy , IT;(4) Department of Computer Science and Engineering, University of Connecticut, 191 Auditorium Road, Unit 155, Storrs, CT 06269, USA , US |
| |
Abstract: | Summary. This work considers the problem of performing t tasks in a distributed system of p fault-prone processors. This problem, called do-all herein, was introduced by Dwork, Halpern and Waarts. The solutions presented here are for the model of computation that abstracts
a synchronous message-passing distributed system with processor stop-failures and restarts. We present two new algorithms
based on a new aggressive coordination paradigm by which multiple coordinators may be active as the result of failures. The
first algorithm is tolerant of stop-failures and does not allow restarts. Its available processor steps (work) complexity is and its message complexity is . Unlike prior solutions, our algorithm uses redundant broadcasts when encountering failures and, for p =t and largef, it achieves better work complexity. This algorithm is used as the basis for another algorithm that tolerates stop-failures
and restarts. This new algorithm is the first solution for the do-all problem that efficiently deals with processor restarts. Its available processor steps is , and its message complexity is , wheref is the total number of failures.
Received: October 1998 / Accepted: September 2000 |
| |
Keywords: | :Fault-tolerance – Distributed systems – Load balancing – Processor restarts – Work |
本文献已被 SpringerLink 等数据库收录! |
|