Similar literature
 Found 10 similar documents; search took 140 ms
1.
Efficient algorithms for optimistic crash recovery   Total citations: 1, self-citations: 0, citations by others: 1
Summary: Recovery from transient processor failures can be achieved by using optimistic message logging and checkpointing. The faulty processors roll back, and some or all of the non-faulty processors may also have to roll back. This paper formulates the rollback problem as a closure problem. A centralized closure algorithm is presented together with two efficient distributed implementations. Several related problems are also considered and distributed algorithms are presented for solving them.
S. Venkatesan received the B.Tech. and M.Tech. degrees from the Indian Institute of Technology, Madras in 1981 and 1983, respectively, and the M.S. and Ph.D. degrees in Computer Science from the University of Pittsburgh in 1985 and 1988. He joined the University of Texas at Dallas in January 1989, where he is currently an Assistant Professor of Computer Science. His research interests are in fault-tolerant distributed systems, distributed algorithms, testing and debugging distributed programs, fault-tolerant telecommunication networks, and mobile computing.
Tony Tony-Ying Juang is an Associate Professor of Computer Science at the Chung-Hwa Polytechnic Institute. He received the B.S. degree in Naval Architecture from the National Taiwan University in 1983 and his M.S. and Ph.D. degrees in Computer Science from the University of Texas at Dallas in 1989 and 1992, respectively. His research interests include distributed algorithms, fault-tolerant distributed computing, distributed operating systems and computer communications.
This research was supported in part by NSF under Grant No. CCR-9110177 and by the Texas Advanced Technology Program under Grant No. 9741-036.
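The closure formulation mentioned in the abstract of entry 1 can be illustrated with a small sketch. This is not the algorithm from the paper; it is a minimal centralized fixpoint computation under assumed data structures. The function name `rollback_closure`, the `(processor, interval)` tuples, and the `depends_on` map are all hypothetical and chosen only for illustration.

```python
from collections import deque


def rollback_closure(lost, depends_on):
    """Return every state interval that has to be undone.

    lost: state intervals lost by the failed processors.
    depends_on: hypothetical map from a state interval to the intervals
        that directly depend on it (the next interval on the same
        processor, or an interval that received a message sent from it).
    """
    undone = set(lost)
    queue = deque(undone)
    while queue:
        current = queue.popleft()
        for dependent in depends_on.get(current, ()):
            if dependent not in undone:
                # A dependent of an undone interval is an orphan,
                # so it joins the closure and is explored in turn.
                undone.add(dependent)
                queue.append(dependent)
    return undone


if __name__ == "__main__":
    # Assumed toy history: P's interval 2 sent a message that started
    # Q's interval 2; each interval also precedes the next one locally.
    deps = {
        ("P", 1): [("P", 2)],
        ("P", 2): [("Q", 2)],
        ("Q", 1): [("Q", 2)],
    }
    print(sorted(rollback_closure([("P", 2)], deps)))
```

In this toy history the non-faulty processor Q has to roll back its second interval because that interval depends on a message from a lost interval of P, which is the situation the abstract describes.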

2.
Dynamic group communication   Total citations: 1, self-citations: 0, citations by others: 1
Group communication is the basic infrastructure for implementing fault-tolerant replicated servers. While group communication is well understood in the context of static groups (in which the membership does not change), current specifications of dynamic group communication (in which processes can join and leave groups during the computation) have not yet reached the same level of maturity. The paper proposes new specifications (in the primary partition model) for dynamic reliable broadcast (simply called “reliable multicast”), dynamic atomic broadcast (simply called “atomic multicast”) and group membership. In the special case of a static system, the new specifications are identical to the well-known static specifications. Interestingly, not only are these new specifications “syntactically” close to the static specifications, but they are also “semantically” close to the dynamic specifications proposed in the literature. We believe that this should help clarify a topic that has always been difficult for outsiders to understand. Finally, the paper shows how to solve atomic multicast, group membership and reliable broadcast. The solution of atomic multicast is close to the (static) atomic broadcast solution based on reduction to consensus. Group membership is solved using atomic multicast. Reliable multicast can be efficiently solved by relying on a thrifty generic multicast algorithm.
André Schiper graduated in Physics from the ETHZ in Zurich in 1973 and received the PhD degree in Computer Science from the EPFL (Federal Institute of Technology in Lausanne, Switzerland) in 1980. He has been a professor of computer science at EPFL since 1985, leading the Distributed Systems Laboratory. During the academic year 1992–1993, he was on sabbatical leave at Cornell University, Ithaca, New York, and in 2004–2005 at the Ecole Polytechnique near Paris. His research interests are in the area of dependable distributed systems, middleware support for dependable systems, replication techniques (including for database systems), group communication, distributed transactions, and, recently, MANETs (mobile ad-hoc networks). From 2000 to 2002, he was the chair of the steering committee of the International Symposium on Distributed Computing (DISC). He has taken part in several European projects. He is currently a member of the editorial board of Distributed Computing, and of IEEE Transactions on Dependable and Secure Computing.

3.
Summary: The abstraction of a shared memory is of growing importance in distributed computing systems. Traditional memory consistency ensures that all processes agree on a common order of all operations on memory. Unfortunately, providing these guarantees entails access latencies that prevent scaling to large systems. This paper weakens such guarantees by defining causal memory, an abstraction that ensures that processes in a system agree on the relative ordering of operations that are causally related. Because causal memory is weakly consistent, it admits more executions, and hence more concurrency, than either atomic or sequentially consistent memories. This paper provides a formal definition of causal memory and gives an implementation for message-passing systems. In addition, it describes a practical class of programs that, if developed for a strongly consistent memory, run correctly with causal memory.
Mustaque Ahamad is an Associate Professor in the College of Computing at the Georgia Institute of Technology. He received his M.S. and Ph.D. degrees in Computer Science from the State University of New York at Stony Brook in 1983 and 1985 respectively. His research interests include distributed operating systems, consistency of shared information in large scale distributed systems, and replicated data systems.
James E. Burns received the B.S. degree in mathematics from the California Institute of Technology, the M.B.I.S. degree from Georgia State University, and the M.S. and Ph.D. degrees in information and computer science from the Georgia Institute of Technology. He served on the faculty of Computer Science at Indiana University and the College of Computing at the Georgia Institute of Technology before joining Bellcore in 1993. He is currently a Member of Technical Staff in the Network Control Research Department, where he is studying the telephone control network with special interest in behavior when faults occur. He also has research interests in theoretical issues of distributed and parallel computing, especially relating to problems of synchronization and fault tolerance.
This work was supported in part by the National Science Foundation under grants CCR-8619886, CCR-8909663, CCR-9106627, and CCR-9301454. Parts of this paper appeared in S. Toueg, P.G. Spirakis, and L. Kirousis, editors, Proceedings of the Fifth International Workshop on Distributed Algorithms, volume 579 of Lecture Notes in Computer Science, pages 9–30, Springer-Verlag, October 1991.
The photograph of Professor J.E. Burns was published in Volume 8, No. 2, 1994 on page 59.
This author's contributions were made while he was a graduate student at the Georgia Institute of Technology. No photograph and biographical information is available for P.W. Hutto.
Gil Neiger was born on February 19, 1957 in New York, New York. In June 1979, he received an A.B. in Mathematics and Psycholinguistics from Brown University in Providence, Rhode Island. In February 1985, he spent two weeks picking cotton in Nicaragua in a brigade of international volunteers. In January 1986, he received an M.S. in Computer Science from Cornell University in Ithaca, New York and, in August 1988, he received a Ph.D. in Computer Science, also from Cornell University. On August 20, 1988, Dr. Neiger married Hilary Lombard in Lansing, New York. He is currently a Staff Software Engineer at Intel's Software Technology Lab in Hillsboro, Oregon. Dr. Neiger is a member of the editorial boards of the Chicago Journal of Theoretical Computer Science and the Journal of Parallel and Distributed Computing.
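The idea in entry 3 that processes only need to agree on the order of causally related operations can be illustrated with a short sketch. This is not the implementation given in the paper; it uses the widely known vector-clock test for causal message delivery as a stand-in, and the function names `can_deliver` and `deliver` and the list-based clocks are assumptions made for illustration.

```python
def can_deliver(msg_clock, sender, local_clock):
    """Vector-clock test: may this update be applied at the receiver now?

    msg_clock: the sender's vector clock attached to the update.
    sender: index of the sending process.
    local_clock: the receiver's current vector clock.
    """
    for proc, count in enumerate(msg_clock):
        if proc == sender:
            # The update must be the next one expected from the sender.
            if count != local_clock[proc] + 1:
                return False
        elif count > local_clock[proc]:
            # The sender had seen an update the receiver has not applied yet.
            return False
    return True


def deliver(msg_clock, sender, local_clock):
    """Return the receiver's clock after applying the update."""
    updated = list(local_clock)
    updated[sender] = msg_clock[sender]
    return updated


if __name__ == "__main__":
    clock = [0, 0, 0]
    first = ([1, 0, 0], 0)   # write issued by process 0
    second = ([1, 1, 0], 1)  # write by process 1 after seeing the first
    print(can_deliver(second[0], second[1], clock))
    clock = deliver(first[0], first[1], clock)
    print(can_deliver(second[0], second[1], clock))
```

Updates that are not causally related pass the test in either order, which is where the extra concurrency mentioned in the abstract comes from.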

4.
A distributed system can support fault-tolerant applications by replicating data and computation at nodes that have independent failure modes. We present a scheme called parallel execution threads (PET) which can be used to implement fault-tolerant computations in an object-based distributed system. In a system that replicates objects, the PET scheme can be used to replicate a computation by creating a number of parallel threads which execute with different replicas of the invoked objects. A computation can be completed successfully if at least one thread does not encounter any failed nodes and its completion preserves the consistency of the objects. The PET scheme can tolerate failures that occur during the execution of the computation as long as not all threads are affected by the failures. We present the algorithms required to implement the PET scheme and also address some performance issues.
Mustaque Ahamad received his B.E. (Hons.) degree in Electrical Engineering from the Birla Institute of Technology and Science, Pilani, India. He obtained his M.S. and Ph.D. degrees in Computer Science from the State University of New York at Stony Brook in 1983 and 1985 respectively. Since September 1985, he has been an Assistant Professor in the School of Information and Computer Science at the Georgia Institute of Technology, Atlanta. His research interests include distributed operating systems, distributed algorithms, fault-tolerant systems and performance evaluation.
Partha Dasgupta has been an Assistant Professor at Georgia Tech since 1984. He has a Ph.D. in Computer Science from the State University of New York at Stony Brook. He is the technical project director of the Clouds distributed operating systems project, as well as a co-principal investigator of Georgia Tech's NSF-CER award. His research interests include building distributed operating systems, distributed algorithms, fault-tolerant systems and distributed programming support.
Richard J. LeBlanc, Jr. received the B.S. degree in physics from Louisiana State University in 1972 and the M.S. and Ph.D. degrees in computer sciences from the University of Wisconsin-Madison in 1974 and 1977, respectively. He is currently a Professor in the School of Information and Computer Science of the Georgia Institute of Technology. His research interests include programming language design and implementation, programming environments, and software engineering. Dr. LeBlanc's current research work involves application of these interests in distributed processing systems. As co-director of the Clouds Project, he is studying language concepts and software engineering methodology for utilizing a highly reliable, object-based distributed system. He is also interested in specification-based software development methodologies and tools. Dr. LeBlanc is a member of the Association for Computing Machinery, the IEEE Computer Society and Sigma Xi.
This work was supported in part by NSF grants CCR-8619886 and CCR-8806358, and RADC contract number F30602-86-C-0032.

5.
6.
Summary: This paper focuses upon a particular conservative algorithm for parallel simulation, the Time of Next Event (TNE) suite of algorithms [13]. TNE relies upon a shortest path algorithm which is independently executed on each processor in order to unblock LPs in the processor and to increase the parallelism of the simulation. TNE differs fundamentally from other conservative approaches in that it takes advantage of having several LPs assigned to each processor, and does not rely upon message passing to provide lookahead. Instead, it relies upon a shortest path algorithm executed independently in each processor. A deadlock resolution algorithm is employed for interprocessor deadlocks. We describe an empirical investigation of the performance of TNE on the iPSC/i860 hypercube multiprocessor. Several factors which play an important role in TNE's behavior are identified, and the speedup relative to a fast uniprocessor-based event list algorithm is reported. Our results indicate that TNE yields good speedups and outperforms an optimized version of the Chandy-Misra null-message (CMB) algorithm. TNE was 2–5 times as fast as the CM approach for fewer than 10 processors (and 1.5–3 times as fast when more than 10 processors were used for the same population of processes).
Azzedine Boukerche received the State Engineer degree in Software Engineering from Oran University, Oran, Algeria, and the M.Sc. degree in Computer Science from McGill University, Montreal, Canada. He is a Ph.D. candidate at the School of Computer Science, McGill University. During 1991–1992, he was a visiting doctoral student at the California Institute of Technology. He has been employed as a Faculty Lecturer in Computer Science at McGill University since 1993. His research interests include parallel simulation, distributed algorithms, and system performance analysis. He is a student member of the IEEE and ACM.
Carl Tropper is an Associate Professor of Computer Science at McGill University. His primary area of research is parallel discrete event simulation. His general area of interest is in parallel computing and distributed algorithms in particular. Previously, he has done research in the performance modeling of computer networks, having written a book, Local Computer Network Technologies, while active in the area. Before coming to university life, he worked for the BBN Corporation and the Mitre Corporation, both located in the Boston area. He spent the 1991–92 academic year on a sabbatical leave at the Jet Propulsion Laboratories of the California Institute of Technology where he contributed to a project centered about the verification of flight control software. As part of this project he developed algorithms for the parallel simulation of communicating finite state machines. During winters he may be found hurtling down mountains on skis.
A. Boukerche completed this work while he was a visiting doctoral student at the California Institute of Technology. C. Tropper was on sabbatical leave at the Jet Propulsion Laboratories, California Institute of Technology.

7.
Recent progress in sensor technology, data processing and integrated actuators has made the development of miniature flying robots fully possible. Micro VTOL systems represent a useful class of flying robots because of their strong capabilities for small-area monitoring, building exploration and intervention in hostile environments. In this paper, we emphasize the importance of the VTOL vehicle as a candidate for the high-mobility system emergence. In addition, we describe the approach that our lab has taken to micro VTOL evolving towards autonomy and present the mechanical design, dynamic modelling, sensing, and control of our indoor VTOL autonomous robot OS4.
Samir Bouabdallah is a research assistant and Ph.D. student at the Autonomous Systems Lab (ASL) at the Swiss Federal Institute of Technology, Lausanne (EPFL). He received his Master's in Electrical Engineering from Abu Bakr Belkaid University (ABBU), Tlemcen, Algeria in 2001. His master's thesis was on the development of an autonomous mobile robot for academic research. His current research interests are control systems and design optimization of VTOL flying robots.
Pierpaolo Murrieri is a Ph.D. student at the Centro Interdipartimentale E. Piaggio and Dipartimento Sistemi Elettrici ed Automazione (DSEA) at the University of Pisa. He received his Master's in Electrical Engineering from the University of Pisa in 2000. His master's thesis was about the registration of biomedical images. His current research interests are mobile robotics, nonlinear control and artificial vision.
Roland Siegwart is director of the Autonomous Systems Lab (ASL) at the Swiss Federal Institute of Technology Lausanne (EPFL). He received his Master's in Mechanical Engineering in 1983 and his Ph.D. in 1989 at the Swiss Federal Institute of Technology Zurich (ETH). In 1989/90 he spent one year as a postdoc at Stanford University. From 1991 to 1996 he worked part time as R&D director at MECOS Traxler AG and as a lecturer and deputy head at the Institute of Robotics, ETH. In 1996 he joined EPFL as a full professor, where he is working in robotics and mechatronics, namely mobile robot navigation, space robotics, human-robot interaction, all-terrain locomotion and micro-robotics. Roland Siegwart is a member of various scientific committees and cofounder of several spin-off companies.

8.
Much progress has been made in distributed computing in the areas of distribution structure, open computing, fault tolerance, and security. Yet, writing distributed applications remains difficult because the programmer has to manage models of these areas explicitly. A major challenge is to integrate the four models into a coherent development platform. Such a platform should make it possible to cleanly separate an application’s functionality from the other four concerns. Concurrent constraint programming, an evolution of concurrent logic programming, has both the expressiveness and the formal foundation needed to attempt this integration. As a first step, we have designed and built a platform that separates an application’s functionality from its distribution structure. We have prototyped several collaborative tools with this platform, including a shared graphic editor whose design is presented in detail. The platform efficiently implements Distributed Oz, which extends the Oz language with constructs to express the distribution structure and with basic primitives for open computing, failure detection and handling, and resource control. Oz appears to the programmer as a concurrent object-oriented language with dataflow synchronization. Oz is based on a higher-order, state-aware, concurrent constraint computation model. Seif Haridi, Ph.D.: He received his Ph.D. in computer science in 1981 from the Royal Institute of Technology, Sweden. After spending 18 months at IBM T. J. Watson Research Center, he moved to the Swedish Institute of Computer Science (SICS) to form a research lab on logic programming and parallel systems. Dr. Haridi is currently the research director of the Swedish Institute of Computer Science. He has been an active researcher in the area of logic and constraint programming and parallel processing since the beginning of the eighties. 
His earlier work includes contributions to the design of SICStus Prolog, various parallel Prolog systems and a class of scalable cache-coherent multiprocessors known as Cache-Only Memory Architecture (COMA). During the nineties most of his work focused on the design of multiparadigm programming systems based on Concurrent Constraint Programming (CCP). Currently, he is interested in programming systems and software methodology for distributed and agent-based applications. Peter Van Roy, Ph.D.: He obtained an engineering degree from the Vrije Universiteit Brussel (1983), Masters and Ph.D. degrees from the University of California at Berkeley (1984, 1990), and the Habilitation à Diriger des Recherches from Paris VII Denis Diderot (1996). He has made major contributions to logic language implementation. His research showed for the first time that Prolog can be implemented with the same execution efficiency as C. He was principal developer or codeveloper of Aquarius Prolog, Wild_Life, Logical State Threads, and FractaSketch. He joined the Oz project in 1994 and is currently working on Distributed Oz. His research interests are motivated by the desire to provide increased expressivity and efficiency to application developers. Per Brand: He is a researcher at the Swedish Institute of Computer Science. He has previously worked on the design and implementation of OR-parallel Prolog (the Aurora project) and optimized compilation techniques for Concurrent Constraint Programming Languages (in particular, AKL). He has been a member of the Distributed Oz design team since the project began. His research interests are focused on techniques, languages, and methodology for distributed programming. Christian Schulte: He studied computer science at the University of Karlsruhe, Germany, from 1987 to 1992 where he received his diploma. Since 1992 he has been a member of the Programming Systems Lab at DFKI. He is one of the principal designers of Oz. 
His research interests include design, implementation, and application of concurrent and distributed programming languages as well as constraint programming.

9.
The Multi-Agent Distributed Goal Satisfaction (MADGS) system facilitates distributed mission planning and execution in complex dynamic environments with a focus on distributed goal planning and satisfaction and mixed-initiative interactions with the human user. By understanding the fundamental technical challenges faced by our commanders on and off the battlefield, we can help ease the burden of decision-making. MADGS lays the foundations for retrieving, analyzing, synthesizing, and disseminating information to commanders. In this paper, we present an overview of the MADGS architecture and discuss the key components that formed our initial prototype and testbed.
Eugene Santos, Jr. received the B.S. degree in mathematics and computer science and the M.S. degree in mathematics (specializing in numerical analysis) from Youngstown State University, Youngstown, OH, in 1985 and 1986, respectively, and the Sc.M. and Ph.D. degrees in computer science from Brown University, Providence, RI, in 1988 and 1992, respectively. He is currently a Professor of Engineering at the Thayer School of Engineering, Dartmouth College, Hanover, NH, and Director of the Distributed Information and Intelligence Analysis Group (DI2AG). Previously, he was on the faculty at the Air Force Institute of Technology, Wright-Patterson AFB and the University of Connecticut, Storrs, CT. He has over 130 refereed technical publications and specializes in modern statistical and probabilistic methods with applications to intelligent systems, multi-agent systems, uncertain reasoning, planning and optimization, and decision science. Most recently, he has pioneered new research on user and adversarial behavioral modeling. He is an Associate Editor for the IEEE Transactions on Systems, Man, and Cybernetics: Part B and the International Journal of Image and Graphics.
Scott DeLoach is currently an Associate Professor in the Department of Computing and Information Sciences at Kansas State University. His current research interests include autonomous cooperative robotics, adaptive multiagent systems, and agent-oriented software engineering. Prior to coming to Kansas State, Dr. DeLoach spent 20 years in the US Air Force, with his last assignment being as an Assistant Professor of Computer Science and Engineering at the Air Force Institute of Technology. Dr. DeLoach received his BS in Computer Engineering from Iowa State University in 1982 and his MS and PhD in Computer Engineering from the Air Force Institute of Technology in 1987 and 1996.
Michael T. Cox is a senior scientist in the Intelligent Distributing Computing Department of BBN Technologies, Cambridge, MA. Prior to this position, Dr. Cox was an assistant professor in the Department of Computer Science & Engineering at Wright State University, Dayton, Ohio, where he was the director of Wright State’s Collaboration and Cognition Laboratory. He received his Ph.D. in Computer Science from the Georgia Institute of Technology, Atlanta, in 1996 and his undergraduate degree from the same institution in 1986. From 1996 to 1998, he was a postdoctoral fellow in the Computer Science Department at Carnegie Mellon University in Pittsburgh working on the PRODIGY project. His research interests include case-based reasoning, collaborative mixed-initiative planning, intelligent agents, understanding (situation assessment), introspection, and learning. More specifically, he is interested in how goals interact with and influence these broader cognitive processes. His approach to research follows both artificial intelligence and cognitive science directions.

10.
Summary: The problem of fault-tolerant agreement is fundamental to distributed computing. When agreement is to be reached in spite of arbitrary behavior by faulty processors, this problem is called Distributed Consensus. By requiring that the number of faulty processors be , where n is the number of processors in the system, we are able to derive two new protocols for Distributed Consensus. Both are simple and use messages that are only one bit in length, and both provide for early stopping: the fewer failures there are, the fewer rounds of communication are required. One protocol is optimal with respect to the number of rounds of communication required, and the other is asymptotically optimal with respect to the total number of message bits exchanged.
James E. Burns received the B.S. degree in mathematics from the California Institute of Technology, the M.B.I.S. degree from Georgia State University, and the M.S. and Ph.D. degrees in information and computer science from the Georgia Institute of Technology. He served on the faculty of Computer Science at Indiana University and the College of Computing at the Georgia Institute of Technology before joining Bellcore in 1993. He is currently a Member of Technical Staff in the Network Control Research Department, where he is studying the telephone control network with special interest in behavior when faults occur. He also has research interests in theoretical issues of distributed and parallel computing, especially relating to problems of synchronization and fault tolerance.
Gil Neiger was born on February 19, 1957 in New York, New York. In June 1979, he received an A.B. in Mathematics and Psycholinguistics from Brown University in Providence, Rhode Island. In February 1985, he spent two weeks picking cotton in Nicaragua in a brigade of international volunteers. In January 1986, he received an M.S. in Computer Science from Cornell University in Ithaca, New York and, in August 1988, he received a Ph.D. in Computer Science, also from Cornell University. On August 20, 1988, Dr. Neiger married Hilary Lombard in Lansing, New York. Since August 1988, he has been an Assistant Professor in the College of Computing (formerly School of Information and Computer Science) at the Georgia Institute of Technology in Atlanta, Georgia. Dr. Neiger is a member of the editorial board of the Chicago Journal of Theoretical Computer Science and the Journal of Parallel and Distributed Computing.
This author was supported in part by the National Science Foundation under grants CCR-8909663, CCR-9106627, and CCR-9301454.
