Similar literature
1.
This paper demonstrates that the fault tolerance of distributed control systems (DCSs) can be improved by scheduling the processes that represent functional segments while guaranteeing the operation of checkpoint-based process re-execution and parallel execution mechanisms. Moreover, we suggest a methodological approach to assessing the fault tolerance level of DCSs, which proceeds from the probabilistic modeling of systems with a time-triggered architecture (TTA). Finally, we derive numerical formulas for qualitative and quantitative estimation of the fault tolerance level for different modifications of DCSs at the design stage.

2.
A scalable model and methods of resource co-allocation are proposed for organizing data processing in distributed systems by families of basic plans (strategies). The strategies are multilevel because they are designed for structurally different but functionally equivalent models of the same job, which is a complex set of interrelated tasks. A concrete basic plan of computations is selected depending on the time parameters of control events that occur in the system and that relate, first of all, to the load and to the changing composition of the heterogeneous computational nodes.

3.
A version control mechanism is proposed that enhances the modularity and extensibility of multiversion concurrency control algorithms. The multiversion algorithms are decoupled into two components: version control and concurrency control. This permits modular development of multiversion protocols and simplifies the task of proving the correctness of these protocols. A set of procedures for version control is described that defines the interface with the version control component. It is shown that the same interface can be used by the database actions of both two-phase locking and time-stamp concurrency control protocols to access multiversion data. An interesting feature of the framework is that the execution of read-only transactions becomes completely independent of the underlying concurrency control implementation. Unlike other multiversion algorithms, read-only transactions in this scheme do not modify any version-related information, and therefore do not interfere with the execution of read-write transactions. The extension of the multiversion algorithms to a distributed environment becomes very simple.
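The separation described in this abstract can be pictured with a minimal sketch. It is an illustration only, not the paper's interface: the class `VersionStore` and its methods `install`, `begin_read_only`, and `read_as_of` are hypothetical names, and a single global commit counter is assumed. The point it shows is that a read-only transaction can pick a snapshot number and read from it without writing any version-related state.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class VersionStore:
    """Hypothetical version-control component (names are assumptions)."""
    commit_counter: int = 0
    # key -> ascending commit numbers, and key -> values in the same order
    _stamps: dict = field(default_factory=dict)
    _values: dict = field(default_factory=dict)

    def install(self, writes: dict) -> int:
        """Called by whichever concurrency-control protocol commits a writer."""
        self.commit_counter += 1
        for key, value in writes.items():
            self._stamps.setdefault(key, []).append(self.commit_counter)
            self._values.setdefault(key, []).append(value)
        return self.commit_counter

    def begin_read_only(self) -> int:
        """A read-only transaction only reads the current commit number."""
        return self.commit_counter

    def read_as_of(self, key, snapshot: int):
        """Newest version with commit number <= snapshot; modifies nothing."""
        stamps = self._stamps.get(key, [])
        idx = bisect.bisect_right(stamps, snapshot)
        return self._values[key][idx - 1] if idx else None
```

In this sketch a reader that called `begin_read_only()` before a later `install` keeps seeing the older value, whichever locking or time-stamp protocol ordered the writers.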

4.
Journal of Scheduling - We present a Work Stealing scheduling algorithm that provably avoids most synchronization overheads by keeping processors’ deques entirely private by default and only...

5.
We investigate the possibility of solving problems in completely asynchronous message passing systems where a number of processes may fail prior to execution. By using game-theoretical notions, necessary and sufficient conditions are provided for solving problems in such a model with and without a termination requirement. An upper bound on the message complexity for solving any problem in the model is given, as well as a simple design concept for constructing a solution to any solvable problem. Supported in part by the Guttwirth Fellowship, by the National Science Foundation under grant CCR-8405478, and by the Hebrew Technical Institute scholarship. Supported in part by Technion V.P.R. fund - C. Wellner Research fund.

6.
Summary. In a distributed system, high-level actions can be modeled by nonatomic events. This paper proposes causality relations between distributed nonatomic events and provides efficient testing conditions for the relations. The relations provide a fine-grained granularity to specify causality relations between distributed nonatomic events. The set of relations between nonatomic events is complete in first-order predicate logic, using only the causality relation between atomic events. For a pair of distributed nonatomic events X and Y, the evaluation of any of the causality relations requires integer comparisons, where and , respectively, are the number of nodes on which the two nonatomic events X and Y occur. In this paper, we show that this polynomial complexity of evaluation can be simplified to a linear complexity using properties of partial orders. Specifically, we show that most relations can be evaluated in integer comparisons, some in integer comparisons, and the others in integer comparisons. During the derivation of the efficient testing conditions, we also define special system execution prefixes associated with distributed nonatomic events and examine their knowledge-theoretic significance. Received: July 1997 / Accepted: May 1998
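As a rough illustration of what evaluating such a relation involves, the sketch below tests one relation, "every atomic event of X happens before every atomic event of Y", using standard vector clocks. This is an assumption-laden sketch, not the paper's testing conditions: the names `Atomic`, `happens_before`, and `all_precede_all` are hypothetical, and the test is the naive one, whose cost grows with the product of the number of nodes of X and the number of nodes of Y.

```python
from typing import NamedTuple


class Atomic(NamedTuple):
    node: int     # index of the node on which the atomic event occurred
    clock: tuple  # vector timestamp assigned by a standard vector clock


def happens_before(e: Atomic, f: Atomic) -> bool:
    """Standard vector-clock test, guarded so an event never precedes itself."""
    return e.clock[e.node] <= f.clock[e.node] and e.clock != f.clock


def per_node_extreme(events, better):
    """Keep one event per node: the one whose own clock component wins."""
    chosen = {}
    for e in events:
        cur = chosen.get(e.node)
        if cur is None or better(e.clock[e.node], cur.clock[cur.node]):
            chosen[e.node] = e
    return list(chosen.values())


def all_precede_all(x_events, y_events) -> bool:
    """Naive test: latest event of X per node against earliest of Y per node."""
    last_x = per_node_extreme(x_events, lambda a, b: a > b)
    first_y = per_node_extreme(y_events, lambda a, b: a < b)
    return all(happens_before(e, f) for e in last_x for f in first_y)
```

Only the per-node extremes are compared because, on a single node, events are totally ordered: if the latest event of X on a node precedes some event, every earlier event of X on that node does too.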

7.
This paper describes special aspects of MIMD parallelization in SUPERB. SUPERB is an interactive SIMD/MIMD parallelizing system for the SUPRENUM machine. The main topic of this paper is the updating of distributed variables in parallelized applications. The intended applications perform local computations on a large data domain.

8.
Multimedia synchronization is the essential technology for the integration of multimedia in distributed multimedia systems. The multimedia synchronization model has been recognized by many researchers as a premise of the implementation of multimedia synchronization. In distributed multimedia systems, the characteristic of multimedia synchronization is dynamic, and the key medium has the priority in multimedia synchronization. The previously proposed multimedia synchronization models cannot meet these requirements. So a new multimedia dynamic synchronization model, DSPN, based on the timed Petri net, has been designed in this paper. This model can not only let the distributed multimedia system keep multimedia synchronization in a more precise and effective manner according to the runtime situation of the system, but also allow the user to interact with the presentation of multimedia.

9.
A simple derivation of a general solution to the problem of detecting the termination of a distributed computation is presented.

10.
The fault-tolerance of distributed algorithms is investigated in asynchronous message passing systems with undetectable process failures. Two specific synchronization problems are considered, the dining philosophers problem and the binary committee coordination problem. The abstraction of a bounded doorway is introduced as a general mechanism for achieving individual progress and good failure locality. Using it as a building block, optimal fault-tolerant algorithms are constructed for the two problems.

11.
In distributed computer systems, processors often need to be synchronized to maintain correctness and consistency. Unlike shared-memory parallel systems, the lack of shared memory and a clock considerably complicates the task of synchronization in distributed systems. The objective of this article is two-fold: (1) We present a new randomized agreement algorithm to synchronize cooperating processors in a distributed system. This algorithm achieves the desired agreement in an expected five rounds of message exchanges, tolerating failures of at most one-fifth of the processors. The algorithm belongs to the class of broadcast-based synchronization problems. (2) We present a new self-stabilization algorithm for acyclic directed-graph-structured distributed systems. This new fault-tolerant algorithm survives all imaginable faults in distributed systems. The algorithm belongs to arbiter-based and broadcast-based synchronization problems.

12.
A distributed system can support fault-tolerant applications by replicating data and computation at nodes that have independent failure modes. We present a scheme called parallel execution threads (PET) which can be used to implement fault-tolerant computations in an object-based distributed system. In a system that replicates objects, the PET scheme can be used to replicate a computation by creating a number of parallel threads which execute with different replicas of the invoked objects. A computation can be completed successfully if at least one thread does not encounter any failed nodes and its completion preserves the consistency of the objects. The PET scheme can tolerate failures that occur during the execution of the computation as long as all threads are not affected by the failures. We present the algorithms required to implement the PET scheme and also address some performance issues. Mustaque Ahamad received his B.E. (Hons.) degree in Electrical Engineering from the Birla Institute of Technology and Science, Pilani, India. He obtained his M.S. and Ph.D. degrees in Computer Science from the State University of New York at Stony Brook in 1983 and 1985 respectively. Since September 1985, he is an Assistant Professor in the School of Information and Computer Science at the Georgia Institute of Technology, Atlanta. His research interests include distributed operating systems, distributed algorithms, fault-tolerant systems and performance evaluation. Partha Dasgupta is an Assistant Professor at Georgia Tech since 1984. He has a Ph.D. in Computer Science from the State University of New York at Stony Brook. He is the technical project director of the Clouds distributed operating systems project, as well as a coprincipal investigator of Georgia Tech's NSF-CER award. His research interests include building distributed operating systems, distributed algorithms, fault-tolerant systems and distributed programming support. Richard J. LeBlanc, Jr. received the B.S. degree in physics from Louisiana State University in 1972 and the M.S. and Ph.D. degrees in computer sciences from the University of Wisconsin-Madison in 1974 and 1977, respectively. He is currently a Professor in the School of Information and Computer Science of the Georgia Institute of Technology. His research interests include programming language design and implementation, programming environments, and software engineering. Dr. LeBlanc's current research work involves application of these interests in distributed processing systems. As co-director of the Clouds Project, he is studying language concepts and software engineering methodology for utilizing a highly reliable, object-based distributed system. He is also interested in specification-based software development methodologies and tools. Dr. LeBlanc is a member of the Association for Computing Machinery, the IEEE Computer Society and Sigma Xi. This work was supported in part by NSF grants CCR-8619886 and CCR-8806358, and RADC contract number F30602-86-C-0032.

13.
Summary. A useless checkpoint is a local checkpoint that cannot be part of a consistent global checkpoint. This paper addresses the following problem. Given a set of processes that take (basic) local checkpoints in an independent and unknown way, the problem is to design communication-induced checkpointing protocols that direct processes to take additional local (forced) checkpoints to ensure no local checkpoint is useless. The paper first proves two properties related to integer timestamps which are associated with each local checkpoint. The first property is a necessary and sufficient condition that these timestamps must satisfy for no checkpoint to be useless. The second property provides an easy timestamp-based determination of consistent global checkpoints. Then, a general communication-induced checkpointing protocol is proposed. This protocol, derived from the two previous properties, actually defines a family of timestamp-based communication-induced checkpointing protocols. It is shown that several existing checkpointing protocols for the same problem are particular instances of the general protocol. The design of this general protocol is motivated by the use of communication-induced checkpointing protocols in “consistent global checkpoint”-based distributed applications such as the detection of stable or unstable properties and the determination of distributed breakpoints. Received: July 1997 / Accepted: August 1999
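One simple timestamp-based rule of this kind can be sketched as follows. The rule shown is the one usually attributed to Briatico, Ciuffoletti, and Simoncini; treating it as one of the "existing protocols" the abstract refers to is an assumption, and the names `Process`, `basic_checkpoint`, `send`, `receive`, and `first_at_or_after` are hypothetical. Each process keeps an integer timestamp, piggybacks it on every message, and takes a forced checkpoint before delivering a message that carries a larger timestamp.

```python
class Process:
    """One process in a sketch of a timestamp-based checkpointing rule."""

    def __init__(self, pid):
        self.pid = pid
        self.ts = 0                           # integer timestamp
        self.checkpoints = [(0, "initial")]   # (timestamp, kind), in order

    def basic_checkpoint(self):
        """Checkpoint taken independently by the application."""
        self.ts += 1
        self.checkpoints.append((self.ts, "basic"))

    def send(self, payload):
        """Piggyback the current timestamp on the outgoing message."""
        return (self.ts, payload)

    def receive(self, message):
        """Take a forced checkpoint before delivery if the sender is ahead."""
        msg_ts, payload = message
        if msg_ts > self.ts:
            self.ts = msg_ts
            self.checkpoints.append((self.ts, "forced"))
        return payload


def first_at_or_after(process, t):
    """First checkpoint of `process` with timestamp >= t, or None if absent."""
    for ts, kind in process.checkpoints:
        if ts >= t:
            return (ts, kind)
    return None
```

Under this rule, picking `first_at_or_after(p, t)` for every process `p` and a common value `t` gives a timestamp-based way to assemble a global checkpoint, with the current state standing in for any process that returns `None`.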

14.
Fault-tolerant clock synchronization in distributed systems   Total citations: 2, self-citations: 0, citations by others: 2
Ramanathan P., Shin K.G., Butler R.W. Computer, 1990, 23(10): 33-42
Existing fault-tolerant clock synchronization algorithms are compared and contrasted. These include the following: software synchronization algorithms, such as convergence-averaging, convergence-nonaveraging, and consistency algorithms, as well as probabilistic synchronization; hardware synchronization algorithms; and hybrid synchronization. The worst-case clock skews guaranteed by representative algorithms are compared, along with other important aspects such as time, message, and cost overhead imposed by the algorithms. More recent developments such as hardware-assisted software synchronization and algorithms for synchronizing large, partially connected distributed systems are especially emphasized.
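To give a feel for the convergence-averaging family mentioned above, here is a minimal sketch of one correction step in the spirit of the fault-tolerant midpoint rule of Lundelius and Lynch. It is not taken from the survey: the function name `fault_tolerant_midpoint` is hypothetical, clock readings are assumed to have been collected already, and at most `f` of the `n >= 3f + 1` readings are assumed to be faulty.

```python
def fault_tolerant_midpoint(readings, f):
    """Discard the f lowest and f highest readings, return the midpoint of the rest.

    readings: clock values gathered from all n nodes (own clock included).
    f: assumed maximum number of faulty clocks; requires n >= 3f + 1.
    """
    if len(readings) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    trimmed = sorted(readings)[f:len(readings) - f]
    return (trimmed[0] + trimmed[-1]) / 2
```

Trimming `f` values from each end keeps a wildly wrong clock from dragging the result outside the range spanned by correct clocks, which is the property averaging-based schemes rely on.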

15.
Presents and analyzes a new probabilistic clock synchronization algorithm that can guarantee a much smaller bound on the clock skew than most existing algorithms. The algorithm is probabilistic in the sense that the bound on the clock skew that it guarantees has a probability of invalidity associated with it. However, the probability of invalidity may be made extremely small by transmitting a sufficient number of synchronization messages. It is shown that an upper bound on the probability of invalidity decreases exponentially with the number of synchronization messages transmitted. A closed-form expression that relates the probability of invalidity to the clock skew and the number of synchronization messages is also derived.

16.
This paper presents a cost error measurement scheme and a relaxed synchronization method for simulated annealing on a distributed memory multicomputer; the scheme predicts the amount of cost error that an algorithm will tolerate. An adaptive error control method is developed and implemented on an Intel iPSC/2.

17.
A fully distributed and symmetric algorithm for solving the distributed termination problem is presented along with its correctness arguments. The algorithm is very simple and uses neither time-stamps nor clock synchronization.

18.
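This paper analyzes the characteristics of distributed heterogeneous data synchronization, designs a distributed heterogeneous data synchronization model (WLDSS) based on a service-oriented architecture (SOA), proposes a control strategy that stabilizes heterogeneous data synchronization communication based on availability measurement, and analyzes the performance of the resulting system. Experimental data show that the strategy adopted by the model adapts actively to the performance characteristics of different systems and effectively improves the efficiency of distributed heterogeneous data synchronization.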

19.
Cooperative control of distributed multi-interaction virtual scene rendering   Total citations: 1, self-citations: 0, citations by others: 1
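To address the cooperative control of scene rendering in distributed multi-interaction virtual reality systems, an open and flexible distributed multi-interaction virtual reality system structure is constructed, and cooperative interaction technology is integrated into the system design. A system architecture comprising a control platform, a network service platform, and a rendering platform is designed, and an OGRE-based distributed multi-interaction real-time cooperative rendering method is proposed. The method achieves real-time synchronization of the scenes rendered by multiple rendering nodes driven by a single control node, as well as cooperation and interaction among multiple control nodes within the same scene. The results have been applied to the virtual roaming interactive control platform of Hebei University and have broad application prospects.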
