首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
When building dependable systems by integrating untrusted software components that were not originally designed to interact with each other, it is likely the occurrence of architectural mismatches related to assumptions in their failure behaviour. These mismatches, if not prevented during system design, have to be tolerated during runtime. This paper presents an architectural abstraction based on exception handling for structuring fault-tolerant software systems. This abstraction comprises several components and connectors that promote an existing untrusted software element into an idealised fault-tolerant architectural element. Moreover, it is considered in the context of a rigorous software development approach based on formal methods for representing the structure and behaviour of the software architecture. The proposed approach relies on a formal specification and verification for analysing exception propagation, and verifying important dependability properties, such as deadlock freedom, and scenarios of architectural reconfiguration. The formal models are automatically generated using model transformation from UML diagrams: component diagram representing the system structure, and sequence diagrams representing the system behaviour. Finally, the formal models are also used for generating unit and integration test cases that are used for assessing the correctness of the source code. The feasibility of the proposed architectural approach was evaluated on an embedded critical case study. Patrick Brito is supported by Fapesp/Brazil under Grant No. 06/02116–2 and CAPES/Brazil under Grant No. 0722–07–3. Cecília Rubira is partially supported by CNPq/Brazil under Grant Nos. 301446/2006–7 and 484138/2006–5.  相似文献   

2.

Vivid agents (Wagner, 1996) are software-controlled systems whose state comprises the mental components of knowledge, perceptions, tasks, and intentions, and whose behavior is represented by means of action and reaction rules. An execution model for vivid agents is presented, which is based on an architecture for concurrent action and planning (CAP). The concept of vivid agents and the CAP architecture in distributed diagnosis is evaluated, including fault-tolerant diagnosis and the diagnosis of an unreliable communication protocol.  相似文献   

3.
Developers of fault-tolerant distributed systems need to guarantee that fault tolerance mechanisms they build are in themselves reliable. Otherwise, these mechanisms might in the end negatively affect overall system dependability, thus defeating the purpose of introducing fault tolerance into the system. To achieve the desired levels of reliability, mechanisms for detecting and handling errors should be developed rigorously or formally. We present an approach to modeling and verifying fault-tolerant distributed systems that use exception handling as the main fault tolerance mechanism. In the proposed approach, a formal model is employed to specify the structure of a system in terms of cooperating participants that handle exceptions in a coordinated manner, and coordinated atomic actions serve as representatives of mechanisms for exception handling in concurrent systems. We validate the approach through two case studies: (i) a system responsible for managing a production cell, and (ii) a medical control system. In both systems, the proposed approach has helped us to uncover design faults in the form of implicit assumptions and omissions in the original specifications.  相似文献   

4.
This article introduces a new model-based method for incrementally constructing critical systems and illustrates its application to the development of fault-tolerant systems. The method relies on a special form of composition to combine software components and a set of proof rules to obtain high confidence of the correctness of the composed system. As in conventional component-based software development, two (or more) components are combined, but in contrast to many component-based approaches used in practice, which combine components consisting of code, our method combines components represented as state machine models. In the first phase of the method, a model is developed of the normal system behavior, and system properties are shown to hold in the model. In the second phase, a model of the required fault-handling behavior is developed and “or-composed” with the original system model to create a fault-tolerant extension which is, by construction, “fully faithful” (every execution possible in the normal system is possible in the fault-tolerant system). To model the fault-handling behavior, the set of states of the normal system model is extended through new state variables and new ranges for some existing state variables, and new fault-handling transitions are defined. Once constructed, the fault-tolerant extension is shown, using a set of property inheritance and compositional proof rules, to satisfy both the overall system properties, typically weakened, and selected fault-tolerance properties. These rules can often be used to verify the properties automatically. To provide a formal foundation for the method, formal notions of or-composition, partial refinement, fault-tolerant extension, and full faithfulness are introduced. To demonstrate and validate the method, we describe its application to a real-world, fault-tolerant avionics system.  相似文献   

5.
为了增强在构件组装过程中连接件的复用性,根据体系结构层的异常处理技术和理想容错构件的特点,提出了用四个基本连接件及其相互组合来实现构件间的连接,并将这一技术应用到理想容错构件结构中。相关的实践活动表明,通过四个基本连接件的组合可部分实现相关功能模块的功能,增强了连接件的复用,同时还方便了系统的维护和更新,增强了系统的可信性。  相似文献   

6.
当前并发程序容错机制处理方式单一、效率较低。为此,提出一种适用于多种并发程序错误处理的容错机制。通过在编译及运行过程中对程序进行异常处理,并在异常发生时根据设置的检查点对程序进行回滚和防错误处理,以实现并发程序容错。实验结果表明,该容错机制可有效检测并发程序中的错误,在不增加程序总体运行时间的情况下达到比较理想的容错效果。  相似文献   

7.
Model checking as a verification technique has proved effective at the system design and hardware level, and is now beginning to be applied to program code. In this paper, we study the application of model checking techniques in the development of Erlang systems. Erlang is a concurrent functional language with specific support for the development of distributed, fault-tolerant systems with soft real-time requirements. It was designed from the start to support a concurrency-oriented programming paradigm and large distributed implementations that this supports. The methodology we describe in this paper consists of abstracting the behaviour of Erlang and OTP components into a process algebraic specification, specifically an mCRL2 specification, upon which the standard model checker CADP can be used to verify the system’s properties. In addition to rules that model the Erlang syntax, a translation mechanism for the OTP modules gen_server, supervisor and gen_fsm, and the timeout event are defined. A tool-set etomcrl2 has been developed to automate the process of translation. A small illustrative example is used to evaluate the effectiveness of the proposed techniques, and its results show that the proposed techniques are effective in both verifying properties as well as distinguishing between correct and faulty implementations of the design.  相似文献   

8.
A GALS (Globally Asynchronous, Locally Synchronous) system consists of several synchronous components that evolve concurrently and interact with each other asynchronously. The design of GALS systems is tedious and error-prone due to the high degree of synchronous and asynchronous concurrency present in complex architectures. In this paper, we present GRL (GALS Representation Language), a formal language designed to model GALS systems, for the purpose of formal verification of the asynchronous aspects. GRL combines the synchronous reactive model underlying dataflow languages and the asynchronous concurrent model underlying process algebras. We propose a translation from GRL to LNT, a value-passing concurrent language with classical process algebra flavour. This makes possible the analysis of GRL specifications using all the state-of-the-art simulation and verification functionalities provided by the CADP toolbox.  相似文献   

9.
Exception handling enables programmers to specify the behavior of a program when an exceptional event occurs at runtime. Exception handling, thus, facilitates software fault tolerance and the production of reliable and robust software systems. With the recent emergence of multi-processor systems and parallel programming constructs, techniques are needed that provide exception handling support in these environments that are intuitive and easy to use. Unfortunately, extant semantics of exception handling for concurrent settings is significantly more complex to reason about than their serial counterparts.In this paper, we investigate a similarly intuitive semantics for exception handling for the future parallel programming construct in Java. Futures are used by programmers to identify potentially asynchronous computations and to introduce parallelism into sequential programs. The intent of futures is to provide some performance benefits through the use of method-level concurrency while maintaining as-if-serial semantics that novice programmers can easily understand — the semantics of a program with futures is the same as that for an equivalent serial version of the program. We extend this model to provide as-if-serial exception handling semantics. Using this model our runtime delivers exceptions to the same point it would deliver them if the program was executed sequentially. We present the design and implementation of our approach and evaluate its efficiency using an open source Java virtual machine.  相似文献   

10.
A fault-tolerant architectural approach for dependable systems   总被引:2,自引:0,他引:2  
A system's structure enables it to generate its intended behavior from its components' behavior. A well-structured system simplifies relationships among components, which can increase dependability. With software systems, the architecture is an abstraction of the structure. Architectural reasoning about dependability has become increasingly important because emerging applications are increasingly complex. We've developed an architectural approach for effectively representing and analyzing fault-tolerant software systems. The proposed solution relies on exception handling to tolerate faults associated with component and connector failures, architectural mismatches, and configuration faults. Our approach, a specialization of the peer-to-peer architectural style, hides inside the architectural elements the complexities of exception handling and propagation. Our goal is to improve a system's overall reliability and availability by making it tolerant of nonmalicious faults.  相似文献   

11.
This work presents an abstraction called guardian for exception handling in distributed and concurrent systems that use coordinated exception handling. This model addresses two fundamental problems with distributed exception handling in a group of asynchronous processes. The first is to perform recovery when multiple exceptions are concurrently signaled. The second is to determine the correct context in which a process should execute its exception handling actions. Several schemes have been proposed in the past to address these problems. These are based on structuring a distributed program as atomic actions based on conversations or transactions and resolving multiple concurrent exceptions into a single one. The guardian in a distributed program represents the abstraction of a global exception handler, which encapsulates rules for handling concurrent exceptions and directing each process to the semantically correct context for executing its recovery actions. Its programming primitives and the underlying distributed execution model are presented here. In contrast to the existing approaches, this model is more basic and can be used to implement or enhance the existing schemes. Using several examples we illustrate the capabilities of this model. Finally, its advantages and limitations are discussed in contrast to existing approaches.  相似文献   

12.
The state of art in handling and resolving concurrent exceptions is discussed and a brief outline of all research in this area is given. Our intention is to demonstrate that exception resolution is a very useful concept which facilitates joint forward error recovery in concurrent and distributed systems. To do this, several new arguments are considered. We understand resolution as reaching an agreement among cooperating participants of an atomic action. It is provided by the underlying system to make it unified and less error prone, which is important for forward error recovery, complex by nature. We classify atomic action schemes into asynchronous and synchronous ones and discuss exception handling for schemes of both kinds. The paper also deals with introducing atomic action schemes based on exception resolution into existing concurrent and distributed languages, which usually have only local exceptions. We outline the basic approach and demonstrate its applicability by showing how exception resolution can be used in Ada 83, Ada 95 (for both concurrent and distributed systems) and Java. A discussion of ways to make this concept more object-oriented and, with the help of reflection, more flexible and useful, concludes the paper.  相似文献   

13.
Modeling distributed computer systems is known to be a challenging enterprise. Typically, distributed systems are comprised of large numbers of components whose coordination may require complex interactions. Modeling such systems more often than not leads to the nominal intractability of the resulting state space. Various formal methods have been proposed to address the modeling of coordination among distributed systems components. For the most part, however, these methods do not support formal verification mechanisms. By way of contrast, the L-automata/L-processes model supports formal verification mechanisms which in many examples can successfully circumvent state space explosion problems, and allow verification proofs to be extended to an arbitrary number of components. After reviewing L-automata/L-processes formalisms, we present here the formal specification of a fault-tolerant algorithm for a distributed computer system. We also expose the L-automata/L-processes verification of the distributed system, demonstrating how various techniques such as homomorphic reduction, induction, and linearization, can be used to overcome various problems which surface as one models large, complex systems.  相似文献   

14.
Fault-tolerant systems have found wide applications in military,industrial and commercial areas.Most of these systems are constructed by multiple-modular redundancy or error control coding techniques,They need some fault-tolerant specific components (such as voter,switcher,encoder,or decoder) to implement error-detecting or error-correcting functions.However, the problem of error detection location or correction for fault-tolerance specific components them-selves has not been solved properly so far.Thus ,the dependability of a whole fault-tolerant system will be greatly affected.This paper presents a theory of robust fault-masking digital circuits for characterizing fault-tolerant systems with the ability of concurrent error location and a new scheme of dual-modular redundant systems with partially robust fault-masking prperty.A Basic robust fault-masking circuit is composed of a basic functional circuit and an error-locting corrector,Such a circuit not only has the ability of concurrent error correction,but also has the ability of concurrent error location.According to this circuit model ,for a partially robust fault-making dual-modular redundant system,two redundant modules based on alternating-complementary logic consist of the basic functional circuit.An error-correction specific circuit named as alternating-complementary corrector is used as the error-locating corrector.The performance(such as hardware complexity, time delay) of the scheme is analyzed.  相似文献   

15.
A system is fault tolerant if it remains functional after the occurrence of a fault. Given a plant subject to a fault, fault-tolerant control requires the controller to form a fault-tolerant closed-loop system. For the systematic design of a fault-tolerant controller, typical input data consists of the plant dynamics including the effect of the faults under consideration and a formal performance requirement with a possible allowance for degraded performance after the fault. For its obvious practical relevance, the synthesis of fault-tolerant controllers has received extensive attention in the literature, however, with a particular focus on continuous-variable systems. The present paper addresses discrete-event systems and provides an overview on fault-tolerant supervisory control. The discussion is held in terms of formal languages to uniformly present approaches to passive fault-tolerance, active fault-tolerance, post-fault recovery and fault hiding.  相似文献   

16.
Currently available application frameworks that target the automatic design of real-time embedded software are poor in integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques namely software component-based reuse, formal synthesis, and formal verification. Component reuse is based on a formal unified modeling language (UML) real-time embedded object model. Formal synthesis employs quasi-static and quasi-dynamic scheduling with multi-layer portable efficient code generation, which can output either real-time operating systems (RTOS)-specific application code or automatically generated real-time executive with application code. Formal verification integrates a model checker kernel from state graph manipulators (SGM), by adapting it for embedded software. The proposed architecture for VERTAF is component-based which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort as compared to design without VERTAF, which also shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.  相似文献   

17.
We present the UMC framework for the formal analysis of concurrent systems specified by collections of UML state machines. The formal model of a system is given by a doubly labelled transition system, and the logic used to specify its properties is the state-based and event-based logic UCTL. UMC is an on-the-fly analysis framework which allows the user to interactively explore a UML model, to visualize abstract behavioural slices of it and to perform local model checking of UCTL formulae. An automotive scenario from the service-oriented computing (SOC) domain is used as case study to illustrate our approach.  相似文献   

18.
With nowadays popularity of large-scale parallel computers, Multiprocessors System-on-Chip (MP-SoCs), multicomputers, cluster computers and peer-to-peer communication networks, fault-tolerant routing becomes an important issue in developing these systems. Fault-tolerant routing algorithms in such systems aim at providing continuous operations in the presence of one or more failures by allowing the graceful degradation of system. The Software-Based fault-tolerant routing scheme has been suggested as an efficient routing algorithm to preserve both communication performance and fault-tolerant demands in parallel computer systems. To study network performance, a number of different analytical models for fault-free routing algorithms have been proposed in the past literature. However, there has not been reported any similar analytical model of fault-tolerant routing in the presence of faulty components. This paper presents a new analytical modeling approach for determining the effects of failures in wormhole-switched 2-D tori using the fault-tolerant Software-Based scheme. More specifically, we describe a general model to derive mathematical expressions to investigate the performance behavior of routing algorithms confronting convex (|-shaped, □-shaped) or concave (U-shaped, +-shaped, T-shaped, H-shaped) faulty regions. The model is validated through comprehensive simulation experiments for different types of failures.
M. Ould-KhaouaEmail:
  相似文献   

19.
One shortcoming with most AI planning systems has been an inability to deal with execution-time discrepancies between actual and expected situations. Often, these exception situations jeopardize the immediate integrity and safety of the planning agent or its surroundings, with the only recourse being more time-consuming plan generation. In order to avoid such situations, potential exceptions must be predicted during plan execution. Since many application domains (particularly for autonomous systems) are inherently dynamic — in the sense that information is at best incomplete, perhaps erroneous, and changes over time independent of a planning agent's actions — managing action in the world becomes a difficult problem. Action and events in dynamic worlds must be monitored in order to coordinate an agent's actions with its surroundings. This allows the agent to predict and plan for potential future exception situations while acting in the present.This paper introduces an approach to autonomous reaction in dynamic environments. We have avoided the traditional distinction between generating and then executing plans through the use of a dynamic reaction system, which handles potential exception situations gracefully as it carries out assigned tasks. The reaction system manages constraints imposed by ongoing activity in the world, as well as those derived from long-term planning, to control observable behaviour. This approach provides the necessary stimulus/response behaviour required in dynamic situations, while using goal-directed constraints as heuristics for improved reactions.We present an overview of the salient features of dynamic worlds and their impact on traditional planning, introduce our model of dynamic reactivity, describe an implementation of the model and its performance in a dynamic simulation environment, and present an architecture incorporating long-term planning with short-term reactance suitable for autonomous systems applications.  相似文献   

20.
A service-based architecture for dynamically reconfigurable workflows   总被引:2,自引:0,他引:2  
In the last few years, business process management systems have been employed for handling information systems of ever increasing complexity. As a consequence, the adoption of modelling languages enabling smooth and seamless transitions among the various phases of the process lifecycle, the ability of exploiting coordination schema over distributed execution contexts and the support for dynamic evolution and reconfiguration have become software engineering issues of great importance. This paper proposes the use of PN-Engine, a decentralized Petri nets execution engine, as a business process enactment engine. PN-Engine, which is based on the Jini service architecture, supports the decentralized execution of process models specified as Petri nets (PNs) enhanced with modular constructs and offers suitable mechanisms for dealing with the aforementioned design issues. PN-Engine allows to deploy and enact a new version of an existing process model without requiring the stopping/removal of older instances that are still running. The paper presents a novel approach enabling a decentralized migration procedure where concurrent portions of older instances migrate asynchronously to the new process model. Advantages of the proposed approach are demonstrated by means of an example concerning a workflow for a wine-production process.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号