Similar literature
 20 similar documents found; search time 499 ms
1.

Fault tree analysis is still widely practiced in high-hazard industries. In this article we propose an algorithm for reducing fault tree expressions that are generated from automata representations of failure behaviors. Automata formalisms are increasingly used to describe systems exhibiting sequence-dependent failures (i.e., the overall outcome, such as a total failure of the system, can depend on the order in which events occur). A set of paths leading to a safety-relevant state is encoded in a standard sum-of-products canonical form, without any loss of the significance of the sequencing of events. That is, the corresponding fault tree expression is essentially a Boolean formula extended with the necessary temporal features (event occurrence priority). Such expressions can then be reduced to minimal canonical forms by using Boolean methods together with the required temporal logic calculus. Since minimal failure sequences can be determined from the reduced models, the proposed approach can improve the analysis of the dynamic effects of the sequencing of faults and propagated errors in such models. As a consequence, it can have a positive impact on the design of failure prevention measures. A fault-tolerant example system exhibiting dynamic behavior is used to highlight the benefits of the approach.


2.
Fault tree analysis with fuzzy gates   Total citations: 4, self-citations: 0, citations by others: 4
Fault tree analysis is an important tool for analyzing system reliability. Fault trees consist of gates and events; gates represent the relationships between events. In fault tree analysis, AND and OR gates are the typical gates, but it is often difficult to model the system structure with these two gates alone, because in many cases exact knowledge of the system failure mechanism is not available in the early design stage. In this paper, we apply fuzzy set theory to model the fuzzy system structure, and we propose a new procedure for calculating system reliability and a new importance index for basic events.

3.
A simple method to derive minimal cut sets for a non-coherent fault tree   Total citations: 3, self-citations: 0, citations by others: 3
Minimal cut sets (or prime implicants: minimal combinations of basic event conditions leading to system failure) are important information for reliability/safety analysis and design. To obtain minimal cut sets for general non-coherent fault trees, including negative basic events or multi-valued basic events, a special procedure such as the consensus rule must be applied to the results obtained by logical operations for coherent fault trees, which requires more steps and time. This paper proposes a simple method for a non-coherent fault tree whose top event is represented as an AND combination of monotonic sub-trees. A "monotonic" sub-tree is one that does not have both positive and negative representations for any basic event. It is proven that, in this case, minimal cut sets can be obtained by a conventional method for coherent fault trees. An illustrative example of a simple event tree analysis shows the details and characteristics of the proposed method.
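The "conventional method for coherent fault trees" mentioned above can be illustrated with a minimal sketch. This is not the paper's procedure for non-coherent trees. It assumes a coherent tree with only AND/OR gates, encoded in a hypothetical nested-tuple form where a string is a basic event. It expands the gates bottom-up and applies the absorption law to keep only the minimal cut sets.

```python
from itertools import product


def minimize(cut_sets):
    """Absorption: drop every cut set that strictly contains another one."""
    return {s for s in cut_sets if not any(t < s for t in cut_sets)}


def minimal_cut_sets(node):
    """Return the minimal cut sets of a coherent AND/OR fault tree.

    A node is either a basic-event name (str) or a tuple
    ("AND" | "OR", child, child, ...). This encoding is an assumption
    made for the sketch, not a standard format.
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, *children = node
    child_sets = [minimal_cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child triggers the gate.
        combined = set().union(*child_sets)
    elif gate == "AND":
        # One cut set from every child must occur together.
        combined = {frozenset().union(*combo) for combo in product(*child_sets)}
    else:
        raise ValueError(f"unsupported gate: {gate!r}")
    return minimize(combined)


top = ("AND", ("OR", "A", "B"), ("OR", "A", "C"))
result = minimal_cut_sets(top)
```

For the example tree, the expansion yields {A}, {A, B}, {A, C} and {B, C}; absorption leaves the two minimal cut sets {A} and {B, C}.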

4.
Fault Tree Analysis (FTA) is a well-established and well-understood technique, widely used for dependability evaluation of a wide range of systems. Although many extensions of fault trees have been proposed, they suffer from a variety of shortcomings. In particular, even where software tool support exists, these analyses require a lot of manual effort. Over the past two decades, research has focused on simplifying dependability analysis by looking at how dependability information can be synthesised automatically from system models. This has led to the field of model-based dependability analysis (MBDA). Different tools and techniques have been developed as part of MBDA to automate the generation of dependability analysis artefacts such as fault trees. Firstly, this paper reviews the standard fault tree and its limitations. Secondly, different extensions of standard fault trees are reviewed. Thirdly, this paper reviews a number of prominent MBDA techniques where fault trees are used as a means for system dependability analysis and provides an insight into their working mechanism, applicability, strengths and challenges. Finally, the future outlook for MBDA is outlined, which includes the prospect of developing expert and intelligent systems for dependability analysis of complex open systems under conditions of uncertainty.

5.
Validation of the dependability of distributed systems via fault injection is gaining importance because distributed systems are increasingly used in environments with high dependability requirements. The fact that distributed systems can fail in subtle ways that depend on the state of multiple parts of the system suggests that a global-state-based fault injection mechanism should be used to validate them. However, global-state-based fault injection is challenging, since it is very difficult in practice to maintain the global state of a distributed system at runtime with minimal intrusion into the system execution. We present Loki, a global-state-based fault injector, which has been designed with the goals of low intrusion, high precision, and high flexibility. Loki achieves these goals by utilizing the ideas of partial view of global state, optimistic synchronization, and offline analysis. In Loki, faults are injected based on a partial view of the global state of the system, and a post-runtime analysis is performed to place events and injections into a single global timeline and to discard experiments with incorrect fault injections. Finally, the experiments with correct fault injections are used to estimate user-specified performance and dependability measures. A flexible measure language has been designed that facilitates the specification of a wide range of measures.

6.
The importance of assessing software non-functional properties (NFP) alongside the functional ones is well accepted in the software engineering community. In particular, dependability is an NFP that should be assessed early in the software life-cycle by evaluating the system behaviour under different fault assumptions. Dependability-specific modeling and analysis techniques include, for example, Failure Mode and Effect Analysis for qualitative evaluation, stochastic Petri nets for quantitative evaluation, and fault trees for both forms of evaluation. The Unified Modeling Language (UML) may be specialized for different domains by using the profile mechanism. For example, the MARTE profile extends UML with concepts for modeling and quantitative analysis of real-time and embedded systems (more specifically, for schedulability and performance analysis). This paper proposes to add to MARTE a profile for dependability analysis and modeling (DAM). A case study of an intrusion-tolerant message service offers insight into how the MARTE-DAM profile can be used to derive a stochastic Petri net model for performance and dependability assessment.

7.
The aim of this paper is to assess the reliability of a fault detection and isolation (FDI) scheme and the impact of sensor failure probability on such a scheme. The proposed method is based on a graph-theoretic approach and assumes only knowledge of the system’s structure. For a structured linear system (SLS), we first recall the fault diagnosis conditions when using an observer-based scheme. Then, we deduce the sets of sensors that ensure the validity of these conditions. Next, we carry out a reliability analysis of this diagnosability property based on the sensors’ reliability. Through the assessment of an importance factor, we propose a simple maintenance strategy to maintain the reliability level of the property. The contribution lies in combining dependability and structural analysis to study system properties.

8.
Probabilistic model checking has been used recently to assess, among others, dependability measures for a variety of systems. However, the numerical methods employed, including those supported by model checking tools such as PRISM and MRMC, suffer from the state-space explosion problem. The main alternative is statistical model checking, which uses standard Monte Carlo simulation, but this performs poorly when small probabilities need to be estimated. Therefore, we propose a method based on importance sampling to speed up the simulation process in cases where the failure probabilities are small due to the high speed of the system’s repair units. This setting arises naturally in Markovian models of highly dependable systems. We show that our method compares favourably to standard simulation, to existing importance sampling techniques, and to the numerical techniques of PRISM.
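The general idea of importance sampling for rare events can be illustrated with a toy sketch. It is not the paper's method and does not model a Markovian repair system. It assumes the simplest possible setting: estimating the tail probability P(X > threshold) for X ~ Exp(1), with samples drawn from a heavier-tailed exponential proposal and reweighted by the likelihood ratio. The function name and parameters are hypothetical.

```python
import math
import random


def rare_tail_probability(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ Exp(1) by importance sampling.

    Samples come from a proposal Exp(1/threshold), under which the rare
    region is hit often; each hit is weighted by f(x) / g(x), where f is
    the target density and g the proposal density.
    """
    rng = random.Random(seed)
    proposal_rate = 1.0 / threshold
    total = 0.0
    for _ in range(n_samples):
        x = rng.expovariate(proposal_rate)
        if x > threshold:
            target_density = math.exp(-x)
            proposal_density = proposal_rate * math.exp(-proposal_rate * x)
            total += target_density / proposal_density
    return total / n_samples


estimate = rare_tail_probability(threshold=20.0, n_samples=100_000)
exact = math.exp(-20.0)  # closed form for Exp(1), used only as a check
```

Plain Monte Carlo with the same number of samples would almost certainly observe no event at all here, since the exact probability is about 2e-9. That is the weakness of standard simulation that the abstract refers to.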

9.
In a hot-standby replication system, the system can no longer process its tasks once all replicated nodes have failed. Thus, the remaining live nodes should be well protected against failure when some of the replicated nodes have failed. Design faults and system-specific weaknesses may cause chain reactions of common faults on identical replicated nodes in replication systems. These can be alleviated by replicating diverse hardware and software. Going one step further, failures on the remaining nodes can be suppressed by predicting and preventing a fault once it has occurred on a replicated node. In this paper, we propose a fault avoidance scheme that increases system dependability by avoiding common faults on the remaining nodes when some nodes fail, and we analyze the resulting system dependability.

10.
In this paper we describe the process of a multi-disciplinary medical team meeting (MDTM), its functions and its operation in co-located and teleconference discussions. Our goal is to identify the elements and mechanics of operation that enhance or threaten the dependability of the MDTM as a “system” and to propose technologies and measures to make this system more reliable. In particular, we assess the effect of adding teleconferencing to the MDTM, and identify strengths and vulnerabilities introduced into the system by the addition of teleconferencing technology. We show that, with respect to the system’s external task environment, rhythms of execution of pre-meeting and post-meeting activities are critical for MDTM success, and that the extension of the MDTM to wider geographic locations with teleconferencing might disrupt such rhythms, thereby posing potential threats to dependability. On the other hand, an analysis of vocalisation patterns demonstrates that, despite difficulties related to coordination and awareness in video-mediated communication (evidenced by increased time spent in case discussion, longer turns, decreased turn frequency and a near lack of informal exchanges), the overall case discussion structure is unaffected by the addition of teleconferencing technology to the proceedings.

11.
In this paper, we propose and evaluate a framework for fault-tolerant workflow execution in Grid environments. Unlike previous work in the literature, our system dynamically chooses an appropriate fault tolerance technique using a user-defined rule-based system. We also provide a generic interface that can be used to add fault tolerance techniques to the framework. The results obtained with real workflows in an experimental Grid environment show that the overhead introduced by our framework in a failure-free execution is, in the worst evaluated case, approximately 10%. Moreover, we show that, using our framework, workflows are able to execute successfully in the presence of failures and that the framework can dynamically choose an appropriate fault tolerance technique. The main contributions of our work are twofold: the developed framework and the model-based dependability analysis we performed on it. The purpose of the model-based dependability analysis is to evaluate the interaction between our framework and the distributed Grid environment beyond the physical limitations of an empirical evaluation. By doing this, we provide a means to plan the assurance of QoS in Grid resource allocation, while applying the fault-tolerance mechanisms implemented in our framework regardless of the underlying middleware.

12.
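A new qualitative analysis method for fault trees   Total citations: 4, self-citations: 1, citations by others: 3
An analysis method based on cut sequence sets is proposed to study the dynamic behavior of basic events when the top event of a fault tree occurs. A sequential failure symbol is used to express the sequential failure relationships between events, and static and dynamic gates are converted into sequential failure expressions that describe the dynamic behavior of the various gates in a fault tree. These expressions are then used to construct the cut sequence set of the fault tree. The flow of the cut sequence set generation algorithm is illustrated with an example. The algorithm represents failure behavior as ordered sequences of components whose length is less than the number of components in the system, providing a new qualitative analysis method for fault trees.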

13.
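Research on high-dependability assurance technology for safety-critical systems   Total citations: 5, self-citations: 0, citations by others: 5
1 Introduction. Safety-critical systems (SCS) are systems whose functional failure would cause major loss of life or property and could severely damage the environment. Such systems are widespread in safety-critical domains such as aerospace, national defense, transportation, nuclear power and energy, and health care. Ultradependability refers to the ability of a system, given its availability at the start of a mission, to be usable and to complete its specified functions within a specified time and environment, that is, the system's ability to "succeed whenever it is put into action". With the rapid development of modern society and the presence of destabilizing factors, safety-critical systems are becoming ever larger and more complex, which brings lower reliability and safety, higher investment, longer development cycles, and greater risk. The application environments of safety-critical systems are also becoming more complex and harsh: from land and sea to the air and outer space, their operating environments keep expanding and growing more severe. Harsh environments pose a serious challenge to achieving combined properties such as high reliability and high safety. In addition, the continuous fault-free mission required of such systems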

14.
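This paper briefly describes two methods, case-based reasoning and fault tree diagnosis, and proposes fusing them for diagnosis. On the one hand, basic events that are obtained through case-based reasoning but do not exist in the fault tree are added to the fault tree, improving fault coverage. On the other hand, the importance of the fault tree's minimal cut sets, obtained statistically from the case base, makes fault localization more precise, and successful diagnoses completed by fault tree analysis are in turn added to the case base.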

15.
A dependability model for TMR system   Total citations: 1, self-citations: 0, citations by others: 1
Much research has been done on the dependability evaluation of computer systems. However, much of this work has gone no further than the study of the fault coverage of such systems, with little focus on the relationship between fault coverage and overall system dependability. In this paper, a Markovian dependability model for a triple-modular-redundancy (TMR) system is presented. The model fully considers the effects of fault coverage, working time, and the constant failure rate of a single module on the dependability of the target TMR system, and it is built on a stepwise degradation strategy. Through the model, the relationship between fault coverage and the dependability of the system is determined. Moreover, the dependability of the system can be predicted dynamically and precisely at any given time for a given fault coverage. This benefits dependability evaluation and improvement, and helps with system design and maintenance.
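How fault coverage enters such a model can be illustrated with a deliberately simplified sketch. It is not the paper's stepwise degradation model. It assumes a three-state Markov chain (three good modules, two good modules, failed), a constant per-module failure rate, and a coverage factor c: the probability that the first module failure is handled correctly, so the system continues on two modules. Under these assumptions the reliability has the closed form R(t) = e^(-3λt) + 3c(e^(-2λt) - e^(-3λt)). The function and parameter names are hypothetical.

```python
import math


def tmr_reliability(failure_rate, t, coverage):
    """Reliability of a simplified TMR system at time t.

    States: 3 good modules -> 2 good modules -> failed.
    The first failure (rate 3 * failure_rate) is covered with probability
    `coverage`; an uncovered failure, or a second failure (rate
    2 * failure_rate), brings the system down. This chain is an assumption
    made for illustration only.
    """
    p_three_good = math.exp(-3.0 * failure_rate * t)
    p_two_good = 3.0 * coverage * (math.exp(-2.0 * failure_rate * t) - p_three_good)
    return p_three_good + p_two_good


# Reliability after 500 hours for a module failure rate of 1e-3 per hour.
for c in (1.0, 0.95, 0.8):
    print(f"coverage={c:.2f}  R(500)={tmr_reliability(1e-3, 500.0, c):.4f}")
```

With perfect coverage (c = 1) the expression reduces to the classic TMR formula 3R² - 2R³ with R = e^(-λt), and with c = 0 it reduces to e^(-3λt), the probability that no module has failed.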

16.
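Fault injection is an effective technique in BIT software testing. To address problems encountered in board-level BIT software testing, this paper presents a method for simulating processor faults based on the open-source emulator QEMU. The method builds simulation models of a variety of processor faults and extends QEMU with a fault behavior simulation module and a fault injection module, yielding BitVaSim, a system-level emulator with processor fault injection capability. First, processor functional failure modes are analyzed, key-value pairs describing the faults are extracted, and faults are defined with XML Schema and used for fault modeling. Second, the QEMU code is extended to simulate processor fault behavior. Third, by configuring the fault injection interface, the emulator supports fault mode matching at run time and condition-triggered faults. Finally, experimental cases are used to observe the emulator's fault behavior and to evaluate this emulator-based fault injection technique. The experimental process and results show that the method is effective and feasible.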

17.
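Dynamic generation of safety test cases based on minimal cut sets   Total citations: 1, self-citations: 0, citations by others: 1
Using the principles and methods of fault trees, this paper studies the dynamic generation of safety test cases based on the minimal cut sets of a fault tree. It first presents the concepts and mathematical descriptions of fault trees and their minimal cut sets, then gives an algorithm for generating the minimal cut sets of a fault tree, and finally, on this basis, proposes an algorithm for dynamically generating safety test cases from those minimal cut sets.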

18.
We present in this paper a study on fault management in a grid middleware. The middleware is our home-grown software called P2P-MPI. This framework is MPJ compliant, allows users to execute message-passing parallel programs, and aims to support environments using commodity hardware. Hence, running programs is failure prone and particular attention must be paid to fault management. Fault management covers two issues: fault tolerance and fault detection. Fault tolerance deals with the program execution: P2P-MPI provides a transparent fault tolerance facility based on replication of computations. Fault detection concerns the monitoring of the program execution by the system. The monitoring is done through a distributed set of modules called failure detectors. The contribution of this paper is twofold. The first contribution is the evaluation of the failure probability of an application depending on the replication degree. The failure probability depends on the execution length, and we propose a model to evaluate the duration of a replicated parallel program. Then, we give an expression of the replication degree required to keep the failure probability of an execution under a given threshold. The second contribution is a study of the advantages and drawbacks of several fault detection systems found in the literature. The criteria of our evaluation are the reliability of the failure detection service and the failure detection speed. We retain the binary round-robin protocol for its failure detection speed, and we propose a variant of this protocol which is more reliable than the application execution in any case. Experiments involving up to 256 processes, carried out on Grid’5000, show that the real detection times closely match the predictions.
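The relationship between replication degree and failure probability can be illustrated with a simplified sketch. It is not the paper's model, which also accounts for execution length. It assumes that every logical process has the same number of replicas, that replicas fail independently with a fixed probability over the run, and that the application fails as soon as any logical process has lost all of its replicas. All names are hypothetical.

```python
def application_failure_probability(n_processes, replication_degree, replica_failure_prob):
    """P(application fails) = 1 - (1 - p**r)**n under the stated assumptions."""
    all_replicas_fail = replica_failure_prob ** replication_degree
    return 1.0 - (1.0 - all_replicas_fail) ** n_processes


def min_replication_degree(n_processes, replica_failure_prob, threshold, max_degree=64):
    """Smallest replication degree keeping the failure probability <= threshold."""
    for degree in range(1, max_degree + 1):
        p_fail = application_failure_probability(n_processes, degree, replica_failure_prob)
        if p_fail <= threshold:
            return degree
    raise ValueError("threshold not reachable within max_degree replicas")


# 256 processes, each replica fails with probability 0.01 during the run.
degree = min_replication_degree(256, 0.01, threshold=1e-3)
```

For these example numbers, one replica gives a failure probability above 0.9, two replicas give roughly 0.025, and three replicas bring it to about 2.6e-4, so the function returns a degree of 3.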

19.
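FTA-based analysis of coal mine gas accidents   Total citations: 1, self-citations: 0, citations by others: 1
The steps of fault tree (accident tree) analysis are given, a fault tree for coal mine gas accidents is drawn, and the fault tree analysis method is used to analyze gas accidents logically. Using Boolean algebra, the minimal cut sets and minimal path sets of coal mine gas explosion accidents are found. By solving for the minimal cut sets (minimal path sets), the structural importance of the basic events is determined, which reveals the degree of danger and safety of the mine as well as the combinations and relative importance of the basic cause events that lead to gas accidents. Measures for preventing gas accidents are given.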

20.
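A dynamic fault detection algorithm for grid environments   Total citations: 6, self-citations: 0, citations by others: 6
To address the high error probability of existing grid systems and the inability of existing fault detection algorithms to meet their needs effectively, a dynamic fault detection algorithm for grid environments is proposed. Based on the characteristics of grid systems and the idea of unreliable failure detection, a grid system model and a fault detection model are established. Combining the heartbeat strategy with grey prediction, a dynamic heartbeat mechanism is designed, and a prediction model and a real-time prediction strategy are given. A grid fault detection algorithm based on this dynamic heartbeat mechanism is proposed, and its reliability is analyzed. Simulation results show that the algorithm is correct and effective and can be used for fault detection in grid environments.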
