Similar articles
 20 similar articles found (search time: 31 ms)
1.
This paper presents a quantitative reliability analysis of a system designed to tolerate both hardware and software faults. The system achieves integrated fault tolerance by implementing N-version programming (NVP) on redundant hardware. The system analysis considers unrelated software faults, related software faults, transient hardware faults, permanent hardware faults, and imperfect coverage. The overall model is a Markov model in which the states of the Markov chain represent the long-term evolution of the system structure. For each operational configuration, a fault-tree model captures the effects of software faults and transient hardware faults on the task computation. The software fault model is parameterized using experimental data associated with a recent implementation of an NVP system using the current design paradigm. The hardware model is parameterized by considering typical failure rates associated with hardware faults and coverage parameters. The authors' results show that it is important to consider both hardware and software faults in the reliability analysis of an NVP system, since these estimates vary with time. Moreover, the function for error detection and recovery is extremely important to fault-tolerant software. Several orders of magnitude reduction in system unreliability can be observed if this function is provided promptly.

2.
A memory array reliability model is developed that can be applied to a wide range of memory organizations including random-access memories (RAM) and read-only memories (ROM). The model is particularly useful for computing the reliability of fault-tolerant memories that employ techniques such as hardware redundancy, error-correcting codes, and software error-correcting algorithms. The model accommodates the effect of faults masked by data. Reliability models that incorporate the array model are given for a simplex RAM, an N-modular-redundant RAM, a spared RAM, a single-error-correcting RAM, a multiple-error-correcting RAM, and a ROM. Reliability characteristics of these memories are compared. The results suggest that memories with error-correcting capability and spare bit-planes provide the best reliability. Memories with sparing at the array level are next best followed by NMR and simplex organizations. ROM reliability is shown to be more optimistic when masked faults are considered.

3.
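Reliability design of a network data recorder for air traffic control systems   Cited by: 1 (self-citations: 0, citations by others: 1)
To address the characteristics and shortcomings of the off-the-shelf recorders currently used in air traffic control (ATC) systems, a network data recorder that meets ATC requirements was developed in-house. The system uses hardware redundancy and software fault-tolerance techniques to achieve high reliability. Results show that the device runs stably and reliably in ATC systems.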

4.
Two tightly coupled multi-computer testbeds, one providing efficient inter-node communications tailored to the application and the other providing more flexible full connectivity among processors and memories, are used to support validation of the design techniques for distributed real-time systems. The testbeds are valuable tools for evaluating, analyzing, and studying the behavior of many algorithms for distributed systems. We have used the testbeds in studying the distributed recovery block scheme for handling hardware and software faults. A testbed has also been used to analyze database locking techniques and a fault-tolerant locking protocol for recovery from faults that occur during updating of replicated copies of files in tightly coupled distributed systems. Testbeds can be configured to represent the operating environments and input scenarios more accurately than software simulation. Therefore, testbed-based evaluation provides more accurate results than simulation and yields greater insight into the characteristics and limitations of proposed concepts. This is an important advantage in the complex field of distributed real-time system design evaluation and validation. Testbed-based experimentation is thus an effective approach to validating system concepts and design techniques for distributed systems for real-time applications.

5.
The testability of majority voting based fault-tolerant circuits is investigated and sufficient conditions for constructing circuits that are testable for all single and multiple stuck-at faults are established. The testability conditions apply to both combinational and sequential logic circuits and result in testable majority voting based fault-tolerant circuits without additional testability circuitry. Alternatively, the testability conditions facilitate the application of structured design for testability and Built-In Self-Test techniques to fault-tolerant circuits in a systematic manner. The complexity of the fault-tolerant circuit, when compared to the original circuit, can significantly increase test pattern generation time when using traditional automatic test pattern generation software. Therefore, two test pattern generation algorithms are developed for detecting all single and multiple stuck-at faults in majority voting based circuits designed to satisfy the testability conditions. The algorithms are based on hierarchical test pattern generation using test patterns for the original, non-fault-tolerant circuit and structural knowledge of the majority voting based design. Efficiency is demonstrated in terms of test pattern generation time and cardinality of the resulting set of test patterns when compared to traditional automatic test pattern generation software.
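
The testability problem this abstract addresses can be illustrated with a small sketch. It is not the paper's algorithm: the 4-bit `module`, the stuck-at fault in `faulty_module`, and the triplicated arrangement in `tmr` are all hypothetical choices made for illustration. The sketch shows that a majority voter hides a single stuck-at fault from the outputs for every input, which is why such a fault cannot be detected by observing the voted output alone.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote over three equally wide words."""
    return (a & b) | (a & c) | (b & c)


def module(x: int) -> int:
    """Hypothetical fault-free module: a 4-bit incrementer."""
    return (x + 1) & 0xF


def faulty_module(x: int) -> int:
    """The same module with output bit 0 stuck at 1 (assumed fault)."""
    return module(x) | 0x1


def tmr(x: int, m1, m2, m3) -> int:
    """Triplicated module followed by a majority voter."""
    return majority(m1(x), m2(x), m3(x))


# One faulty copy: the voted output equals the fault-free output for
# all 16 inputs, so the fault is masked and invisible at the output.
masked_single_fault = all(
    tmr(x, faulty_module, module, module) == module(x) for x in range(16)
)

# Two faulty copies with the same fault: the voter now follows the
# faulty majority, and the error becomes visible for some inputs.
visible_inputs = [
    x for x in range(16)
    if tmr(x, faulty_module, faulty_module, module) != module(x)
]
```

Here `masked_single_fault` evaluates to true, while `visible_inputs` is non-empty because two identical faults outvote the healthy copy.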

6.
The reliability and maintainability design criteria that were a part of large central-control communication systems have been combined to produce deferred maintenance concepts in fully distributed communication systems. Combining these concepts allows the achievement of a cost-effective life-cycle design for communication switching systems. In a hardware/software environment one should not separate hardware maintainability and software maintainability, nor separate the reliability and maintainability of the system. Reliability models are being developed that reflect a constant rate for transient faults and a decreasing rate for catastrophic faults. The relationship of software bugs to their number and type of manifestation is being defined. Designs for primarily non-attended system sites should include appropriate maintenance concepts in order to be cost effective. If two or more individual repairs can be made with each maintenance visit to a site, the total area maintenance staff can be reduced. Implementation of deferred maintenance techniques can raise the availability level of the services provided, especially for a distributed communication switching system. Several examples of practical techniques for developing deferred maintenance concepts are presented, and topics such as the manning versus non-manning of sites, the time of day effects on state diagrams, centralized maintenance, and computer modeling techniques are discussed. The current and potential maintainability concepts and analysis tools that are discussed in this paper can be used to develop cost-effective maintenance concepts as distributed systems become more prevalent.

7.
This paper presents an SRGM (software reliability growth model) for NVP (N-version programming) systems (NVP-SRGM) based on the NHPP (nonhomogeneous Poisson process). Although many papers have been devoted to modeling NVP-system reliability, most of them consider only the stable reliability, i.e., they do not consider the reliability growth in NVP systems due to continuous removal of faults from software versions. The model in this paper is the first reliability-growth model for NVP systems which considers the error-introduction rate and the error-removal efficiency. During testing and debugging, when a software fault is found, a debugging effort is devoted to removing this fault. Due to the high complexity of the software, this fault might not be successfully removed, and new faults might be introduced into the software. By applying a generalized NHPP model to the NVP system, a new NVP-SRGM is established, in which the multi-version coincident failures are well modeled. A simplified software control logic for a water-reservoir control system illustrates how to apply this new software reliability model. The s-confidence bounds are provided for system-reliability estimation. This software reliability model can be used to evaluate the reliability and to predict the performance of NVP systems. More application is needed to validate fully the proposed NVP-SRGM for quantifying the reliability of fault-tolerant software systems in a general industrial setting. As the first model of its kind in NVP reliability-growth modeling, the proposed NVP-SRGM can be used to overcome the shortcomings of the independent reliability model. It predicts the system reliability more accurately than the independent model and can be used to help determine when to stop testing, which is a key question in the testing and debugging phase of the NVP system-development life cycle.
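
For readers unfamiliar with NHPP-based reliability growth, the following minimal sketch may help. It does not implement the paper's generalized model, which also accounts for error introduction, imperfect removal, and coincident failures across versions. Instead it uses the simpler, well-known Goel-Okumoto mean value function m(t) = a(1 - e^(-bt)) for a single software version as a stand-in. The parameter names `a` (expected total faults) and `b` (per-fault detection rate) and all numeric values are assumptions for illustration.

```python
import math


def mean_failures(t: float, a: float, b: float) -> float:
    """Goel-Okumoto mean value function: expected failures observed by time t."""
    return a * (1.0 - math.exp(-b * t))


def reliability(x: float, t: float, a: float, b: float) -> float:
    """Probability of no failure in (t, t + x] after testing up to time t.

    For an NHPP this is exp(-(m(t + x) - m(t))).
    """
    return math.exp(-(mean_failures(t + x, a, b) - mean_failures(t, a, b)))


# Illustrative (assumed) parameters: 100 expected faults, detection rate 0.05.
a, b = 100.0, 0.05
early = reliability(10.0, 10.0, a, b)   # mission of length 10 after 10 units of testing
late = reliability(10.0, 100.0, a, b)   # same mission after 100 units of testing
```

Under this stand-in model `late` exceeds `early`, which is the reliability growth the abstract refers to: the longer faults are removed during testing, the more likely a fixed-length mission completes without failure.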

8.
Hardware redundancy has been used in the design of fault-tolerant digital systems. A synthesis of protective hardware redundancy techniques is proposed and a generalized reliability model suitable for many fault-tolerant configurations is developed. This model, to be called General Modular Redundancy (GMR), yields as particular cases several known models of redundant structures.
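
The idea of one formula covering several redundant structures can be illustrated with the standard k-out-of-n reliability expression. This is a generic textbook formula used as a stand-in, not the GMR model itself, and it assumes independent, identical modules with a perfect voter or switch. The function name `k_out_of_n` is hypothetical.

```python
from math import comb


def k_out_of_n(k: int, n: int, r: float) -> float:
    """Reliability of a system that works when at least k of n modules work.

    Assumes independent modules, each with reliability r, and a perfect
    voter or switching mechanism.
    """
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))


r = 0.9
simplex = k_out_of_n(1, 1, r)    # single module, no redundancy
parallel = k_out_of_n(1, 2, r)   # duplex: either module suffices
tmr = k_out_of_n(2, 3, r)        # triple modular redundancy with majority vote
```

Choosing `(k, n)` selects the structure: `(1, 1)` is a simplex system, `(1, n)` is a parallel system, and `(2, 3)` is triple modular redundancy.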

9.
In recent years both software and hardware techniques have been adopted to carry out reliable designs, aimed at autonomously detecting the occurrence of faults, to allow discarding erroneous data and possibly performing the recovery of the system. The aim of this paper is the introduction of a combined use of software and hardware approaches to achieve complete fault coverage in generic IP processors, with respect to SEU faults. Software techniques are preferably adopted to reduce the necessity and costs of modifying the processor architecture; since complete fault coverage cannot be achieved this way, partial hardware redundancy techniques are then introduced to deal with the remaining, uncovered faults. The paper presents the methodological approach adopted to achieve complete fault coverage, the proposed resulting architecture, and the experimental results gathered from the analysis of the fault injection campaigns.

10.
The reliability of a fault-tolerant circuit may be drastically impaired by the presence of maskable faults that never affect its functionality. Design for testability (DFT) techniques have to be applied to make maskable faults detectable. During the testing phase, traditional DFT schemes inhibit fault masking and/or activate additional observation/control paths through the circuit. Such schemes, however, do not enable on-line testing and cannot be applied to multilevel fault-tolerant circuits, where fault-masking is repeatedly performed inside the circuit. We propose a new approach to the design of testable fault-tolerant CMOS circuits that overcomes both limitations. Our approach is based on the use of IDDQ-checkable voters (ICVs) that enable a complete test of maskable faults of any multiplicity during normal operations.

11.
This paper describes an experimental tool to evaluate and support the development of fault-tolerant machines designed for aerospace motor drives. Aerospace applications involve essentially safety-critical systems which should be able to overcome hardware or software faults and therefore need to be fault tolerant. A way of achieving this is to introduce variable degrees of redundancy into the system by duplicating one or all of the operations within the system itself. Looking at motor drives, multiphase machines, such as multiphase brushless dc machines, are considered to be good candidates in the design of fault-tolerant aerospace motor drives. This paper introduces a multiphase two-level inverter using a flexible and reliable field-programmable gate-array/digital-signal-processor controller for data acquisition, motor control, and fault monitoring to study the fault tolerance of such systems.

12.
A survey of associative processing techniques is presented, together with a guide to the published literature in this field. Some familiarity with the basic concepts of associative processing is assumed. The references have been divided into four groups dealing with architectural concepts, hardware implementation, software considerations, and application areas. The discussion of architectural concepts consists of a classification of associative devices into four major categories (fully parallel, bit-serial, word-serial, and block-oriented) and an enumeration of techniques for dealing with multiple responses and hardware faults. With respect to hardware implementation, consideration is given to the basic operations implemented, hardware elements used (e.g., cryoelectrics, magnetic elements, and semiconductors), and physical characteristics such as speed, size, and cost. The discussion of software aspects of associative devices deals with synthesis of algorithms, programming problems, and software simulation. The application areas discussed include solution of some mathematical systems, radar signal processing, information storage and retrieval, and performance of certain control functions in computer systems.

13.
This paper addresses the information flow between devices and programs in computer integrated manufacturing systems. Specifically, it presents modeling techniques and methods for detecting the existence of message paths among hardware and software components and the upper bound on time delays along those message paths. The modeling technique can be used to analyze interoperability between hardware and software components in the system in initial design and specification. The modeling technique has three components: an object model to describe the message passing protocols between communicating components; a color timed Petri net to describe the dynamic behavior and state dependency within each individual component; and an object synthesis method that integrates the Petri nets of individual objects and message protocols between objects to describe the dynamics of the entire system. The graphical modeling can enhance communication among different groups involved in system design and the analytical method can provide component specifications. The use of the modeling technique and method in early system design can result in time and cost savings in system integration due to better communication, better component selection and early problem identification.

14.
The paper presents an evolutionary approach to the design of fault-tolerant VLSI (very large scale integrated) circuits using EHW (evolvable hardware). The EHW research area comprises a set of applications where GA (genetic algorithms) are used for the automatic synthesis and adaptation of electronic circuits. EHW is particularly suitable for applications requiring changes in task requirements and in the environment or faults, through its ability to reconfigure the hardware structure dynamically and autonomously. This capacity for adaptation is achieved via the use of GA search techniques. In our experiments, a fine-grained CMOS (complementary metal-oxide semiconductor) FPTA (field-programmable transistor array) architecture is used to synthesize electronic circuits. The FPTA is a reconfigurable architecture, programmable at the transistor level and specifically designed for EHW applications. The paper demonstrates the power of EA (evolutionary algorithms) to design analog and digital fault-tolerant circuits. It compares two methods to achieve fault-tolerant design, one based on fitness definition and the other based on population. The fitness approach defines, explicitly, the faults that the component can encounter during its life, and evaluates the average behavior of the individuals. The population approach, on the other hand, uses the implicit information of the population statistics accumulated by the GA over many generations. The paper presents experimental results obtained using both approaches for the synthesis of a fault-tolerant digital circuit (XNOR) and a fault-tolerant analog circuit (multiplier).

15.
Real-time systems are an important class of process control systems that need to respond to events under time constraints, or deadlines. Such systems may also be required to deliver service in spite of hardware or software faults in their components. This fault-tolerant characteristic is especially critical in systems whose failure can cause economic disaster and/or loss of lives. This paper reports recent research in the area of analytical modeling of the three major characteristics of real-time systems: timeliness, dependability, and external environmental dependencies. The paper starts with a brief introduction to analytical modeling frameworks such as Markov models and stochastic Petri nets. This is followed by an examination of advances in modeling response-time distributions, reliability, distributed messaging services, and software fault-tolerance in real-time systems.

16.
This paper uses a single model to analyze the effects of both hardware and software on system reliability. A unified model of hardware and software reliability is developed using Markov modeling. Then the effect of hardware and software failures is studied using the model. The model incorporates concepts from both hardware and software reliability modeling. Examples of both simplex (nonredundant) and redundant architectures are analyzed using the model.
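
As a rough illustration of what a combined hardware/software Markov model can look like, the sketch below compares a simplex and a duplex architecture. It is not the paper's model. It assumes constant failure rates `lam_h` (per hardware unit) and `lam_s` (for the single shared software), no repair, perfect switchover, and that any software failure brings the system down. All names and numbers are hypothetical.

```python
import math


def simplex_reliability(t: float, lam_h: float, lam_s: float) -> float:
    """One hardware unit running one software copy: either failure is fatal."""
    return math.exp(-(lam_h + lam_s) * t)


def duplex_state_probs(t: float, lam_h: float, lam_s: float):
    """State probabilities of a three-state Markov chain at time t.

    State 0: both hardware units up (leaves at rate 2*lam_h + lam_s).
    State 1: one hardware unit up (entered at rate 2*lam_h, left at lam_h + lam_s).
    State 2: system failed (absorbing).
    """
    p0 = math.exp(-(2.0 * lam_h + lam_s) * t)
    p1 = 2.0 * (math.exp(-(lam_h + lam_s) * t) - p0)
    return p0, p1, 1.0 - p0 - p1


# Assumed rates per hour and a 1000-hour mission.
lam_h, lam_s, t = 1e-4, 1e-5, 1000.0
p0, p1, p_fail = duplex_state_probs(t, lam_h, lam_s)
duplex_reliability = p0 + p1
```

In this sketch hardware redundancy raises reliability above the simplex value, but the shared software rate `lam_s` bounds the gain: no amount of hardware duplication pushes the result above `exp(-lam_s * t)`.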

17.
For a state-space time-delay system with linearly coupled input and output disturbances, a simultaneous state and disturbance estimation technique is developed. For a nonlinear state-space time-delay system with dependent input and output disturbances, a nonlinear estimator is also proposed to estimate system state and disturbance at the same time. The proposed estimator techniques are applied next to estimate system state and fault signal. Via actuator and/or sensor signal compensation, a simple and efficient fault-tolerant operation can be realized. In the developed design, no limitations on, or prior knowledge of, the considered faults are required. Moreover, identical actuator and/or sensor switches and control gain reconstruction are not necessary. Therefore, the proposed estimation and fault-tolerant scheme is economical and convenient in practical applications. After that, the design techniques are extended to the case of systems with a class of uncoupled input and output faults. Examples and simulations given show excellent signal estimation and fault-tolerant performance.

18.
Programs for calculating the reliability of fault-tolerant systems do not explicitly take into account the effect of faults in the hardware recovery mechanism. This paper shows via an example how to incorporate these failures into the fault-handling (coverage) model of CARE III. A simple fault-tolerant system is described. The required coverage parameters are determined and the reliability is calculated using the models in CARE III.
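
The effect of a coverage parameter can be shown with a minimal sketch. It does not reproduce CARE III or its fault-handling model. It assumes a hypothetical duplex system with constant per-unit failure rate `lam`, no repair, and a coverage probability `c` that the first failure is detected and recovered from. An uncovered first failure takes the whole system down.

```python
import math


def duplex_reliability(t: float, lam: float, c: float) -> float:
    """Reliability of a duplex system with imperfect coverage.

    From the state with two good units, a covered failure (rate 2*lam*c)
    leaves one unit running and an uncovered failure (rate 2*lam*(1 - c))
    fails the system. Solving the resulting Markov chain gives
    R(t) = 2c*exp(-lam*t) + (1 - 2c)*exp(-2*lam*t).
    """
    return 2.0 * c * math.exp(-lam * t) + (1.0 - 2.0 * c) * math.exp(-2.0 * lam * t)


# Assumed failure rate and mission time, evaluated at several coverage values.
lam, t = 1e-4, 1000.0
perfect = duplex_reliability(t, lam, 1.0)
typical = duplex_reliability(t, lam, 0.95)
none = duplex_reliability(t, lam, 0.0)
```

With `c = 1.0` the expression reduces to the usual parallel-system result, and with `c = 0.0` it reduces to a series system of two units, so faults in the recovery mechanism (which lower `c`) move the system between those two extremes.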

19.
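Li Xugang (李旭刚). 《电子测试》 (Electronic Test), 2010, (9): 6-9, 46
This paper presents the basic functions of the software for an integrated circuit (IC) test system and how its core parts are implemented. It first introduces a method for modeling the test system's hardware and then, building on that hardware model, describes the structure and design of the test software. It focuses on how flowchart test programs are converted into textual test programs and how the textual test programs are compiled. Users only need to draw a flowchart that follows the test flow to produce a test program and test an IC, which improves development efficiency. The software is loosely coupled to the hardware: when hardware is added to, removed from, or modified in the test system, the software keeps working without extensive changes, so it generalizes well.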

20.
《Microelectronics Reliability》2006,46(9-11):1421-1432
The topic of this paper is systems that must be designed such that no single fault can cause failure at the overall level. A methodology is presented for analysis and design of fault-tolerant architectures, where diagnosis and autonomous reconfiguration can replace high-cost triple redundancy solutions and still meet strict requirements for functional safety. The paper applies graph-based analysis of functional system structure to find a novel fault-tolerant architecture for an electrical steering system, where a dedicated AC-motor design and cheap voltage measurements ensure the ability to detect all relevant faults. The paper shows how active control reconfiguration can accommodate all critical faults, and the fault-tolerant abilities are demonstrated on warehouse truck hardware.
