1.
A New Real-Time Reliability Prediction Method for Dynamic Systems Based on On-Line Fault Prediction (cited by: 1; self-citations: 0; citations by others: 1)
Zhengguo Xu, Yindong Ji, Donghua Zhou 《Reliability, IEEE Transactions on》 2009, 58(3): 523-538
While a specific system is in use, its reliability will decrease gradually after the infant mortality period because of the components' degradation, or external attacks. Thus, reliability is a natural characteristic of a system's health, and can be used for condition monitoring & predictive maintenance. This paper introduces a new real-time reliability prediction method for dynamic systems which incorporates an on-line fault prediction algorithm. The factors that may reduce a system's reliability are modeled as an additive fault input to the system, and the fault is assumed to be varying linearly with time, approximately. The time-varying fault is roughly estimated based on a modified particle filtering algorithm at first. Then, as a time series, the fault estimate sequence is smoothed, and predicted by an exponential smoothing method. Mathematical analysis shows that the effects of the system, and measurement noises on the fault estimates are greatly reduced by exponential smoothing, which indicates that the comparatively high accuracy of the fault estimates & predictions is guaranteed. Based on the particle filtering & fault prediction results, the whole system's predictive reliability is computed through a Monte Carlo simulation strategy. The effectiveness of the proposed real-time reliability prediction method is validated by a computer simulation of a three-vessel water tank system.
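The smoothing-and-prediction step in the abstract above can be illustrated with a small sketch. The abstract does not say which exponential smoothing variant the authors use; the code below assumes Brown's double exponential smoothing, a common choice for series with an approximately linear trend. The function name, the `alpha` parameter, and the initialization with the first sample are illustrative assumptions, not details from the paper.

```python
def double_exp_smooth_forecast(series, alpha, horizon):
    """Forecast `horizon` steps ahead with Brown's double exponential smoothing.

    Assumption: the fault estimate sequence is roughly linear in time,
    as the abstract states; `alpha` in (0, 1) is a hypothetical tuning knob.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    s1 = s2 = series[0]  # initialize both smoothers with the first sample
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1   # first smoothing pass
        s2 = alpha * s1 + (1 - alpha) * s2  # second pass over the smoothed series
    level = 2 * s1 - s2                      # lag-corrected current level
    trend = alpha / (1 - alpha) * (s1 - s2)  # estimated slope per step
    return level + horizon * trend


# Usage: a noise-free ramp standing in for a sequence of fault estimates.
ramp = [0.5 * k for k in range(200)]
forecast = double_exp_smooth_forecast(ramp, alpha=0.3, horizon=5)
```

A smaller `alpha` suppresses more noise but reacts more slowly to a change in the fault's slope, which is the usual trade-off when tuning such a smoother.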
2.
This paper presents a new method for incorporating imperfect FC (fault coverage) into a combinatorial model. Imperfect FC, the probability that a single malicious fault can thwart automatic recovery mechanisms, is important to accurate reliability assessment of fault-tolerant computer systems. Until recently, it was thought that the consideration of this probability necessitated a Markov model rather than the simpler (and usually faster) combinatorial model. SEA, the new approach, separates the modeling of FC failures into two terms that are multiplied to compute the system reliability. The first term, a simple product, represents the probability that no uncovered fault occurs. The second term comes from a combinatorial model which includes the covered faults that can lead to system failure. This second term can be computed from any common approach (e.g. fault tree, block diagram, digraph) which ignores the FC concept by slightly altering the component-failure probabilities. The result of this work is that reliability engineers can use their favorite software package (which ignores the FC concept) for computing reliability, and then adjust the input and output of that program slightly to produce a result which includes FC. This method applies to any system for which: the FC probabilities are constant and state-independent; the hazard rates are state-independent; and an FC failure leads to immediate system failure.
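The two-term structure described above can be sketched as follows. The abstract does not give the adjustment formulas, so this sketch assumes the natural conditional-probability form: with failure probability q and coverage c per component, the uncovered-failure probability is q(1 - c), and the adjusted input to the coverage-unaware model is qc / (1 - q(1 - c)). The parallel-system model and all function names are illustrative assumptions; a brute-force enumeration is included only as a cross-check.

```python
from itertools import product


def parallel_reliability(q):
    """Coverage-unaware combinatorial model: fails only if every component fails."""
    prob_all_fail = 1.0
    for qi in q:
        prob_all_fail *= qi
    return 1.0 - prob_all_fail


def sea_reliability(q, c, combinatorial_model):
    """Multiply P(no uncovered fault) by the model evaluated on adjusted inputs.

    q[i]: failure probability of component i.
    c[i]: coverage, i.e. probability that a failure of component i is covered.
    """
    p_no_uncovered = 1.0
    adjusted = []
    for qi, ci in zip(q, c):
        uncovered = qi * (1.0 - ci)
        p_no_uncovered *= 1.0 - uncovered
        # Failure probability conditioned on "no uncovered fault in this component".
        adjusted.append(qi * ci / (1.0 - uncovered))
    return p_no_uncovered * combinatorial_model(adjusted)


def brute_force_parallel(q, c):
    """Enumerate states (0 = working, 1 = covered failure, 2 = uncovered failure)."""
    total = 0.0
    for states in product((0, 1, 2), repeat=len(q)):
        if 2 in states or all(s == 1 for s in states):
            continue  # any uncovered fault, or all components down, fails the system
        p = 1.0
        for s, qi, ci in zip(states, q, c):
            p *= (1.0 - qi) if s == 0 else qi * ci
        total += p
    return total


q = [0.1, 0.2, 0.05]
c = [0.9, 0.95, 0.8]
r_sea = sea_reliability(q, c, parallel_reliability)
r_check = brute_force_parallel(q, c)
```

The point of the separation is that `combinatorial_model` can be any existing solver that knows nothing about coverage; only its inputs and its output are adjusted.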
3.
This paper discusses the operation of a chemical reactor and particularly the need for highly reliable instrumentation and control for critical processes. It describes the use of fault tolerance for the computer control to provide a level of reliability previously unachievable with standard computer control systems. The provision of highly reliable interface equipment to the reactor itself is described and approaches are presented for solving the problem of faults in the sensors and actuators. The paper discusses a specific example chemical reactor and the benefits that are achievable using a fault tolerant control computer system.
4.
Fault-tree analysis: a knowledge-engineering approach (cited by: 1; self-citations: 0; citations by others: 1)
This paper deals with the application of knowledge engineering and a methodology for the assessment and measurement of reliability, availability, maintainability, and safety of industrial systems using fault-tree representation. Object oriented structures, production rules representing the expert's heuristics, algorithms, and database structures are the basic elements of the system. The blackboard architecture of the system supports qualitative and quantitative evaluation of the fault tree. A fuzzy set approach analyzes problems with few failure data or much fuzziness or imprecision. Fault-tree analysis is a knowledge acquisition structure that has been extensively explored by knowledge engineers. Reliability engineers can apply the techniques developed by this area of computer science to: (1) improve the data acquisition process; (2) explore the benefits of object oriented expert systems for reliability applications; (3) integrate the several sources of knowledge into a unique system; (4) explore approximate reasoning to handle uncertainty; and (5) develop hybrid solution strategies combining expert heuristics, conventional procedures, and available failure data.
5.
Yinong Chen, Tinghuai Chen 《Reliability, IEEE Transactions on》 1990, 39(2): 217-225
An NMRC (N-modular redundancy with comparison) system is presented. It surpasses all existing NMR systems in fault tolerance capability. The extra logic needed by NMRC is simpler than that of the other NMR systems. The relation between the interconnection topology and the fault tolerance capability of the NMRC system is investigated. Three types of optimal NMRC systems and their characterization and structure are studied. NMRC can be viewed as a diagnosable system. The comparison approach is applied to t_{1/s}-diagnosable systems, whereas previously it had been applied only to t_0- and t_1-diagnosable systems. A laboratory 3-MRC system has been built as a node computer for a fault-tolerant multicomputer system for industrial process control. The test results confirm the high reliability and effectiveness of NMRC.
6.
Fault-tolerant control system of flexible arm for sensor fault by using reaction force observer (cited by: 1; self-citations: 0; citations by others: 1)
In recent years, control system reliability has received much attention as computer-controlled systems such as robot control systems are used in more and more situations. To improve reliability, control systems need to be able to detect a fault (fault detection) and to maintain stability and control performance (fault tolerance). In this paper, we address the vibration suppression control of a one-link flexible arm robot. Vibration suppression is realized by feeding back, in addition to the motor position, the signal of a strain gauge sensor attached to the arm. However, a sensor fault (e.g., disconnection) may degrade the control performance and, at worst, make the control system unstable. We therefore propose a fault-tolerant control system for strain gauge sensor faults. The proposed control system estimates the strain gauge sensor signal based on the reaction force observer and detects the fault by monitoring the estimation error. After fault detection, the proposed control system exchanges the faulty sensor signal for the estimated one and switches to a fault-mode controller so as to maintain stability and control performance. We apply the proposed control system to the vibration suppression control system of a one-link flexible arm robot and confirm its effectiveness experimentally.
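The detect-then-substitute logic described above can be sketched in a few lines. This is only an illustration of residual monitoring with a latched fault flag; the fixed threshold, the latching behaviour, and the function name are assumptions, and neither the reaction force observer nor the fault-mode controller from the paper is modelled here.

```python
def select_feedback(measured, estimated, threshold, fault_latched=False):
    """Pick the feedback signal for one control step.

    measured:  strain gauge reading (possibly faulty).
    estimated: observer-based estimate of the same signal.
    threshold: hypothetical bound on the acceptable estimation error.
    Returns (signal_to_use, fault_latched).
    """
    if not fault_latched and abs(measured - estimated) > threshold:
        fault_latched = True  # once a fault is detected, keep using the estimate
    return (estimated if fault_latched else measured), fault_latched


# Usage: the sensor disconnects (reads 0.0) from the third sample onward.
measured = [1.0, 1.1, 0.0, 0.0]
estimated = [1.0, 1.05, 1.1, 0.05]
latched = False
used = []
for m, e in zip(measured, estimated):
    signal, latched = select_feedback(m, e, threshold=0.5, fault_latched=latched)
    used.append(signal)
```

Latching the flag avoids switching back to a disconnected sensor when its reading happens to fall near the estimate again.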
7.
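Reliability is one of the key performance metrics for evaluating fault-tolerant computers. Evaluating system reliability is important in both the design and the implementation stages of a computer system, and fault injection is a commonly used method for reliability evaluation. Building on general-purpose JTAG debugging technology, this paper describes a hardware fault injection tool targeting CPUs and validates it through simulation experiments. The tool is based on an IEEE standard, so fault injection can be carried out as long as the boundary-scan chain of the target chip is known. In addition, fault injection into the target system is performed by hardware and is transparent to the operating system, so the tool can effectively get past the operating system's protection mechanisms.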
8.
《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》 1978, 66(10): 1160-1177
Many computer systems have turned increasingly to control systems, requiring more sophisticated machinery over an ever-widening range. The reliability of the system should be carefully considered in all its aspects. The concept of a control segment is introduced to guard against local malfunctions or failures and to save the system from going down. This paper also describes a fault tolerant control computer system with dual operation, aimed at a fail-safe control output and at complex operation features for maintenance and improvement of the system. Three "off-the-shelf" computers are connected symmetrically by three special hardware units and configuration programs, so that the configuration is symmetric in software as well as in hardware. Degradation techniques in software and hardware are introduced to prevent the system from stopping. The actual running record shows 99.99 percent operating availability over three years for the railway command and control system described here.
9.
A systematic approach to constructing fault trees for advanced process control systems is presented in this paper. For illustration purposes, the proposed method is explained with a specific feedback scheme, i.e., the cascade control strategy. The digraph configuration of a standard system is first described and analyzed in detail. On the basis of a series of qualitative simulation studies, all failure mechanisms can be identified and summarized with a set of generalized fault-tree structures. The fault trees produced with the conventional digraph-based techniques are shown to be not as comprehensive as the ones constructed with the proposed approach. To demonstrate the correctness of our analysis, the successful application of the proposed structures to a heat exchange process is presented. In addition, the resulting fault tree is compared with one obtained from a single-loop feedback control system and the trade-off between the two in system reliability and control performance is assessed accordingly.
10.
Real-time computer systems are often used in harsh environments, such as aerospace, and in industry. Such systems are subject to many transient faults while in operation. Checkpointing enables a reduction in the recovery time from a transient fault by saving intermediate states of a task in a reliable storage facility, and then, on detection of a fault, restoring from a previously stored state. The interval between checkpoints affects the execution time of the task. Whereas inserting more checkpoints and reducing the interval between them reduces the reprocessing time after faults, checkpoints have associated execution costs, and inserting extra checkpoints increases the overall task execution time. Thus, a trade-off between the reprocessing time and the checkpointing overhead leads to an optimal checkpoint placement strategy that optimizes certain performance measures. Real-time control systems are characterized by a timely, and correct, execution of iterative tasks within deadlines. The reliability is the probability that a system functions according to its specification over a period of time. This paper reports on the reliability of a checkpointed real-time control system, where any errors are detected at the checkpointing time. The reliability is used as a performance measure to find the optimal checkpointing strategy. For a single-task control system, the reliability equation over a mission time is derived using the Markov model. Detecting errors at the checkpointing time makes reliability jitter with the number of checkpoints. This forces the need to apply other search algorithms to find the optimal number of checkpoints. By considering the properties of the reliability jittering, a simple algorithm is provided to find the optimal checkpoints effectively. Finally, the reliability model is extended to include multiple tasks by a task allocation algorithm.
11.
Ching-Yao Chan 《Vehicular Technology, IEEE Transactions on》 2002, 51(1): 180-193
Safety systems for ground vehicles are deployed in different phases according to the timing of activation relative to the instant at which an accident occurs. Collision warning or avoidance systems function prior to an accident, while occupant protection systems act during a collision to mitigate the damage or injuries caused by an accident. This paper deals with crash sensing systems that detect a collision and evaluate the severity of a crash. One major application of these sensing systems is their current use in occupant restraint systems. They may also be utilized in the future for advanced vehicle control and safety systems. With air bags becoming standard equipment in new passenger vehicles, crash sensing technologies have advanced considerably. Yet, existing challenges and new innovations continue to demand improvements in their functions. This paper focuses on the system performance of crash sensing systems. The purpose of this paper is to propose a framework for addressing various design issues from both a component level and a system perspective. Through the discussions of crash data analysis, the design concepts of crash sensors are highlighted. The characteristics of representative mechanical and electronic sensors are analyzed and the guidelines for sensor selection to meet design requirements are discussed. Also, an assessment of sensor reliability is reviewed with various system architectures. Finally, suggestions are made to enhance system performance in areas that may benefit from the addition of sensing functionality. The public has become keenly aware of the importance of transportation safety. With more advanced technologies introduced into ground vehicles, the safety concerns will intensify. The demand for a friendly driving environment and vehicle interior will further promote the requirements of vehicular safety systems. Crash sensing will remain a challenging and active area for years to come.
12.
The assessment of bit error rate (BER) performance of a digital communication system via computer simulation has traditionally been done using the Monte Carlo method. For very low BER, this method requires excessive computer time. This time can be substantially reduced by using extrapolation based on importance sampling (IS). In applying IS to a complex system, many considerations must be addressed, chief among which is the reliability (variance) of the estimator as a function of the system particulars. We discuss a number of these considerations and, specifically, derive a number of expressions for the variance. We find that the variance improvement may be severely limited by the dimensionality (or memory) of the system. We describe a means for circumventing this limitation through the definition of a statistically equivalent impulse response. For a linear system, this amounts to the ordinary impulse response. The simulation can be structured to estimate the equivalent impulse response using statistical regression. This new approach has been implemented and found to yield significant runtime improvement over conventional importance sampling for linear systems of large dimensionality. We believe this technique will work also for mildly nonlinear systems, as might be encountered in typical satellite communications.
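The motivation for importance sampling in the abstract above can be seen in a one-dimensional sketch. It estimates a very small Gaussian tail probability, standing in for a low BER in a memoryless channel, by sampling from a mean-shifted density and reweighting each sample. This is plain mean-shift importance sampling chosen for illustration; it is not the equivalent-impulse-response technique the paper proposes, and the threshold, sample count, and function names are assumptions.

```python
import math
import random


def q_function(t):
    """Exact tail probability P(X > t) for X ~ N(0, 1), used as a reference."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def is_tail_estimate(t, n, rng):
    """Estimate P(X > t) by drawing from N(t, 1) and reweighting each hit."""
    total = 0.0
    for _ in range(n):
        y = rng.gauss(t, 1.0)  # biased density centred on the error threshold
        if y > t:
            # Likelihood ratio phi(y) / phi(y - t) = exp(-t*y + t^2/2).
            total += math.exp(-t * y + 0.5 * t * t)
    return total / n


rng = random.Random(0)
threshold = 5.0  # tail probability on the order of 1e-7
estimate = is_tail_estimate(threshold, 20000, rng)
reference = q_function(threshold)
```

With unbiased Monte Carlo, 20000 samples would almost never produce a single error event at this threshold, whereas about half of the shifted samples land in the error region and contribute a weighted count.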
13.
14.
Metrics are commonly used in engineering as measures of the performance of a system for a given attribute. For instance, in the assessment of fault tolerant systems, metrics such as the reliability, R(t), and the Mean Time To Failure (MTTF) are well-accepted as a means to quantify the fault tolerant attributes of a system with an associated failure rate, λ. Unfortunately, there does not seem to be a consensus on comparable metrics to use in the assessment of safety-critical systems. The objective of this paper is to develop two metrics that can be used in the assessment of safety-critical systems, the steady-state safety, S_ss, and the Mean Time To Unsafe Failure (MTTUF). S_ss represents the evaluation of the safety as a function of time, in the limiting case as time approaches infinity. The MTTUF represents the average or mean time that a system will operate safely before a failure that produces an unsafe system state. A 3-state Markov model is used to model a safety-critical system with the transition rates computed as a function of the system coverage, C_sys, and the hazard rate, λ(t). Also, λ(t) is defined by the Weibull distribution, primarily because it allows one to easily represent the scenarios where the failure rate is increasing, decreasing, and constant. The results of the paper demonstrate that conservative estimates for lower bounds for both S_ss and the MTTUF result when C_sys is assumed to be a constant regardless of the behavior of λ(t). The derived results are then used to evaluate three example systems.
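The remark about the Weibull distribution covering increasing, decreasing, and constant failure rates can be made concrete with the standard two-parameter formulas. The sketch below uses the textbook hazard rate, reliability, and MTTF for a Weibull distribution with shape and scale parameters; the parameter names and the example values are assumptions, and the paper's 3-state Markov model, S_ss, and MTTUF are not reproduced here.

```python
import math


def weibull_hazard(t, shape, scale):
    """Hazard rate lambda(t) = (shape/scale) * (t/scale)**(shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_reliability(t, shape, scale):
    """Reliability R(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


def weibull_mttf(shape, scale):
    """Mean time to failure: scale * Gamma(1 + 1/shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


# shape < 1: decreasing hazard; shape == 1: constant; shape > 1: increasing.
scale_hours = 1000.0  # hypothetical characteristic life
for shape in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, shape, scale_hours)
    late = weibull_hazard(200.0, shape, scale_hours)
    trend = "constant" if math.isclose(early, late) else (
        "increasing" if late > early else "decreasing")
    print(shape, trend, weibull_mttf(shape, scale_hours))
```

With shape equal to 1 the model reduces to the exponential case, where the hazard rate is the constant 1/scale and the MTTF equals the scale parameter.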
15.
This article presents a computerized reliability analysis tool for large control systems. It also introduces a new dynamic representation of system structure, which makes it possible to model the physical system only once for any number of control tasks. The algorithm for computing minimal cut sets for the control tasks has been developed and automated. The result is RELVEC, an interactive computer program that performs reliability/availability calculation, sensitivity analysis and critical component identification. It can handle two repair policies and common mode failures. Reconfiguring the physical system or the control tasks is simple. RELVEC is becoming an everyday tool in control system reliability analysis at VTT.
16.
17.
We suggest a computationally efficient and flexible strategy for assessment of reliability of integrated circuits. The proposed concept of hierarchical reliability analysis relies on doing reliability assessments during the design and layout process [reliability computer aided design (RCAD)]. Design rules are suggested based on calculations of steady-state mechanical stresses built up in interconnect graphs and trees due to electromigration. These design rules identify a large fraction of interconnect graphs in a typical design as immune to electromigration-induced failure. The stated design rules are an extension of the Blech-length concept to interconnect graphs. Our suggested new strategy will have important implications for design and layout processes as design limits for a given technology are reached.
18.
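Today's space computers must combine high-speed processing under hard real-time constraints with high reliability in autonomous operation. For long-life satellites, the reliability requirements are such that no single-machine structure of any kind is adequate, so a wide variety of redundancy schemes have been built into on-board computer design. Purposefully identifying and selecting a structure that achieves the greatest possible fault tolerance with limited resources while still delivering the required performance is the goal of this paper. It describes a modular fault-tolerant architecture that uses a simple redundant internal bus to tightly couple redundant modules with different functions, so that system-level performance can be extended easily and functions can be flexibly centralized or distributed. The architecture thereby meets the demands of space computing and control as well as the fault-tolerance requirements.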
19.