1.

Software dependability models under memory faults with application to a digital system in nuclear power plants

Jong Gyun Choi Poong Hyun Seong 《Reliability Engineering & System Safety》1998,59(3):321-329

In this work, software dependability under memory faults in the operational phase is predicted by two models: an analytic model and the stochastic activity network (SAN) model. The analytic model is based on simple reliability theory and graph theory, and represents the software as a graph composed of nodes and arcs. Through proper transformation, the graph can be reduced to a simple two-node graph from which software reliability can be derived. The SAN model permits the representation of concurrency, timeliness, fault tolerance, and degradable performance of the system and provides a means for determining the stochastic behavior of software. Using these models, we predict the reliability of application software in a digital system, the Interposing Logic System (ILS), in a nuclear power plant and show the sensitivity of software reliability to the major physical parameters that affect software failure in the normal operation phase. It is found that the effects of hardware faults on software failure should be considered for the accurate prediction of software dependability in the operation phase.

2.

A method for evaluating fault coverage using simulated fault injection for digitalized systems in nuclear power plants (total citations: 1; self-citations: 0; citations by others: 1)

Suk Joon Kim Poong Hyun Seong Jun Seok Lee Man Cheol Kim Hyun Gook Kang Seung Cheol Jang 《Reliability Engineering & System Safety》2006,91(5):614-623

The fault coverage for digital systems in nuclear power plants is evaluated using a simulated fault injection method. Digital systems have numerous advantages, such as the sharing of hardware elements and hardware replication of the needed number of independent channels. However, the application of digital systems to safety-critical systems in nuclear power plants has been limited due to reliability concerns. Among the reliability issues, fault coverage is one of the most important factors. In this study, we propose an evaluation method of the fault coverage for safety-critical digital systems in nuclear power plants. The system under assessment is a local coincidence logic processor for a digital plant protection system at Ulchin nuclear power plant units 5 and 6. The assessed system is simplified and then a simulated fault injection method is applied to evaluate the fault coverage of two fault detection mechanisms. From the simulated fault injection experiment, the fault detection coverage of the watchdog timer is 44.2% and that of the read only memory (ROM) checksum is 50.5%. Our experiments show that the fault coverage of a safety-critical digital system is effectively quantified using the simulated fault injection method.
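
As a rough illustration of how a coverage figure can be obtained from a fault injection campaign, the sketch below computes the fraction of injected faults that a mechanism detected, together with a normal-approximation standard error. The function name `estimate_coverage` and the toy outcome list are assumptions made for illustration; they are not taken from the paper, whose injection counts are not given in the abstract.

```python
import math

def estimate_coverage(detected_flags):
    """Return (coverage, standard_error) for a list of per-fault outcomes.

    detected_flags: iterable of booleans, True if the injected fault was
    detected by the mechanism under study (hypothetical input format).
    """
    flags = list(detected_flags)
    n = len(flags)
    if n == 0:
        raise ValueError("at least one injected fault is required")
    coverage = sum(1 for d in flags if d) / n
    # Binomial standard error under a normal approximation.
    std_err = math.sqrt(coverage * (1.0 - coverage) / n)
    return coverage, std_err

# Toy campaign: four injected faults, three detected (illustrative only).
coverage, std_err = estimate_coverage([True, True, True, False])
```

With real campaigns the number of injected faults would be far larger, which narrows the standard error accordingly.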

3.

Fault curves—A novel drift-fault diagnosing method for condition monitoring

P. R. Drake J. H. Williams 《Quality and Reliability Engineering International》1991,7(2):89-98

A drift-fault diagnosing method is presented based on fault curves fitted to changes in a single characteristic of a system performance curve such as the peak of the step response. The method consists of fitting a fault curve, for each potential fault in the system, to samples of the characteristic produced under different levels of the faults. Unknown faults are then diagnosed, irrespective of severity, by seeing which fault curve is closest to the characteristic produced by the system under test. It is argued that this method is very appropriate for diagnosing faults detected by condition monitoring systems since these will be drift faults, and because the method can be applied to normal operational signals without the need to inject special test signals.

4.

Reliability analysis of fault tolerant memory arrays

Joel A. Nachlas 《Quality and Reliability Engineering International》1985,1(3):191-194

A model is developed to represent computer memory module reliability as a function of memory array reliability under a fault tolerant design. The fault tolerance feature of the array actually results from a revision in the use of the array so that with respect to some failure modes, the array becomes a K out of N rather than a series system. The model is used to determine array reliability under fault tolerance. The ratio of module reliability under fault tolerance to that without this feature is used as a measure of the benefits of revising array use. A key feature of the analysis is the fact that not all faults can be tolerated. The elemental memory devices examined conform to a decreasing Weibull hazard model. Consequently, evaluation of the general model for the K out of N system realized must be done numerically. However, for the special case in which K=N-1, a closed form expression for the performance measure is obtained. This special case occurs for the application of interest and it is shown that the performance measure always exceeds one and depends directly upon the proportion of faults that can be tolerated. Thus the value of fault tolerance is shown to depend upon the extent to which the array will tolerate faults. This provides a basis for deciding whether or not fault tolerance should be implemented.
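
The contrast between a series system and a K out of N system can be made concrete with the standard textbook formulas. The sketch below is not the paper's model: it assumes independent, identical devices and that every fault is tolerable, whereas the abstract stresses that only a proportion of faults can be tolerated. The function names and the parameter values are illustrative assumptions.

```python
from math import comb, exp

def weibull_reliability(t, scale, shape):
    """Device reliability under a Weibull model; shape < 1 gives a decreasing hazard."""
    return exp(-((t / scale) ** shape))

def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n independent devices (each with reliability p) work."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Illustrative values only (not from the paper).
n = 4
p = weibull_reliability(t=100.0, scale=5000.0, shape=0.5)

series = k_out_of_n_reliability(n, n, p)        # equals p**n
tolerant = k_out_of_n_reliability(n - 1, n, p)  # K = N - 1 case
closed_form = n * p ** (n - 1) - (n - 1) * p**n  # closed form for K = N - 1
improvement_ratio = tolerant / series            # at least 1 under these assumptions
```

Under these simplified assumptions the K = N - 1 case reduces to `n * p**(n-1) - (n-1) * p**n`, which is why a closed form is available for it while general K requires summing the binomial terms.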

5.

Reliability measurement of mass storage system for onboard instrumentation

Minsu Choi Park N. Piuri V. Lombardi F. 《IEEE transactions on instrumentation and measurement》2005,54(6):2297-2304

Advances in spaceborne vehicular technology have made possible the long-life duration of the mission in harsh cosmic environments. Reliability and data integrity are the commonly emphasized requirements of spaceborne solid-state mass storage systems, because faults due to the harsh cosmic environments, such as extreme radiation, can be experienced throughout the mission. Acceptable dependability for these instruments has been achieved by using redundancy and repair. Reconfiguration (repair) of memory arrays using spare memory lines is the most common technique for reliability enhancement of memories with faults. Faulty cells in memory arrays are known to show spatial locality. This physical phenomenon is referred to as fault clustering. This paper initially investigates a quadrat-based fault model for memory arrays under clustered faults to establish a reliable foundation of measurement. Then, lifelong dependability of a fault-tolerant spaceborne memory system with hierarchical active redundancy, which consists of spare columns in each memory module and redundant memory modules, is measured in terms of the reliability (i.e., the conditional probability that the system performs correctly throughout the mission) and mean-time-to-failure (i.e., the expected time that a system will operate before it fails). Finally, a minimal column redundancy search technique for the fault-tolerant memory system is proposed and verified through a series of parametric simulations. Thereby, design and fabrication of cost-effective and highly reliable fault-tolerant onboard mass storage systems can be realized for dependable instrumentation.

6.

A survey of software dependability

V. V. S. Sarma 《Sadhana》1987,11(1-2):23-48

This paper presents an overview of the issues in precisely defining, specifying and evaluating the dependability of software, particularly in the context of computer controlled process systems. Dependability is intended to be a generic term embodying various quality factors and is useful for both software and hardware. While the developments in quality assurance and reliability theories have proceeded mostly in independent directions for hardware and software systems, we present here the case for developing a unified framework of dependability, a facet of operational effectiveness of modern technological systems, and develop a hierarchical systems model helpful in clarifying this view. In the second half of the paper, we survey the models and methods available for measuring and improving software reliability. The nature of software “bugs”, the failure history of the software system in the various phases of its lifecycle, the reliability growth in the development phase, estimation of the number of errors remaining in the operational phase, and the complexity of the debugging process have all been considered to varying degrees of detail. We also discuss the notion of software fault-tolerance, methods of achieving the same, and the status of other measures of software dependability such as maintainability, availability and safety.

7.

Research on a coupled-fault diagnosis method for rotating machinery based on EMD and fractal box dimension

Han Dongying (韩东颖) Li Geng (李庚) Shi Peiming (时培明) 《振动与冲击》 (Journal of Vibration and Shock) 2013,32(15):209-214

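To address the diagnosis of coupled faults in rotating machinery, a diagnostic method based on EMD (Empirical Mode Decomposition) and fractal box dimension is proposed. The method combines the adaptivity of EMD in processing nonlinear signals with the ability of the fractal box dimension to describe nonlinear behavior quantitatively. The fault signal is first processed by EMD to obtain intrinsic mode functions (IMFs) that contain the fault features; the box dimension of each IMF is then computed, and faults are diagnosed by comparing and analyzing the box dimensions. A dynamic model of a rotor-bearing system with coupled crack, rub-impact, and looseness faults is constructed, and the vibration signals of the fault model are obtained with the Runge-Kutta method. Analysis of the coupled fault signals yields coupled-fault feature vectors, and a comparison with the traditional marginal spectrum diagnosis method demonstrates the effectiveness and superiority of the proposed method for diagnosing coupled faults in rotating machinery.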
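
The box dimension used in this entry can be estimated by box counting. The sketch below is a generic box-counting estimator for a set of 2-D points (for example, samples `(t, x(t))` of a signal), not the authors' implementation; the EMD step is omitted, and the function name, the grid sizes in `scales`, and the normalization to a unit square are assumptions made for illustration.

```python
import math

def box_counting_dimension(points, scales=(2, 4, 8, 16, 32)):
    """Estimate the box dimension of 2-D points by a log-log least-squares fit.

    points: sequence of (x, y) pairs; they are rescaled to the unit square.
    scales: number of grid cells per axis at each resolution (assumed values).
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    span_x = (max(xs) - x0) or 1.0  # avoid division by zero for flat data
    span_y = (max(ys) - y0) or 1.0

    log_pairs = []
    for m in scales:
        occupied = set()
        for x, y in points:
            i = min(int((x - x0) / span_x * m), m - 1)
            j = min(int((y - y0) / span_y * m), m - 1)
            occupied.add((i, j))
        # Box side is 1/m, so log(1/side) = log(m).
        log_pairs.append((math.log(m), math.log(len(occupied))))

    # Slope of log N(m) versus log m is the dimension estimate.
    mean_a = sum(a for a, _ in log_pairs) / len(log_pairs)
    mean_b = sum(b for _, b in log_pairs) / len(log_pairs)
    num = sum((a - mean_a) * (b - mean_b) for a, b in log_pairs)
    den = sum((a - mean_a) ** 2 for a, _ in log_pairs)
    return num / den

# Sanity checks on shapes with known dimension.
line = [(i / 9999, i / 9999) for i in range(10000)]
square = [(i / 63, j / 63) for i in range(64) for j in range(64)]
line_dim = box_counting_dimension(line)
square_dim = box_counting_dimension(square)
```

In the scheme described in the abstract, an estimator of this kind would be applied to each IMF separately and the resulting values compared across fault conditions.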

8.

Reliability assessment of embedded digital system using multi-state function (total citations: 1; self-citations: 0; citations by others: 1)

Jong Gyun Choi Poong Hyun Seong 《Reliability Engineering & System Safety》2006,91(3):261-269

This work describes a combinatorial model for estimating the reliability of an embedded digital system by means of a multi-state function. This model includes a coverage model for fault-handling techniques implemented in digital systems. The fault-handling techniques make it difficult for many types of components in a digital system to be treated as binary state, good or bad. The multi-state function provides a complete analysis of multi-state systems, which is how digital systems can be regarded. Through adaptation of the software operational profile flow to the multi-state function, the HW/SW interaction is also considered in estimating the reliability of the digital system. Using this model, we evaluate the reliability of one board controller in a digital system, the Interposing Logic System (ILS), which is installed in YGN nuclear power units 3 and 4. Since the proposed model is a generalized combinatorial model, its simplification yields the conventional model that treats the system as binary state. This modeling method is particularly attractive for embedded systems in which small sized application software is implemented, since applying it to systems with large software would require very laborious work.

9.

An analysis of safety-critical digital systems for risk-informed design

Hyun Gook Kang Taeyong Sung 《Reliability Engineering & System Safety》2002,78(3)

This paper quantitatively presents the results of a case study which examines the fault tree analysis framework of the safety of digital systems. The case study is performed for the digital reactor protection system of nuclear power plants. The broader usage of digital equipment in nuclear power plants gives rise to the need for assessing safety and reliability because it plays an important role in proving the safety of a designed system in the nuclear industry. We quantitatively explain the relationship between the important characteristics of digital systems and the PSA result using mathematical expressions. We also demonstrate the effect of critical factors on system safety through a sensitivity study. The result, quantified using the fault tree method, shows that some factors remarkably affect system safety: common cause failure, the coverage of fault tolerant mechanisms, and software failure probability.

10.

Using run-time reconfiguration for fault injection applications (total citations: 1; self-citations: 0; citations by others: 1)

Antoni L. Leveugle R. Feher B. 《IEEE transactions on instrumentation and measurement》2003,52(5):1468-1473

11.

An efficient BIST method for non-traditional faults of embedded memory arrays

Jone W.-B. Der-Chen Huang Das S.R. 《IEEE transactions on instrumentation and measurement》2003,52(5):1381-1390

In this work, a built-in self-testing (BIST) method is proposed to detect nontraditional faults of embedded memory arrays for a system-on-chip (SoC) design. The nontraditional faults include single-cell read-sensitive faults and read coupling faults. The BIST method can efficiently deal with embedded memory arrays spatially distributed on the entire SoC chip. The concept of redundant read-write operations is applied to detect all embedded memory arrays with different sizes simultaneously. The redundant operations do not affect the fault coverage of all nontraditional faults discussed in this paper. The method has the advantages of low hardware overhead, short test time, and high fault coverage for nontraditional memory defects.

12.

Fault-tolerant embedded system design and optimization considering reliability estimation uncertainty (total citations: 1; self-citations: 0; citations by others: 1)

Naruemon Wattanapongskorn David W. Coit 《Reliability Engineering & System Safety》2007,92(4):395-407

In this paper, we model embedded system design and optimization, considering component redundancy and uncertainty in the component reliability estimates. The systems being studied consist of software embedded in associated hardware components. Very often, component reliability values are not known exactly. Therefore, for reliability analysis studies and system optimization, it is meaningful to consider component reliability estimates as random variables with associated estimation uncertainty. In this new research, the system design process is formulated as a multiple-objective optimization problem to maximize an estimate of system reliability, and also, to minimize the variance of the reliability estimate. The two objectives are combined by penalizing the variance for prospective solutions. The two most common fault-tolerant embedded system architectures, N-Version Programming and Recovery Block, are considered as strategies to improve system reliability by providing system redundancy. Four distinct models are presented to demonstrate the proposed optimization techniques with or without redundancy. For many design problems, multiple functionally equivalent software versions have failure correlation even if they have been independently developed. The failure correlation may result from faults in the software specification, faults from a voting algorithm, and/or related faults from any two software versions. Our approach considers this correlation in formulating practical optimization models. Genetic algorithms with a dynamic penalty function are applied in solving this optimization problem, and reasonable and interesting results are obtained and discussed.

13.

A parallel built-in self-diagnostic method for nontraditional faults of embedded memory arrays

Arora V. Jone W.B. Huang D.C. Das S.R. 《IEEE transactions on instrumentation and measurement》2004,53(4):915-932

In this paper, we propose a built-in self-diagnostic march-based algorithm that identifies faulty memory cells based on a recently introduced nontraditional fault model. It is developed based on the DiagRSMarch algorithm, which is a diagnostic algorithm to identify traditional faults for embedded memory arrays. A minimal set of additional operations is added to DiagRSMarch for identifying the nontraditional faults without affecting the diagnostic coverage of the traditional faults. The embedded memory arrays are accessed using a bidirectional serial interfacing architecture which minimizes the routing overhead introduced by the diagnosis hardware. Using the concepts of the bidirectional interfacing technique, parallel testing, and redundant-tolerant operations, the diagnostic process can be accomplished efficiently at-speed with minimal hardware overhead.

14.

Modeling and analysis of fault tolerant multistage interconnection networks (total citations: 1; self-citations: 0; citations by others: 1)

Minsu Choi Park N. Lombardi F. 《IEEE transactions on instrumentation and measurement》2003,52(5):1509-1519

Performance and reliability are two of the most crucial issues in today's high-performance instrumentation and measurement systems. High speed and compact density multistage interconnection networks (MINs) are widely used subsystems in different applications. New performance models are proposed to evaluate a novel fault tolerant MIN arrangement, thereby assuring performance and reliability with a high confidence level. A concurrent fault detection and recovery scheme for MINs is considered by rerouting over redundant interconnection links under stringent real-time constraints for digital instrumentation such as sensor networks. A switch architecture for concurrent testing and diagnosis is proposed. New performance models are developed and used to evaluate the compound effect of fault tolerant operation (inclusive of testing, diagnosis, and recovery) on the overall throughput and delay. Results are shown for single transient and permanent stuck-at faults on links and storage units in the switching elements. It is shown that performance degradation due to fault tolerance is graceful while performance degradation without fault recovery is unacceptable.

15.

New approach to adaptive single pole auto-reclosing of power transmission lines

Jamali S. Parham A. 《Generation, Transmission & Distribution, IET》2010,4(1):115-122

In this study, a novel digital algorithm is introduced for recognition of arcing (transient) faults and determination of dead time for adaptive auto-reclosing. The algorithm distinguishes between arcing and permanent faults by using the zero sequence voltage measured at the relaying point. If the fault is recognised as an arcing fault, then the third harmonic of the zero sequence voltage is used to evaluate the extinction time of the secondary arc and to initiate the reclosing signal. The proposed algorithm uses an adaptive threshold level and therefore no significant adjustment is needed for different transmission systems. Moreover, its performance is independent of fault location, line parameters and the system pre-fault operating conditions. The algorithm has been successfully tested for various faults and operating conditions on a 400 kV overhead line using the electro-magnetic transient program (EMTP). The test results have demonstrated the validity of the algorithm in determining the secondary arc extinction time and blocking unsuccessful automatic reclosing during permanent faults.

16.

Detection of induction motor operating faults based on the HJI method

Yao Shuncai (姚舜才) Pan Hongxia (潘宏侠) 《测试技术学报》 (Journal of Test and Measurement Technology) 2007,21(1):23-27

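The HJI (Hamilton-Jacobi-Isaacs) inequality method from robust fault detection design theory is applied to the design of fault detection for operating faults of induction motors. A dynamic mathematical model of the induction motor is given and extended to form a generalized system. On this basis, the HJI inequality from robust detection theory is used to analyze the operating faults theoretically, and a recursive formula for fault detection is derived. Simulation studies of the motor stator winding, radial vibration, and instantaneous power show that this detection method improves the robustness of detection, avoids false detections, and meets all performance requirements, and that its detection results are considerably better than those of traditional methods.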

17.

Multi-state systems with multi-fault coverage

Gregory Levitin Suprasad V. Amari 《Reliability Engineering & System Safety》2008,93(11):1730-1739

The paper introduces a new model of fault level coverage for multi-state systems in which the effectiveness of recovery mechanisms depends on the coexistence of multiple faults in related elements. Examples of this effect can be found in computing systems, electrical power distribution networks, pipelines carrying dangerous materials, etc. For evaluating reliability and performance indices of multi-state systems with imperfect multi-fault coverage, a modification of the generalized reliability block diagram (RBD) method is suggested. This method, based on a universal generating function technique, allows the performance distribution of complex multi-state series–parallel systems with multi-fault coverage to be obtained using a straightforward recursive procedure. Illustrative examples are presented.
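
The universal generating function technique mentioned in this entry can be sketched in a few lines: each element is described by a distribution over performance levels, and two elements are combined by applying a structure operator to every pair of levels while multiplying their probabilities. The sketch below shows only this basic composition step; it does not model the multi-fault coverage that is the paper's contribution. The function names, the choice of sum for parallel and minimum for series (a common convention for flow-type systems), and the element data are assumptions made for illustration.

```python
from collections import defaultdict

def compose(u1, u2, op):
    """Combine two performance distributions {performance: probability}.

    op maps a pair of performance levels to the performance of the pair.
    """
    result = defaultdict(float)
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            result[op(g1, g2)] += p1 * p2
    return dict(result)

def parallel(u1, u2):
    """Parallel elements: capacities add (assumed flow-type system)."""
    return compose(u1, u2, lambda a, b: a + b)

def series(u1, u2):
    """Series elements: the weaker element limits the flow."""
    return compose(u1, u2, min)

# Hypothetical two-state element: capacity 5 with probability 0.9, else 0.
element = {0: 0.1, 5: 0.9}

in_parallel = parallel(element, element)
in_series = series(element, element)

# Probability that the parallel pair delivers at least 5 units.
prob_demand_met = sum(p for g, p in in_parallel.items() if g >= 5)
```

Because the composition returns a distribution of the same form as its inputs, a series–parallel structure can be reduced recursively, one pair of blocks at a time.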

18.

Hardware implementation of fault‐tolerance in dual computer systems

Refik Samet 《Quality and Reliability Engineering International》2009,25(8):1015-1028

In this paper, we propose an architectural design for a dual computer system (DCS) that operates in real‐time with the fault‐tolerance implemented purely by hardware. We have a novel design allowing the implementation of hardware that performs the following key services: the determination of fault type (temporary or permanent) and the localization of the faulty computer without using self‐testing techniques and diagnosis routines. We also propose a non‐trivial sequence of services for fault‐tolerance in which the determination of the fault type and the recovery of computational processes after a temporary fault are realized before fault localization. Our design has several benefits: the designed hardware shortens the recovery point time period; the proposed non‐trivial sequence of fault‐tolerant services reduces (to two) the number of logical segments that should be re‐run to recover the computational processes; and the determination of the fault type allows eliminating only the computer with a permanent fault. These contributions bring both an increase in system performance and an increase in the degree of system reliability. Copyright © 2009 John Wiley & Sons, Ltd.

19.

Diagnosis Modelling for Dependability Assessment of Fault‐Tolerant Systems Based on Stochastic Activity Networks

Samia Maza 《Quality and Reliability Engineering International》2015,31(6):963-976

The growing demand for safety, reliability, availability and maintainability in modern technological systems has led these systems to become more and more complex. To improve their dependability, many features and subsystems are employed, such as the diagnosis system, control system, backup systems, and so on. These subsystems all have their own dynamics, reliability and performance, and interact with each other in order to provide a dependable and fault‐tolerant system. This makes the dependability analysis and assessment very difficult. This paper proposes a method to completely model the diagnosis procedure in fault‐tolerant systems using stochastic activity networks. Combined with Monte Carlo simulation, this will allow the dependability assessment to include the diagnosis parameters and performance explicitly. Copyright © 2014 John Wiley & Sons, Ltd.

20.

Quantitative demonstration and cost considerations of a software fault removal methodology

M. Lipow 《Quality and Reliability Engineering International》1985,1(1):27-35

This paper presents a demonstration of a methodology for fault removal during software development. The methodology encompasses the entire development history, from system and software requirements generation to system test. Thus it considers not only the faults during software testing after formal configuration controls have been invoked, but also the faults discovered prior to that phase: during system and software requirements generation, preliminary design, detailed design and code and unit testing. The agents for fault discovery used in verification and validation are called activities, techniques and tools (AT & Ts) in this paper, each having a certain maximum potential or capability for fault discovery. The AT & Ts considered include the usual specification review activities, and also certain tools not normally applied in ‘standard’ software development, such as automated requirements aids. Application of the methodology yields numbers of residual faults as of each phase of development, including those remaining to be discovered during operations and maintenance. Some previous experience and data on residual faults correspond to these results, indicating that the methodology and choice of parameters are reasonable. The methodology also allows one to calculate a relative loss due to delay in fault discovery, which, as is well known, rises rapidly when faults are not discovered during the phase in which they are generated.