Found 20 similar documents; search took 625 ms
1.
System reliability has become a main concern in the design of computer-based systems. It is one of the most important characteristics of system quality. The continuous increase in system complexity makes reliability evaluation extremely costly. Therefore, there is a need to develop new methods that require less cost and effort. Furthermore, the system is vulnerable to both software and hardware faults. While software faults are usually introduced by the programmer at either the design or the implementation stage of the software, hardware faults are caused by physical phenomena affecting the hardware components, such as environmental perturbations, manufacturing defects, and aging-related phenomena. Software faults can only impact the software components. However, hardware faults can propagate through the different system layers and affect both the hardware and the software. This paper discusses the differences between the software testing and software fault injection techniques used for reliability evaluation. We describe mutation analysis as a method mainly used in software testing. Then, we detail fault injection as a technique to evaluate system reliability. Finally, we discuss how to use software mutation analysis to evaluate, at the software level, system reliability against hardware faults. The main advantage of this technique is its usability at an early design stage of the system, when the instruction set architecture is not available. Experimental results from evaluating faults occurring in memory show that the proposed approach significantly reduces the complexity of system reliability evaluation in terms of time and cost.
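The idea of emulating hardware memory faults at the software level can be illustrated with a small sketch. It is not the method of the paper summarized above; it is a minimal single-bit-flip campaign under stated assumptions. The names `flip_bit`, `workload`, and `run_campaign` are hypothetical, and the workload (the largest byte in a buffer) is a toy stand-in for a real program.

```python
import random


def flip_bit(memory: bytearray, byte_index: int, bit_index: int) -> None:
    """Emulate a single-bit memory upset by toggling one bit in place."""
    memory[byte_index] ^= 1 << bit_index


def workload(memory: bytes) -> int:
    """Hypothetical workload under test: return the largest byte value."""
    return max(memory)


def run_campaign(data: bytes, trials: int, seed: int = 0) -> dict:
    """Inject one random bit flip per trial and classify the outcome."""
    rng = random.Random(seed)           # fixed seed keeps the campaign repeatable
    golden = workload(data)             # fault-free reference result
    outcomes = {"masked": 0, "corrupted": 0}
    for _ in range(trials):
        faulty = bytearray(data)        # fresh copy, so faults do not accumulate
        flip_bit(faulty, rng.randrange(len(faulty)), rng.randrange(8))
        if workload(faulty) == golden:
            outcomes["masked"] += 1     # the flip did not change the output
        else:
            outcomes["corrupted"] += 1  # the flip propagated to the output
    return outcomes


if __name__ == "__main__":
    print(run_campaign(bytes(range(16)), trials=1000))
```

Comparing each faulty run against a fault-free "golden" run is the usual way to separate masked faults from those that corrupt the output; a fuller campaign would typically also count crashes and timeouts.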
2.
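Fault injection is an effective technique for testing BIT software. To address problems encountered in board-level BIT software testing, this paper introduces a processor fault simulation method based on the open-source emulator QEMU. The method models a variety of processor faults and extends QEMU with a fault behavior simulation module and a fault injection module, yielding BitVaSim, a system-level simulator with processor fault injection capability. First, processor functional fault modes are analyzed, the key-value pairs that characterize each fault are extracted, and faults are defined with XML Schema for use in fault modeling. Second, the QEMU code is extended to simulate processor fault behavior. Third, a configurable fault injection interface provides runtime fault-mode matching and conditional fault triggering. Finally, experimental cases are used to observe the simulator's fault behavior and to evaluate this simulator-based fault injection technique. The experimental process and results show that the method is effective and feasible.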
3.
《IEEE transactions on pattern analysis and machine intelligence》2006,32(11):849-867
The injection of faults has been widely used to evaluate fault tolerance mechanisms and to assess the impact of faults in computer systems. However, the injection of software faults is not as well understood as other classes of faults (e.g., hardware faults). In this paper, we analyze how software faults can be injected (emulated) in a source-code independent manner. We specifically address important emulation requirements such as fault representativeness and emulation accuracy. We start with the analysis of an extensive collection of real software faults. We observed that a large percentage of faults fall into well-defined classes and can be characterized in a very precise way, allowing accurate emulation of software faults through a small set of emulation operators. A new software fault injection technique (G-SWFIT) based on emulation operators derived from the field study is proposed. This technique consists of finding key programming structures at the machine-code level where high-level software faults can be emulated. The fault-emulation accuracy of this technique is shown. This work also includes a study on the key aspects that may impact the technique's accuracy. The portability of the technique is also discussed, and it is shown that a high degree of portability can be achieved.
4.
Designers are realizing the advantages of performing fault injection early, using simulation to inject faults into a model of the design rather than the actual system. The authors describe their technique for injecting faults into a system's VHDL behavioral-level model. To demonstrate the technique, they evaluate an embedded control system providing fail-safe operation in the railway industry.
5.
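To examine and verify the fault tolerance of monitoring programs in highly integrated, small-footprint microcontroller systems, a new method is proposed that combines hardware-controlled interrupts with software fault injection. Experimental results show that this method successfully simulates the impact of single-event effects on the system and successfully injects faults into it. Compared with traditional hardware fault injection and software fault injection, the method is simple to operate, makes the depth of the injected fault easy to control, and supports real-time injection and online monitoring and analysis.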
6.
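In wireless sensor networks (WSNs), reliability and fault tolerance are important indicators of network stability. Many faults and disturbances occur in practical WSN deployments. Fault injection (FI) can deliberately introduce these faults and disturbances into a WSN; observing how the network reacts after injection makes it possible to evaluate its reliability and fault tolerance, and then to improve the network mechanisms so as to raise reliability and stability. FISDR, proposed in this paper, is a WSN performance evaluation system based on software fault injection. It connects to WSN nodes one-to-one through a special interface and injects fault commands into them. The system has three functions. First, it can inject into the WSN the various faults and disturbances likely to be encountered in practice and observe how the network behaves. Second, it can receive and store data sent through the special interface by WSN nodes and other devices. Third, it includes host-computer software that monitors the network topology, compiles transmission success rate statistics, and analyzes the large volume of stored information in order to evaluate the WSN and its reliability. Experiments were carried out in a five-story office building with several dozen WSN nodes and several dozen FISDR nodes; they included using FISDR to inject large-scale faults into the WSN and recording the network's reaction, verifying the effect of FISDR fault injection, and thereby testing and analyzing FISDR's performance. The results show that FISDR can effectively inject a variety of faults into a WSN to evaluate its reliability, and that it has high application value for testing WSNs and evaluating their reliability.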
7.
Context
Mutation testing is a fault-injection-based technique to help testers generate test cases for detecting specific and predetermined types of faults.
Objective
Before mutation testing can be effectively applied to embedded systems, traditional mutation testing needs to be modified. To inject a fault into an embedded system without causing any system failure or hardware damage is a challenging task as it requires some knowledge of the underlying layers such as the kernel and the corresponding hardware.
Method
We propose a set of mutation operators for embedded systems using kernel-based software and hardware fault simulation. These operators are designed for software developers so that they can use the mutation technique to test the entire system after the software is integrated with the kernel and hardware devices.
Results
A case study on a programmable logic controller for a digital reactor protection system in a nuclear power plant is conducted. Our results suggest that the proposed mutation operators are useful for fault-injection and this is evidenced by the fact that faults not injected by us were discovered in the subject software as a result of the case study.
Conclusion
We conclude that our mutation operators are useful for integration testing of an embedded system.
8.
Carreira J. Madeira H. Silva J.G. 《IEEE transactions on pattern analysis and machine intelligence》1998,24(2):125-136
An important step in the development of dependable systems is the validation of their fault tolerance properties. Fault injection has been widely used for this purpose; however, with the rapid increase in processor complexity, traditional techniques are increasingly difficult to apply. This paper presents a new software-implemented fault injection and monitoring environment, called Xception, which is targeted at modern and complex processors. Xception uses the advanced debugging and performance monitoring features existing in most modern processors to inject quite realistic faults by software, and to monitor the activation of the faults and their impact on the target system behavior in detail. Faults are injected with minimum interference with the target application. The target application is not modified, no software traps are inserted, and it is not necessary to execute the target application in special trace mode (the application is executed at full speed). Xception provides a comprehensive set of fault triggers, including spatial and temporal fault triggers, and triggers related to the manipulation of data in memory. Faults injected by Xception can affect any process running on the target system (including the kernel), and it is possible to inject faults in applications for which the source code is not available. Experimental results are presented to demonstrate the accuracy and potential of Xception in the evaluation of the dependability properties of the complex computer systems available nowadays.
9.
Kao W.-I. Iyer R.K. Tang D. 《IEEE transactions on pattern analysis and machine intelligence》1993,19(11):1105-1118
The authors present a fault injection and monitoring environment (FINE) as a tool to study fault propagation in the UNIX kernel. FINE injects hardware-induced software errors and software faults into the UNIX kernel and traces the execution flow and key variables of the kernel. FINE consists of a fault injector, a software monitor, a workload generator, a controller, and several analysis utilities. Experiments on SunOS 4.1.2 are conducted by applying FINE to investigate fault propagation and to evaluate the impact of various types of faults. Fault propagation models are built for both hardware and software faults. Transient Markov reward analysis is performed to evaluate the loss of performance due to an injected fault. Experimental results show that memory and software faults usually have a very long latency, while bus and CPU faults tend to crash the system immediately. About half of the detected errors are data faults, which are detected when the system tries to access an unauthorized memory location. Only about 8% of faults propagate to other UNIX subsystems. Markov reward analysis shows that the performance loss incurred by bus faults and CPU faults is much higher than that incurred by software and memory faults. Among software faults, the impact of pointer faults is higher than that of nonpointer faults.
10.
Fault injection is an effective method for studying the effects of faults in computer systems and for validating fault-handling mechanisms. The approach presented involves injecting transient faults into integrated circuits by using heavy-ion radiation from a Californium-252 source. The proliferation of safety-critical and fault-tolerant systems using VLSI technology makes such attempts to inject faults at internal locations in VLSI circuits increasingly important.
11.
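To handle transient faults in industrial Ethernet communication effectively, a multi-level transient fault handling mechanism spanning the chip, node, and system levels is proposed. Based on the characteristics of transient faults in industrial Ethernet communication, their causes are analyzed at multiple levels, and corresponding handling methods are designed. The chip level uses hardware logic provided by the chip, combined with software techniques, to adjust the chip's operating state automatically; the node level uses a software watchdog and software redundancy; the system level defines a special frame format and sets timers. Experimental results show that the multi-level mechanism effectively reduces the network's packet loss rate and improves system reliability.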
12.
13.
14.
Woodbury M.H. Shin K.G. 《IEEE transactions on pattern analysis and machine intelligence》1990,16(2):212-216
The authors demonstrate the need to address fault latency in highly reliable real-time control computer systems. It is noted that the effectiveness of all known recovery mechanisms is greatly reduced in the presence of multiple latent faults. The presence of multiple latent faults increases the possibility of multiple errors, which could result in coverage failure. The authors present experimental evidence indicating that the duration of fault latency is dependent on workload. A synthetic work generator is used to vary the workload, and a hardware fault injector is applied to inject transient faults of varying durations. This method makes it possible to derive the distribution of fault latency duration. Experimental results obtained from the fault-tolerant multiprocessor at the NASA Airlab are presented and discussed.
15.
16.
Kalbarczyk Z. Iyer R.K. Ries G.L. Patel J.U. Lee M.S. Xiao Y. 《IEEE transactions on pattern analysis and machine intelligence》1999,25(5):619-632
This paper presents a hierarchical simulation methodology that enables accurate system evaluation under realistic faults and conditions. In this methodology, effects of low-level (i.e., transistor or circuit level) faults are propagated to higher levels (i.e., system level) using fault dictionaries. The primary fault models are obtained via simulation of the transistor-level effect of a radiation particle penetrating a device. The resulting current bursts constitute the first-level fault dictionary and are used in the circuit-level simulation to determine the impact on circuit latches and flip-flops. The latched outputs constitute the next-level fault dictionary in the hierarchy and are applied in conducting fault injection simulation at the chip level under selected workloads or application programs. Faults injected at the chip level result in memory corruptions, which are used to form the next-level fault dictionary for the system-level simulation of an application running on simulated hardware. When an application terminates, either normally or abnormally, the overall fault impact on the software behavior is quantified and analyzed. The system in this sense can be a single workstation or a network. The simulation method is demonstrated and validated in a case study of a network system based on Myrinet (a commercial, high-speed network).
17.
Virtual memory was developed to automate the movement of program code and data between main memory and secondary storage to give the appearance of a single large store. This technique greatly simplified the programmer's job, particularly when program code and data exceeded the main memory's size. Virtual memory has now become widely used, and most modern processors have hardware to support it. Unfortunately, there has not been much agreement on the form that this support should take. The result of this lack of agreement is that hardware mechanisms are often completely incompatible. Thus, designers and porters of system-level software have two somewhat unattractive choices: they can write software to fit many different architectures or they can insert layers of software to emulate a particular hardware interface. The authors present the software mechanisms of virtual memory from a hardware perspective and then describe several hardware examples and how they support virtual memory software. Their focus is to show the diversity of virtual memory support and, by implication, how this diversity complicates the design and porting of OSs. The authors introduce basic virtual memory technologies and then compare memory management designs in three commercial microarchitectures. They show the diversity of virtual memory support and, by implication, how this diversity can complicate and compromise system operations.
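As a rough illustration of the basic mechanism this abstract refers to, the sketch below translates a virtual address through a single-level page table. It is a simplified teaching model, not any of the three microarchitectures compared in the article. The 4096-byte page size, the dictionary-based page table, and the names `translate` and `PageFault` are all assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when a virtual page has no resident physical frame."""


def translate(virtual_address: int, page_table: dict) -> int:
    """Map a virtual address to a physical address via a one-level page table.

    page_table maps virtual page numbers to physical frame numbers; a missing
    entry stands for a page that currently lives in secondary storage.
    """
    vpn, offset = divmod(virtual_address, PAGE_SIZE)  # split page number and offset
    if vpn not in page_table:
        raise PageFault(f"virtual page {vpn} is not resident")
    return page_table[vpn] * PAGE_SIZE + offset       # same offset, new frame
```

In a real system the page-fault path would hand control to the operating system, which loads the page from secondary storage and retries the access. Real hardware also adds multi-level tables, TLBs, and protection bits, which is where much of the diversity the authors describe arises.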
18.
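Soft errors, typified by single-event upsets (SEUs), are one of the main factors limiting the use of COTS devices in space. To meet the radiation protection requirements that space applications place on highly integrated satellite electronic systems, an SEU hardening method for general-purpose multi-core processors is proposed. Through dual-core mutual checking at the software level, the method substantially improves the reliability of COTS devices without additional hardware overhead, and it offers good portability and strong practical engineering value. Software fault injection experiments, in which erroneous information is injected at key points of program execution, are used to verify the practicality of the dual-core mutual checking method. The results show that the dual-core method detects 100% of the single-event upsets produced in the system, and that its resistance to soft errors meets application needs.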
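The principle behind software-level mutual checking can be sketched as redundant execution followed by comparison. This is only an illustration under stated assumptions, not the method of the paper above: two ordinary function calls stand in for the two cores, the computation is a toy, and the names `compute`, `mutual_check`, and `flip_bit` are hypothetical.

```python
from typing import Optional, Tuple


def flip_bit(value: int, bit: int) -> int:
    """Emulate a single-event upset by toggling one bit of an integer."""
    return value ^ (1 << bit)


def compute(x: int, fault_bit: Optional[int] = None) -> int:
    """Toy computation; optionally corrupt its result with one bit flip."""
    result = x * x + 1
    if fault_bit is not None:
        result = flip_bit(result, fault_bit)  # injected fault
    return result


def mutual_check(x: int, fault_bit_core_a: Optional[int] = None) -> Tuple[bool, int]:
    """Run the computation twice and compare the two results.

    Returns (agree, result_b). A mismatch signals that one replica was
    corrupted; a real system would then retry or roll back.
    """
    result_a = compute(x, fault_bit_core_a)  # replica that may be faulty
    result_b = compute(x)                    # fault-free replica
    return result_a == result_b, result_b
```

Because a single bit flip always changes the value, comparison catches every fault injected into one replica in this toy model. Faults that hit both replicas identically, or that strike the comparison itself, are outside what this sketch covers.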
19.
Roger T. Alexander Jeff Offutt Andreas Stefik 《Software Testing, Verification and Reliability》2010,20(4):291-327
As we move toward developing object‐oriented (OO) programs, the complexity traditionally found in functions and procedures is moving to the connections among components. Different faults occur when components are integrated to form higher‐level structures that aggregate the behavior and state. Consequently, we need to place more effort on testing the connections among components. Although OO technologies provide abstraction mechanisms for building components that can then be integrated to form applications, they also add new compositional relations that can contain faults. This paper describes techniques for analyzing and testing the polymorphic relationships that occur in OO software. The techniques adapt traditional data flow coverage criteria to consider definitions and uses among state variables of classes, particularly in the presence of inheritance, dynamic binding, and polymorphic overriding of state variables and methods. The application of these techniques can result in an increased ability to find faults and to create overall higher-quality software. Copyright © 2010 John Wiley & Sons, Ltd.
20.
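Software Defects and Software Reliability Techniques  Total citations: 8, self-citations: 0, citations by others: 8
Research in recent years shows that system failures are more often caused by software defects. Software reliability has therefore become the key to system reliability and a major research topic for high-reliability and high-availability systems. After describing how software reliability differs from hardware reliability, this paper summarizes and discusses software reliability models, software defects, and software reliability techniques.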