
1.

Error detection by duplicated instructions in super-scalar processors

Oh N. Shirvani P.P. McCluskey E.J. 《Reliability, IEEE Transactions on》2002,51(1):63-75

This paper proposes a pure software technique, "error detection by duplicated instructions" (EDDI), for detecting errors during normal system operation. Unlike error-detection techniques that use hardware redundancy, EDDI does not require any hardware modifications to add error-detection capability to the original system. EDDI duplicates instructions during compilation and uses different registers and variables for the new instructions. For faults in the code segment of memory in particular, formulas are derived to estimate the error-detection coverage of EDDI using probabilistic methods. These formulas use statistics of the program, which are collected during compilation. EDDI was applied to eight benchmark programs and the error-detection coverage was estimated. The estimates were then verified by simulation, in which a fault injector forced a bit-flip in the code segment of the executable machine code. The simulation results validated the estimated fault coverage and show that approximately 1.5% of injected faults produced incorrect results in the eight benchmark programs with EDDI, while on average 20% of injected faults produced undetected incorrect results in the programs without EDDI. Based on the theoretical estimates and actual fault-injection experiments, EDDI can provide over 98% fault coverage without any extra hardware for error detection. This pure software technique is especially useful when designers cannot change the hardware but need dependability in the computer system. To reduce the performance overhead, EDDI schedules the instructions that are added for detecting errors such that "instruction-level parallelism" (ILP) is maximized. Performance overhead can be reduced by increasing ILP within a single super-scalar processor. The execution time overhead in a 4-way super-scalar processor is less than the execution time overhead in processors that can issue two instructions in one cycle.
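The duplication idea in the abstract above can be illustrated with a minimal sketch. It is written in Python, whereas EDDI itself operates on machine instructions at compile time, so this is only an analogy under stated assumptions: each operation is performed twice on separate variables (standing in for separate registers), and the two copies are compared before the result is used. The names `checked_sum` and `ErrorDetected`, and the `fault_at` parameter that simulates a bit flip, are hypothetical and do not come from the paper.

```python
class ErrorDetected(Exception):
    """Raised when the master and shadow copies of a value disagree."""


def checked_sum(values, fault_at=None):
    """Sum `values` twice, on a master and a shadow accumulator.

    `fault_at` is a hypothetical test hook: if it equals a loop index,
    one bit of the master accumulator is flipped at that step.
    """
    total = 0          # master copy (original instructions)
    total_shadow = 0   # shadow copy (duplicated instructions)
    for index, value in enumerate(values):
        total += value
        total_shadow += value
        if index == fault_at:
            total ^= 1 << 3  # simulated transient bit flip in the master copy
    # Compare the two copies before the result leaves the computation.
    if total != total_shadow:
        raise ErrorDetected("master and shadow results differ")
    return total
```

In this sketch a fault that hits only one of the two copies is always caught at the comparison; a fault that corrupts both copies identically would not be.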

2.

CFCET: A hardware-based control flow checking technique in COTS processors using execution tracing (total citations: 1; self-citations: 0; citations by others: 1)

Amir Rajabzadeh Seyed Ghassem Miremadi 《Microelectronics Reliability》2006,46(5-6):959-972

This paper presents a behavioral-based error detection technique called control flow checking by execution tracing (CFCET) to increase the concurrent error detection capabilities of commercial off-the-shelf (COTS) processors. This technique traces the program jumps graph (PJG) at run-time and compares it with the reference jumps graph to detect possible violations caused by transient faults. The reference graph is derived by a preprocessor from the source program. The idea behind CFCET is to use an external watchdog processor (WDP) together with the internal execution tracing feature available in COTS processors to monitor the addresses of taken branches in a program externally. This is done without any modification of application programs; thus, the program overhead is zero. This technique is analytically evaluated based on three different fault models. The results show that the error detection coverage varies between 79.74% and 96.43% depending on the workload program. The errors are detected with nearly zero latency. The external hardware overhead is about 3% using the Altera FLEX 10K30 FPGA, and the execution time overhead is between 33.26% and 140.81% for different workload programs. The overheads have been measured experimentally by executing the workloads on a Pentium system.
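The comparison of a run-time branch trace against a reference jumps graph can be sketched as follows. This is a simplified software model, not the watchdog-processor hardware the abstract describes. It assumes the reference graph is available as a mapping from a branch's source address to its set of legal target addresses, and that the trace is a list of (source, target) pairs for taken branches. The function name `first_violation` and all addresses are hypothetical.

```python
def first_violation(reference_graph, trace):
    """Return the index of the first illegal taken branch, or None.

    reference_graph: dict mapping a branch source address to the set of
                     target addresses allowed by the program.
    trace:           sequence of (source, target) pairs observed at run time.
    """
    for index, (source, target) in enumerate(trace):
        allowed_targets = reference_graph.get(source, frozenset())
        if target not in allowed_targets:
            return index  # control-flow error detected at this branch
    return None


# Hypothetical reference graph for a tiny program.
REFERENCE = {
    0x100: {0x104, 0x140},  # conditional branch: fall-through or taken
    0x150: {0x100},         # loop back-edge
}
```

A trace such as `[(0x100, 0x140), (0x150, 0x1F0)]` would be flagged at index 1, because `0x1F0` is not a legal target of the branch at `0x150`.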

3.

Evaluating processor-behavior and three error-detection mechanisms using physical fault-injection

Miremadi G. Torin J. 《Reliability, IEEE Transactions on》1995,44(3):441-454

An approach for assessing the impact of physical injection of transient faults on processor execution is described and evaluated. The fault injection is based on two complementary methods using: (1) heavy-ion radiation; and (2) power supply disturbances. 12000 transient faults were injected into the target microprocessor, a Motorola MC6809E 8-bit CPU, running 3 different workloads. In the evaluation, the control-flow errors were distinguished from those that had no effect on the correct flow of control. The errors that led to wrong results are separated from those that did not affect the correct results. The errors that affected neither the correct control flow nor the correct results are specified. Effects of errors on the registers and signals of the processor are characterized. Workload dependency of error rates is demonstrated. Three error-detection mechanisms (2 software-based mechanisms and 1 watchdog timer) were combined and used to characterize the detected and undetected errors. More than 87% of all errors and 93% of the control-flow errors could be detected. In a different test, the efficiency of an isolated watchdog timer was evaluated. The coverage of the isolated watchdog timer was only 62%. The results indicate that fault-injection methods, workloads, and programming languages each affect the control flow, coverage, latency, and error rates differently.

4.

On the Importance of Eliminating Errors in Cryptographic Computations (total citations: 2; self-citations: 0; citations by others: 2)

Dan Boneh Richard A. DeMillo Richard J. Lipton 《Journal of Cryptology》2001,14(2):101-119

We present a model for attacking various cryptographic schemes by taking advantage of random hardware faults. The model consists of a black-box containing some cryptographic secret. The box interacts with the outside world by following a cryptographic protocol. The model supposes that from time to time the box is affected by a random hardware fault causing it to output incorrect values. For example, the hardware fault flips an internal register bit at some point during the computation. We show that for many digital signature and identification schemes these incorrect outputs completely expose the secrets stored in the box. We present the following results: (1) the secret signing key used in an implementation of RSA based on the Chinese Remainder Theorem (CRT) is completely exposed from a single erroneous RSA signature, (2) for non-CRT implementations of RSA the secret key is exposed given a large number (e.g. 1000) of erroneous signatures, (3) the secret key used in Fiat-Shamir identification is exposed after a small number (e.g. 10) of faulty executions of the protocol, and (4) the secret key used in Schnorr's identification protocol is exposed after a much larger number (e.g. 10,000) of faulty executions. Our estimates for the number of necessary faults are based on standard security parameters such as a 1024-bit modulus and a 2^-40 identification error probability. Our results demonstrate the importance of preventing errors in cryptographic computations. We conclude the paper with various methods for preventing these attacks. Received July 1997 and revised August 2000. Online publication 27 November 2000.
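Result (1) can be illustrated with a toy example. The abstract does not spell out the computation, so the formulation below is an assumption: it uses the commonly cited form of the attack, in which a fault corrupts only the mod-p half of a CRT signature and a factor of the modulus is then recovered as gcd(S'^e - M, N) from the faulty signature S' and the known message M. The primes are deliberately tiny, and the function names `sign_crt` and `recover_factor` are hypothetical. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.

```python
from math import gcd

# Toy RSA parameters (far too small for real use).
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def sign_crt(message, fault=False):
    """RSA signature computed with the Chinese Remainder Theorem.

    If `fault` is True, one bit of the mod-P half is flipped to
    simulate a transient hardware fault during signing.
    """
    s_p = pow(message, D % (P - 1), P)
    s_q = pow(message, D % (Q - 1), Q)
    if fault:
        s_p ^= 1  # simulated bit flip in an internal register
    # Garner recombination of the two halves.
    h = ((s_p - s_q) * pow(Q, -1, P)) % P
    return s_q + Q * h


def recover_factor(message, faulty_signature):
    """Recover a prime factor of N from one faulty signature."""
    return gcd((pow(faulty_signature, E, N) - message) % N, N)
```

The faulty signature is still correct modulo Q but wrong modulo P, so `S'^e - M` is a multiple of Q and not of P, and the gcd with N yields Q. A standard defence consistent with the abstract's closing remark is to verify each signature before releasing it.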

5.

System reliability analysis of an N-version programming application

Dugan J.B. Lyu M.R. 《Reliability, IEEE Transactions on》1994,43(4):513-519

This paper presents a quantitative reliability analysis of a system designed to tolerate both hardware and software faults. The system achieves integrated fault tolerance by implementing N-version programming (NVP) on redundant hardware. The system analysis considers unrelated software faults, related software faults, transient hardware faults, permanent hardware faults, and imperfect coverage. The overall model is a Markov model in which the states of the Markov chain represent the long-term evolution of the system structure. For each operational configuration, a fault-tree model captures the effects of software faults and transient hardware faults on the task computation. The software fault model is parameterized using experimental data associated with a recent implementation of an NVP system using the current design paradigm. The hardware model is parameterized by considering typical failure rates associated with hardware faults and coverage parameters. The authors' results show that it is important to consider both hardware and software faults in the reliability analysis of an NVP system, since these estimates vary with time. Moreover, the function for error detection and recovery is extremely important to fault-tolerant software. Several orders of magnitude reduction in system unreliability can be observed if this function is provided promptly.

6.

The use of minicomputers in racetrack totalisator systems

《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1973,61(11):1626-1633

A racetrack totalisator is described which is capable of processing 300 bets/s from up to 1000 ticket-issuing machines. The system contains minicomputers which provide high-reliability, low-cost computing power, capable of supporting the requirements of any racetrack in the United States or Canada. The computers are duplexed. Both computers process every bet and operator action. Either computer can continue processing if the other fails. The software provides extensive error checking of all hardware components. All bets are logged to magnetic tape, which can be used to restart the system in case of a system "blow." The real-time program was written in machine language and a specially developed higher level language. The master file contains 35 000 instructions. A typical program may contain 20 000 words of program and data tables. An off-line program is used to generate individual systems, tailored to the requirements of specific racetracks.

7.

A HW/SW Cross-Layer Approach for Determining Application-Redundant Hardware Faults in Embedded Systems

Christian Bartsch Carlos Villarraga Dominik Stoffel Wolfgang Kunz 《Journal of Electronic Testing》2017,33(1):77-92

Hardware devices of recent technology nodes are intrinsically more susceptible to faults than previous devices. This demands further improvements of error detection methods. However, any attempt to cover all errors for all theoretically possible scenarios that a system might be used in can easily lead to excessive costs. Instead, an application-dependent approach should be taken, i.e., strategies for test and error resilience must target only those errors that can actually have an effect in the situations in which the hardware is being used. In this paper, we propose a method to inject faults into hardware (HW) and to formally analyze their effects on the software (SW) behavior. We describe how this analysis can be implemented based on a recently proposed HW-dependent software model called program netlist (PN). We show how program netlists can be extended to formally model the behavior of a program in the event of one or more hardware faults. Then, it is demonstrated how the results of the PN-based analysis can be exploited by a standard ATPG tool to determine hardware faults at the gate level that are “application-redundant”. Our experimental results show the feasibility of the proposed approach.

8.

"看门狗"技术在某型相机导航数据接口板中的应用

赵育良 许兆林 《国外电子元器件》2005,(12):44-46

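This paper describes the application of a "watchdog" technique, implemented through a combination of software and hardware, in the navigation data interface board of a certain camera model. The hardware circuit and the software flow are given on the basis of the actual application. Practice has shown that the technique gives the 429 interface board strong interference immunity and high reliability.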

9.

Transient detection in COTS processors using software approach

Amir Rajabzadeh Seyed Ghassem Miremadi 《Microelectronics Reliability》2006,46(1):124-133

This paper presents a software-based error detection scheme called enhanced committed instructions counting (ECIC) for embedded and real-time systems using commercial off-the-shelf (COTS) processors. The scheme uses the internal performance monitoring features of a processor, which provide the ability to count the number of committed instructions in a program. To evaluate the ECIC scheme, 6000 software-induced faults were injected into a 32-bit Pentium® processor. The results show that the error detection coverage varies between 90.52% and 98.18% for different workloads.

10.

Application of the X5045 chip in single-chip microcomputer systems (total citations: 3; self-citations: 0; citations by others: 3)

周向红 《现代电子技术》2006,29(5):111-112,116

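In control systems, especially in control instruments and meters that operate unattended, a "watchdog" circuit is indispensable. This paper introduces the X5045, a "watchdog" chip with E2PROM released by Xicor. The X5045 is a programmable chip that integrates four functions: power-on reset, watchdog, voltage supervision, and serial E2PROM. The paper also describes the hardware interface circuit for a microprocessor system built around a 51-core microcontroller, the corresponding software programming, and an application in a temperature control system.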

11.

Hardware and Software Transparency in the Protection of Programs Against SEUs and SETs

Eduardo Luis Rhod Carlos Arthur Lang Lisbôa Luigi Carro Matteo Sonza Reorda Massimo Violante 《Journal of Electronic Testing》2008,24(1-3):45-56

Processor cores embedded in systems-on-a-chip (SoCs) are often deployed in critical computations, and when affected by faults they may produce dramatic effects. When hardware hardening is not cost-effective, software implemented hardware fault tolerance (SIHFT) can be a solution to increase SoCs’ dependability, but it increases the time for running the hardened application, as well as the memory occupation. In this paper we propose a method that eliminates the memory overhead, by exploiting a new approach to instruction hardening and control flow checking. The proposed method hardens an application online during its execution, without the need for introducing any change in its source code, and is non-intrusive, since it does not require any modification in the main processor’s architecture. The method has been tested with two widely used architectures: a microcontroller and a RISC processor, and proven to be suitable for hardening SoCs against transient faults and also for detecting permanent faults.

12.

On the relation of errors and its syndrome in signature analysis

John C. Chan 《Microelectronics Reliability》1992,32(10)

This paper deals with fault diagnostics in signature analysis for computer hardware testing. When an incorrect signature is observed, it is often desirable to identify the bit locations, within the input data stream, where errors have caused the faulty signature. One method in common use is the exhaustive matching test, in which all bit locations of the input sequence are compared sequentially with the expected response. In this paper, a fault-diagnosis algorithm is presented that makes use of the information in the faulty signature. The idea is to search for the likely error locations before the tests are performed. The method reduces the number of tests required to diagnose the errors, subject to the probability of aliasing. This probability is always smaller than that of error detection in signature analysis. When matching tests are difficult or impossible, the method provides an estimate of where the errors that caused the incorrect signature might have occurred. The case of “don't cares” in the input sequence of signature analysis is also discussed.

13.

Application of a window WDT and fault-tolerance techniques in industrial control microcomputer systems

蒋健 《微电子学与计算机》1994,11(4):35-37

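This paper introduces a timing monitor with a band-pass characteristic: it issues a reset or interrupt signal when the interval between pulses sent by the microcomputer is either too long or too short. The paper also discusses program block partitioning and error-checking methods. In addition to checking execution time, each module checks the plausibility of its results, and errors in which execution crosses module boundaries are detected by setting a flag at the module entry and comparing it at the exit.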
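The band-pass behaviour of such a window watchdog can be sketched in a few lines. This is a software model under stated assumptions, not the circuit from the paper: the pulse times are given as a list of timestamps in arbitrary integer units, and the function name `window_watchdog` and the parameters `min_interval` and `max_interval` are hypothetical.

```python
def window_watchdog(kick_times, min_interval, max_interval):
    """Return the time at which the watchdog would fire, or None.

    The watchdog fires if two consecutive pulses are closer together
    than `min_interval` (pulse arrived too early) or further apart
    than `max_interval` (pulse arrived too late or never).
    """
    for previous, current in zip(kick_times, kick_times[1:]):
        interval = current - previous
        if interval < min_interval:
            return current                  # fires on the early pulse
        if interval > max_interval:
            return previous + max_interval  # fires when the window expires
    return None
```

An ordinary watchdog timer corresponds to `min_interval = 0`; the lower bound is what lets the window variant also catch a program stuck in a tight loop that pulses the watchdog too often.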

14.

Opcode vector: An efficient scheme to detect soft errors in instructions

《Microelectronics Reliability》2018

Bit flips on instructions may affect the execution of the processor depending on the Instruction Set Architecture (ISA) and the location of the flipped bits. Intrinsically, ISAs may detect bit upsets if the errors on the instructions produce exceptions that halt the execution. In this paper, we explore a dynamic checking of the instructions to detect errors before execution. The scheme is based on loading, into a special-purpose register, an approximate representation of the instructions in the form of a vector that identifies the opcodes used in the program. During execution, instructions are first checked against the register; on a negative result, an error is detected, because the instruction has an opcode that does not correspond to any of the ones used in the program. Since we use an approximate representation, a small number of false positives can occur for erroneous instructions, which may still be detected if they lead to a system crash. The proposed opcode vector scheme is compared with the use of a Bloom filter (BF) that has been previously proposed to detect errors on instructions. In both cases, a check can produce false positives but not false negatives. The Bloom filter is built using all the bits in the instruction. On the other hand, the opcode vector uses only a few bits of the instruction. In both cases, the check is combined with a previous error propagation scheme. In the opcode case, this ensures that all errors corrupt the opcode bits, while for the BF, the error propagation reduces the number of false positives. The proposed approach has two main benefits. The first is an increase in the error detection rate, as the set of valid instructions is restricted to those used in the program, allowing the detection of invalid instructions even if they do not lead to a system crash. The second is that errors are detected before the crash. This is done at the cost of adding a small register for the vector of opcodes and some control logic. This is significantly simpler than in the case of the BF, which needs to compute several hash functions and access several bits of the register to perform the check. We evaluated this approach on binary files for the ARM Cortex M0 core. According to our findings, the proposed vector of opcodes is more effective at detecting errors than the BF, and its detection rate is less dependent on the program size. Based on those results, the proposed method appears to be an interesting option for detecting errors in instructions in systems in which a small overhead can be introduced if it improves reliability.
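The membership check at the heart of the opcode vector scheme can be sketched as follows. The instruction layout is an assumption made purely for illustration: a 16-bit instruction word whose top 6 bits are the opcode. This is not the actual ARM Cortex M0 encoding, and the names `opcode_of`, `build_opcode_vector`, and `is_allowed` are hypothetical. The error propagation scheme that the paper combines with the check is not modelled.

```python
OPCODE_SHIFT = 10   # assumed layout: opcode in bits 15..10 of a 16-bit word
OPCODE_MASK = 0x3F


def opcode_of(instruction):
    """Extract the (assumed) 6-bit opcode field."""
    return (instruction >> OPCODE_SHIFT) & OPCODE_MASK


def build_opcode_vector(program):
    """Build a bit vector with one bit set per opcode used in the program."""
    vector = 0
    for instruction in program:
        vector |= 1 << opcode_of(instruction)
    return vector


def is_allowed(vector, instruction):
    """Check an instruction's opcode against the vector before execution."""
    return bool((vector >> opcode_of(instruction)) & 1)


# Hypothetical three-instruction program using opcodes 1, 2 and 3.
PROGRAM = [0x0401, 0x0802, 0x0C03]
VECTOR = build_opcode_vector(PROGRAM)
```

Flipping bit 14 of the first instruction gives `0x4401`, whose opcode 17 is not in the vector, so the error is caught. Flipping bit 10 of the second instruction gives `0x0C02`, whose opcode 3 is also used by the program, so the check passes; this is the false-positive case the abstract mentions.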

15.

Experiments with low-level speculative computation based on multiple branch prediction

Holtmann U. Ernst R. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1993,1(3):262-267

Coprocessor design is one application of high-level synthesis. We want to focus on high-performance coprocessors to speed up time-critical parts in hardware-software codesign of embedded controllers. Time-critical software parts often contain nested loops, often with data-dependent branches and a data-dependent number of iterations. When (loop) pipelining is employed for high performance, the control dependencies become a dominant limitation to pipeline utilization. Branch prediction is a possible approach, but is usually restricted to a few instructions and to one branch because of hardware and control overhead. Multiple branch prediction and speculative computation take a more global view of the program flow. We give practical examples of how speculative computation with multiple branch prediction increases performance far beyond the usual ASAP scheduling based on a CDFG. For scheduling, speculative computation requires a modification of the CDFG and, for the allocation phase, the insertion of register sets to save the processor status. The controller needs slight modification. We conclude that manual application of our approach will in general be too difficult, so that it can only be used in connection with synthesis.

16.

Error Detection Enhancement in PowerPC Architecture-based Embedded Processors

Mahdi Fazeli Reza Farivar Seyed Ghassem Miremadi 《Journal of Electronic Testing》2008,24(1-3):21-33

This paper presents a behavior-based error detection technique called Control Flow Checking using Branch Trace Exceptions (CFCBTE) for the PowerPC processor family. This technique is based on the branch trace exception feature available in the PowerPC processor family for debugging purposes. This technique traces the target addresses of program branches at run-time and compares them with reference target addresses to detect possible violations caused by transient faults. The reference target addresses are derived by a preprocessor from the source program. To enhance the error detection coverage, three other mechanisms, i.e., Machine Check Exception, System Trap Instructions and Work Load Timer, are combined with the Branch Trace Exception mechanism. The proposed technique is experimentally evaluated on a 32-bit PowerPC microcontroller using software implemented fault injection (SWIFI) and Power Supply Disturbances fault injection (PSD). A total of 6,000 faults were injected into the microcontroller to measure the error detection coverage of the proposed control flow checking technique. The experimental results show that this technique detects about 95.2% of transient errors with the software implemented fault injection method and 96.4% of transient errors with the power supply disturbances fault injection method.

17.

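Practical anti-interference design for single-chip microcomputer systems (total citations: 4; self-citations: 0; citations by others: 4)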

王燕芳 宋辉 《电子工程师》2006,32(7):49-51

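With the rapid development of microcontroller control technology, microcontrollers are widely used in many fields, and the interference immunity of the control systems built from them is attracting increasing attention. Addressing the interference actually present in microcontroller control systems and starting from application practice, this paper analyzes the hardware side (power supply, wiring, and printed circuit boards) and the software side (instruction redundancy, software traps, and the program-monitoring timer), and designs practical anti-interference measures for the corresponding hardware and software to ensure stable, reliable, and safe system operation.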

18.

A multithreaded interpreter based on a die bonder control platform

戚其丰 仇烨 胡跃明 《电子工艺技术》2007,28(1):41-44

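This paper introduces a multithreaded interpreter technique based on a die bonder control platform. After analyzing the instruction and performance requirements of this general-purpose platform, it discusses the functional and performance requirements of the multithreaded interpreter. In studying the overall design of the interpreter, it analyzes the interpreter's overall structure and the key techniques involved: lexical analysis, syntax analysis, and interpretive execution. Drawing on compiler techniques, it gives an outline design of the multithreaded interpreter for the die bonder control platform and then proposes a scheme for multithreaded interpretation. The core design method of this scheme and its implementation details are described in detail.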

19.

Application of a radiation sensor in an automatic regulation system for shredded-cane layer thickness

韦以明 《现代电子技术》2008,31(5):187-189

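This paper introduces the basic working principle of an automatic regulation system for the balanced conveying of shredded sugarcane in sugar mills. The hardware composition of the system and the flowchart of the main control program are given. The system uses a microcomputer as its control center, uses a radiation sensor to measure the thickness of the cane layer, and achieves automatic regulation toward a balanced shredded-cane flow by controlling the speed of the cane conveyor. A microcomputer monitoring program handles the metering of conveyed cane and the online setting of control parameters. Operating results show that the dynamic metering error is ≤1%.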

20.

Design and implementation of a network video teaching support system

胡青松 盛文燕 钱建生 李世银 《电视技术》2004,(8):65-66,74

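The characteristics of video transmission in a teaching environment are analyzed, and the main current development approaches (pure software, pure hardware, and combined software and hardware) are compared. On this basis, a LAN video teaching support system is designed and implemented using the combined software and hardware approach, and the key technologies of the implementation are introduced: video coding, video transmission, and control of video transmission. Usage shows that the system can meet everyday teaching needs.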