Similar Literature
20 similar documents found.
1.
Low-overhead fault tolerance is a current focus of soft error research. To apply low-overhead fault-tolerant protection to a microprocessor, its reliability, i.e., the Architectural Vulnerability Factor (AVF), must first be evaluated accurately. However, existing AVF evaluation tools are limited in both accuracy and applicability. Taking the core storage structures of the microprocessor as the object of study, this paper improves the AVF evaluation methodology and proposes HAES (Hybrid AVF Evaluation Strategy), which combines memory-access analysis with instruction analysis. HAES is integrated into a general-purpose simulator to realize a more accurate and more widely applicable AVF evaluation framework. Experimental results show that, compared with other AVF evaluation tools, the AVF obtained with the proposed framework is on average 22.6% lower. The AVF computed with this framework more accurately reflects the reliability of the storage structures under different workloads and provides valuable guidance for designers pursuing low-overhead fault-tolerant microprocessor design.
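As a rough illustration of how an AVF figure of this kind is computed, the sketch below applies the standard ACE-interval definition (AVF = ACE bit-cycles divided by total bit-cycles) to made-up residency data; the interval data, structure size, and observation window are hypothetical, and the sketch does not implement the HAES strategy described in the paper.

```python
def avf(ace_intervals, num_bits, total_cycles):
    """AVF = fraction of bit-cycles during which the structure holds ACE state.

    ace_intervals: per-bit lists of (start_cycle, end_cycle) spans in which the
    bit's value is required for architecturally correct execution (e.g., from a
    write until its last read).
    """
    ace_bit_cycles = sum(end - start
                         for spans in ace_intervals
                         for start, end in spans)
    return ace_bit_cycles / (num_bits * total_cycles)

# Toy example: a 2-bit "structure" observed for 100 cycles.
intervals = [
    [(0, 30), (60, 70)],   # bit 0 is ACE for 40 cycles
    [(10, 20)],            # bit 1 is ACE for 10 cycles
]
print(avf(intervals, num_bits=2, total_cycles=100))   # -> 0.25
```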

2.
Soft errors are an important issue for circuit reliability. To mitigate their effects on system functionality, different techniques are used. In many cases, Error Correcting Codes (ECC) are used to protect circuits. Single Error Correction (SEC) codes are commonly used in memories and can effectively remove errors as long as there is only one error per word. Soft errors, however, may also affect the circuits that implement the Error Correcting Code: the encoder and the decoder. In this paper, the protection against soft errors in the ECC encoder is studied and an efficient fault tolerant implementation is proposed.
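For context on what such an encoder computes, the sketch below shows a plain Hamming(7,4) SEC encoder and syndrome-based corrector. It is only a baseline illustration of a SEC code; it does not include the fault-tolerant encoder hardening proposed in the paper, and the bit ordering and function names are my own.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Recompute the parity checks; the syndrome gives the 1-based error position."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3
    if pos:
        c[pos - 1] ^= 1          # flip the erroneous bit
    return c

cw = hamming74_encode([1, 0, 1, 1])
cw[4] ^= 1                       # inject a single-bit upset
assert hamming74_correct(cw) == hamming74_encode([1, 0, 1, 1])
```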

3.
As microelectronics technology continues to advance to deep submicron scales, radiation-induced Multiple Cell Upsets (MCU) in memory devices become more likely. Implementing a robust Error Correction Code (ECC) is a suitable solution; however, the more complex the ECC, the greater its delay, area usage, and energy consumption. An ECC with an appropriate balance between error coverage and computational cost is essential for applications where fault tolerance is critical and energy resources are scarce. This paper describes the conception, implementation, and evaluation of the Column-Line-Code (CLC), a novel algorithm for the detection and correction of MCU in memory devices that combines an extended Hamming code with parity bits. In addition, this paper evaluates variations of the 2D CLC scheme and proposes an additional operation, called extended mode, to correct more MCU patterns. We compared the implementation cost, reliability level, detection/correction rate, and mean time to failure among the CLC versions and other correction codes, showing that the CLCs achieve high MCU correction efficacy with reduced area, power, and delay costs.
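As a much-simplified illustration of the general row/column idea behind 2D schemes of this kind (not the CLC algorithm itself, which combines extended Hamming codes with parity bits), the sketch below protects a bit matrix with a parity row and a parity column and corrects a single upset at the intersection of the failing row and column. The block size and helper names are arbitrary.

```python
import numpy as np

def encode(block):
    """Append an even-parity column and an even-parity row to a bit matrix."""
    block = np.asarray(block, dtype=np.uint8)
    row_par = block.sum(axis=1) % 2
    col_par = block.sum(axis=0) % 2
    corner = block.sum() % 2
    top = np.column_stack([block, row_par])
    bottom = np.append(col_par, corner)
    return np.vstack([top, bottom])

def correct_single(word):
    """Flip a single-bit upset located at the intersection of the failing row and column."""
    word = word.copy()
    bad_rows = np.flatnonzero(word.sum(axis=1) % 2)
    bad_cols = np.flatnonzero(word.sum(axis=0) % 2)
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        word[bad_rows[0], bad_cols[0]] ^= 1
    return word

data = np.random.randint(0, 2, (4, 8))
cw = encode(data)
cw_err = cw.copy()
cw_err[2, 5] ^= 1                     # inject a single-bit upset
assert (correct_single(cw_err) == cw).all()
```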

4.
Error correction codes (ECCs) are commonly used to deal with soft errors in memory applications. Typically, Single Error Correction-Double Error Detection (SEC-DED) codes are widely used due to their simplicity. However, more than one error per memory word has become increasingly common in advanced technologies. Single Error Correction-Double Adjacent Error Correction (SEC-DAEC) codes are a good choice to protect memories against double adjacent errors, which are a major multiple-error pattern. An important consideration is that the ECC encoder and decoder circuits can also be affected by soft errors, which will corrupt the memory data. In this paper, a method to design fault tolerant encoders for SEC-DAEC codes is proposed. It exploits the fact that soft errors in the encoder have an effect similar to soft errors in a memory word, and it is realized by using logic sharing blocks for every two adjacent parity bits. In the proposed scheme, one soft error in the encoder can cause at most two errors on adjacent parity bits, so the correctness of the memory data is ensured because those errors are correctable by the SEC-DAEC code. The proposed scheme has been implemented and the results show that it requires less circuit area and power than encoders protected by existing methods.

5.
Due to shrinking feature sizes and higher transistor counts in a single chip in modern fabrication technologies, power consumption and soft error reliability have become two critical challenges that chip designers face in new silicon integrated circuits. Recent studies have shown that these two issues have compromising effects on each other. Moreover, power consumption and reliability vary significantly across workloads and even among phases of a single application, which can be exploited to design adaptive runtime fault tolerant and low power systems. These opportunities have been exploited in prior studies to design online reconfigurable fault tolerant systems with power management schemes. However, those attempts are driven by complicated simulations and hardly give designers a sense of direction. To achieve maximum efficiency in terms of power, performance, and reliability when dynamically scaling voltage and frequency, it is critical to have a simple and accurate reliability model that estimates the fault rate as a function of a circuit's supply voltage and operating frequency. In this paper, we propose an accurate analytic formula for the soft error rate of a system which can be used to precisely track the reliability of the system under dynamic voltage and frequency adjustments. The experimental results show that our proposed model offers precise estimates of reliability in accordance with the results of an accurate soft error rate (SER) estimation algorithm for the ISCAS85 benchmark circuits.
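The abstract does not give the formula itself, so the sketch below is only a generic illustration of how an analytic SER model can couple supply voltage and frequency: critical charge is taken as roughly proportional to the supply voltage, the raw upset rate decays exponentially with critical charge (a common empirical form), and the latching probability is assumed to scale with clock frequency. All parameter names and values are hypothetical; this is not the paper's model.

```python
import math

def ser(v_dd, freq, flux=1.0, area=1.0, c_node=1.0, q_s=0.5, f_ref=1.0):
    """Illustrative soft error rate model under DVFS (all parameters hypothetical)."""
    q_crit = c_node * v_dd                       # critical charge ~ supply voltage
    raw_rate = flux * area * math.exp(-q_crit / q_s)   # exponential sensitivity to q_crit
    latching = freq / f_ref                      # assumed latching-window scaling
    return raw_rate * latching

# Lowering the voltage and raising the frequency both increase the modeled SER.
print(ser(1.0, 1.0), ser(0.8, 1.2))
```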

6.
Soft error modeling and remediation techniques in ASIC designs
Soft errors due to cosmic radiation are the main reliability threat during the lifetime operation of digital systems. Fast and accurate estimation of the soft error rate (SER) is essential for obtaining the reliability parameters of a digital system in order to balance its reliability, performance, and cost. Previous techniques for SER estimation are mainly based on fault injection and random simulations. In this paper, we present an analytical SER modeling technique for ASIC designs that can significantly reduce SER estimation time while achieving very high accuracy. This technique can be used for both combinational and sequential circuits. We also present an approach to obtain uncertainty bounds on the estimated error propagation probability (EPP) values used in our SER modeling framework. Comparison of this method with the Monte-Carlo fault injection and simulation approach confirms the accuracy and speed-up of the presented technique for both the computed EPP values and the uncertainty bounds. Based on our SER estimation framework, we also present efficient soft error hardening techniques based on selective gate resizing that maximize soft error suppression for the entire logic-level design while minimizing area and delay penalties. Experimental results confirm that these techniques are able to significantly reduce the soft error rate with modest area and delay overhead.
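To make the EPP notion concrete, the toy sketch below computes the logical-masking-based error propagation probability for a single fault site in a tiny circuit, y = (a AND b) OR c, and checks the closed form against exhaustive enumeration. The circuit and input probabilities are made up, and this does not reproduce the paper's analytical framework or its uncertainty bounds.

```python
from itertools import product

# A transient on input `a` reaches y only if b = 1 (passes the AND) and c = 0
# (passes the OR), so with independent inputs EPP(a -> y) = P(b=1) * P(c=0).
def epp_analytic(p_b1, p_c1):
    return p_b1 * (1.0 - p_c1)

def epp_exhaustive(p_b1, p_c1):
    """Check the closed form by enumerating all values of b and c."""
    total = 0.0
    for b, c in product((0, 1), repeat=2):
        w = (p_b1 if b else 1 - p_b1) * (p_c1 if c else 1 - p_c1)
        y_good = (1 & b) | c      # fault-free output (value of a is irrelevant here)
        y_bad = (0 & b) | c       # output with a flipped by the transient
        total += w * (y_good != y_bad)
    return total

print(epp_analytic(0.5, 0.5), epp_exhaustive(0.5, 0.5))   # both 0.25
```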

7.
Selecting the ideal trade-off between reliability and the cost associated with a fault tolerant architecture generally involves an extensive design space exploration. Employing state-of-the-art reliability estimation methods makes this exploration unscalable with design complexity. In this paper we introduce a low-cost reliability analysis methodology that helps take this key decision with less computational effort and orders of magnitude faster. Based on this methodology we also propose a selective hardening technique using a hybrid fault tolerant architecture that allows meeting the soft-error rate constraints within a given design cost budget and vice versa. Our experimental validation shows that the methodology offers a large gain (1200×) in computational effort compared with a fault injection-based reliability estimation method and produces results within acceptable error limits.

8.
Phase change RAM (PRAM) is a promising memory technology because of its fast read access time, very low standby power, and high storage density. Multi-Level Cell (MLC) PRAM, which has been introduced to further improve the storage density, comes at the price of lower reliability. This paper focuses on a cost-effective solution for improving the reliability of MLC PRAM. As a first step, we study in detail the causes of hard and soft errors and develop error models to capture these effects. Next we propose a multi-tiered approach that spans the architecture, circuit, and system levels to increase reliability. At the architecture level, we use a combination of Gray code encoding and 2-bit interleaving to partition the errors so that a lower-strength error control coding (ECC) scheme can be used for half of the bits. We use subblock flipping and threshold resistance tuning to reduce the number of errors in the remaining bits. For even higher reliability, we use a simple BCH-based ECC on top of these techniques. We show that the proposed multi-tiered approach enables us to use an ECC with 2-error correction capability (t = 2) instead of one with t = 8 to achieve a block failure rate (BFR) of 10^-8. We propose to use a non-iterative algorithm to implement the BCH t = 2 decoder because of its small latency. We evaluate the latency and energy overhead of the proposed scheme using CACTI and the IPC performance using GEM5. We show that for the SPEC CINT 2006 and DaCapo benchmarks, the proposed system can achieve a BFR of 10^-8 with a 2.2 % IPC reduction and 7 % additional energy compared to a memory without any error correction capability.
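As a rough illustration of the Gray-coding and bit-interleaving step mentioned above (and only of that step; the subblock flipping, resistance tuning, and BCH layers are not shown), the sketch below Gray-encodes 2-bit MLC levels, so that a ±1 level drift flips a single stored bit, and then splits the MSBs and LSBs into separate streams that could each receive a different ECC strength. The function names and data layout are assumptions, not the paper's exact scheme.

```python
def to_gray(n):
    """Binary-reflected Gray code: adjacent MLC levels differ in exactly one bit."""
    return n ^ (n >> 1)

def interleave_2bit(levels):
    """Split the two Gray-coded bits of each cell into two separate logical streams."""
    msbs = [(to_gray(v) >> 1) & 1 for v in levels]
    lsbs = [to_gray(v) & 1 for v in levels]
    return msbs, lsbs

# Four 2-bit cells storing levels 0..3.
print(interleave_2bit([0, 1, 2, 3]))   # -> ([0, 0, 1, 1], [0, 1, 1, 0])
```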

9.
10.
With fabrication technology reaching nano levels, systems have become more susceptible to soft errors. Thus, the development of effective techniques for designing soft error tolerant systems is of high importance. In this work, an integrated soft error tolerance technique based on logical implications and transistor sizing is proposed. In order to reduce implication learning time, a set of source and target nodes with predefined thresholds is selected and implications between these nodes are extracted. Then, the impact of adding a functionally redundant wire (FRW) due to each implication is evaluated. This is done by identifying an implication path and the gates along that path whose detection probabilities will be reduced by adding the implication FRW. The gain of an implication is then estimated in terms of the reduction in the fault detection probabilities of the gates along the implication path, and the implication with the highest gain is selected. The process is repeated until the gain falls below a predetermined threshold. The proposed implication-based fault tolerance technique enhances circuit reliability with minimal area overhead by enhancing logical masking. However, its effectiveness depends on the existence of such relations in a circuit, so it can enhance circuit reliability only up to a certain level. To enhance circuit reliability to any required level, a selective-transistor redundancy (STR) based technique is then applied. This technique provides fault tolerance for individual transistors with high detection probability through transistor duplication and sizing. Experimental results show that the proposed integrated fault tolerance technique achieves reliability similar to applying STR alone, with lower area overhead.

11.
Nano-scale digital integrated circuits are becoming increasingly vulnerable to soft errors due to aggressive technology scaling. At the same time, the impact of process variations on circuit characteristics in the nano era makes statistical approaches unavoidable in the soft error rate estimation procedure. In this paper, we present a novel statistical soft error rate estimation framework. The vulnerability of circuits to soft errors is analyzed using a newly defined concept called the Statistical Vulnerability Window (SVW). An SVW captures the necessary conditions for a Single Event Transient (SET) to cause an observable error in the given circuit. The SER is calculated using a probabilistic formulation based on the parameters of the SVWs. Experimental results show that the proposed method provides a considerable speedup (about 5 orders of magnitude) with less than 5 % accuracy loss compared to Monte-Carlo SPICE simulations. In addition, the proposed framework remains efficient when considering a full spectrum of collected charges (more than a 36X speedup compared to the most recently published similar work).

12.
Radiation-induced soft errors are the major reliability threat for digital VLSI systems. In particular, field-programmable gate-array (FPGA)-based designs are more susceptible to soft errors than application-specific integrated circuit implementations, since soft errors in the configuration bits of FPGAs result in permanent errors in the mapped design. In this paper, we present an analytical approach to estimate the soft error rate of designs mapped into FPGAs. Experimental results show that this technique is orders of magnitude faster than the fault injection method while being more than 96% accurate. We also present a highly reliable and low-cost soft error mitigation technique which can significantly improve the availability of FPGA-mapped designs. Experimental results show that, using this technique, the availability of an FPGA-mapped design can be increased to more than 99.99%.

13.
Best Effort (BE) and Guaranteed Throughput (GT) services are the two broad categories of communication services provided in NoCs. Only a few existing NoC architectures provide both of these services. GT services, which are based on circuit switching or connection-oriented mechanisms of packet switching, are usually preferred for real-time traffic, while BE architectures provide packet-switched services. In this paper, biologically inspired fault tolerant techniques are implemented on these two different services. Biologically inspired techniques offer novel ways of making NoCs fault tolerant; faults in NoCs arise partly from advanced nanoscale manufacturing processes and the complex communication requirements of the processing elements (PEs). The proposed NoC fault-tolerant methods (synaptogenesis and sprouting) are adapted from the biological brain's robust fault tolerant mechanisms. These techniques are implemented on both BE and GT services. In the experimental results, the BE architecture utilized bandwidth more efficiently than the GT services, while the throughput utilization of the GT services was better. The accepted traffic (flit/cycle/node) of the BE architecture is 6.31% better than that of the GT architecture, and the accepted traffic of the bio-inspired techniques is 72.12% better than that of traditional fault tolerant techniques.

14.
A reconfiguration-based fault tolerance mechanism for Networks-on-Chip
To guarantee the reliability of Networks-on-Chip (NoC), this paper proposes a new fault tolerance mechanism. In an NoC, a faulty router prevents the IP core attached to it from communicating with the other cores, reducing the reliability of the network. The proposed method restores the communication of such an IP core by selecting the best neighboring router to stand in for the faulty one. Each router is equipped with a status register that stores the safety levels of its neighboring routers, and a new reconfigurable routing algorithm uses this information to route around faulty routers and thereby improve NoC reliability. Simulations of a 5×5 2D-mesh NoC were performed on the OPNET platform and data transmission latency was measured. The results show that the proposed routing algorithm has a clear latency advantage over the routing algorithms of the compared literature.
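The sketch below only illustrates the neighbor-selection step described above: given the safety levels recorded in a router's status register, pick the healthiest non-faulty neighbor to take over for a failed router. The data layout, safety scale, and function name are hypothetical; the actual reconfigurable routing algorithm is not reproduced here.

```python
def pick_replacement(neighbors):
    """Pick the best neighboring router to stand in for a faulty one.

    `neighbors` maps a neighbor id to the safety level recorded for it in the
    local status register; faulty neighbors are marked with safety None.
    """
    alive = {r: s for r, s in neighbors.items() if s is not None}
    if not alive:
        return None                      # no usable neighbor, the core stays isolated
    return max(alive, key=alive.get)     # highest recorded safety level wins

# Example status register contents of one router in a 2D mesh.
status = {"north": 3, "south": None, "east": 5, "west": 2}
print(pick_replacement(status))          # -> "east"
```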

15.
In deep sub-micron ICs, growing amounts of on-die memory and scaling effects make embedded memories more vulnerable to reliability problems, such as soft errors induced by radiation. Error Correction Code (ECC) combined with scrubbing is an efficient method for protecting memories against these errors. However, the latency of the coding circuits brings speed penalties in high performance applications. This paper proposes a "bit bypassing" ECC-protected memory that buffers the encoded data and adds an identifying address for the input data. The proposed memory design has been fabricated in a 130 nm CMOS process. According to the measurements, the proposed scheme has a minimum delay overhead of only 22.6% compared with other corresponding memories. Furthermore, heavy ion testing demonstrated that the proposed memory reduces the single event effect error rate by 42.9 to 63.3 times.

16.
Space-time adaptive processing (STAP) for airborne early warning radar has been a very active area of research since the late 1980s. An airborne rectangular planar array antenna is usually configured into subarrays, and partially adaptive processing is then applied to the outputs of these subarrays. In practice, three kinds of errors are often encountered: array gain and phase errors in each element, channel gain and phase errors, and clutter covariance matrix estimation errors due to insufficient secondary data samples. These errors not only degrade clutter suppression performance but also distort the adapted array patterns (high sidelobes and distorted mainbeams), which may raise the false-alarm probability and make adaptive monopulse tracking and sidelobe blanketing more difficult. In this paper, the ways in which the above three kinds of errors distort the array pattern are discussed, and a novel quadratic soft constraint factored approach is proposed to precisely control the peak sidelobe level of the adapted patterns. The soft constraint factor can be determined explicitly from the desired peak sidelobe level and the known or tolerated error standard deviations. Numerical results obtained using high-fidelity simulated airborne radar clutter data illustrate the performance of the proposed approach. Although the method is presented for STAP, it can be applied directly to conventional adaptive beamforming for rectangular planar arrays used to suppress jammers.

17.
Memory Reliability Improvement Based on Maximized Error-Correcting Codes
Error-correcting codes (ECC) offer an efficient way to improve the reliability and yield of memory subsystems. ECC-based protection is usually provided on a memory word basis, such that the number of data bits in a codeword corresponds to the amount of information that can be transferred during a single memory access operation. Consequently, the codeword length is not the maximum allowed by a given number of check bits, since the number of data bits is constrained by the width of the memory data interface. This work investigates the additional error correction opportunities offered by the absence of a perfect match between the numbers of data bits and check bits in some widespread ECCs. A method is proposed for selecting multi-bit errors that can additionally be corrected with minimal impact on ECC decoder latency. The method was applied to single-bit error correction (SEC) codes and double-bit error correction (DEC) codes. Reliability improvements are evaluated for memories in which all errors affecting the same number of bits in a codeword are independent and identically distributed. It is shown that applying the proposed method to conventional DEC codes can improve the mean time to failure (MTTF) of memories by up to 30 %. Maximized versions of the DEC codes are also proposed in which all adjacent triple-bit errors become correctable without affecting the maximum number of triple-bit errors that can be made correctable.
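As a small illustration of where such "extra" correction opportunities come from (not the paper's selection method), the sketch below works at the syndrome level of a hypothetical (12, 8) code shortened from Hamming(15, 11): with 4 check bits, only 12 of the 15 non-zero syndromes are claimed by single-bit errors, so the remaining 3 can be assigned to selected double-bit error patterns. The code length, shortening choice, and greedy assignment are assumptions made for illustration.

```python
from itertools import combinations

N = 12                                            # codeword length, shortened from 15
single = {p: (p,) for p in range(1, N + 1)}       # syndrome -> single-error position
extra = {}
for i, j in combinations(range(1, N + 1), 2):
    s = i ^ j                                     # syndrome of a double error at i, j
    if s not in single and s not in extra:
        extra[s] = (i, j)                         # first pair claims the free syndrome

def correctable(syndrome):
    """Return the error pattern corrected for this syndrome, or None."""
    return single.get(syndrome) or extra.get(syndrome)

print(extra)   # three double-bit patterns become correctable "for free"
```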

18.
In this paper, we propose an efficient and promising soft error tolerance approach for arithmetic circuits with high performance and low area overhead. The technique is applied to designing soft error tolerant adders and is based on a fault tolerant C-element, with a given adder output connected to one input of the C-element and a delayed version of that output connected to the second input. It exploits the variability of the delay of the adder output bits, in which the most significant bits (MSBs) have longer delay than the least significant bits (LSBs), by adding larger delay to the LSBs and smaller delay to the MSBs to guarantee full fault tolerance against the largest transient (soft error) pulse width for the available technology with minimum impact on performance. To guarantee protection for transistors feeding outputs with smaller added delay, the technique uses transistor scaling to ensure that the injected fault pulse width is less than the added delay at the second input of the C-element. Simulation results reveal that the proposed technique outperforms other techniques in terms of failure rate, area overhead, and delay overhead. The evaluation experiments were based on transistor-level simulations using HSPICE to account for temporal masking combined with electrical masking. In comparison to TMR, the technique achieves 100% reliability with a 31% reduction in area overhead without impacting performance in the case of a 32-bit adder, and a 42% reduction in area overhead and a 5% reduction in performance overhead in the case of a 64-bit adder. While the proposed technique achieves area reductions of 4.95% and 9.23% in comparison to CE-based DMR and Feedback-based DMR techniques for a 32-bit adder, it achieves area reductions of 19.58% and 23.24% for a 64-bit adder.
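The sketch below models the standard behavior of a Muller C-element, the building block the approach above relies on: the output follows the inputs only when they agree and otherwise holds its previous value, so a transient that reaches only one of the two inputs is filtered out. This is a behavioral toy model, not the paper's transistor-level fault tolerant C-element.

```python
class CElement:
    """Muller C-element: output updates only when both inputs agree, else it holds."""
    def __init__(self, init=0):
        self.q = init

    def update(self, a, b):
        if a == b:
            self.q = a
        return self.q

# The adder output feeds one input, a delayed copy feeds the other; a glitch
# that reaches only one of them does not change the filtered output.
c = CElement()
print(c.update(1, 1))   # both agree       -> 1
print(c.update(0, 1))   # glitch on one    -> still 1
print(c.update(0, 0))   # both settle      -> 0
```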

19.
In VLSIs, soft errors resulting from radiation-induced transient pulses occur frequently. In recent high-density and low-power VLSIs, system operation is seriously affected not only by soft errors occurring in memory systems and the latches of logic circuits but also by those occurring in the combinational parts of logic circuits. Existing methods for tolerating soft errors in the combinational parts do not provide sufficiently high tolerance at a small performance penalty. This paper proposes a class of soft error masking circuits using a Schmitt trigger circuit and a pass transistor. The paper also presents a construction of soft error masking latches (SEM-latches) capable of masking transient pulses occurring in combinational circuits. Moreover, simulation results show that the proposed method has higher soft error tolerance than existing methods. For a supply voltage VDD = 3.3 V, the proposed method is capable of masking transient pulses of magnitude 4.0 V or less.

20.
Glitches caused by secondary neutron particles from cosmic rays produce soft errors in integrated circuits (ICs) and are becoming a major threat in modern sub-45nm ICs. Researchers have therefore developed many techniques to mitigate soft errors, some of which exploit the built-in error detection schemes of low-power asynchronous NULL Convention Logic (NCL). However, careful and complete analysis of such designs requires extensive simulation and emulation, which is costly, time consuming, and cannot cover all possible input conditions. In this paper, we propose a framework to improve soft error tolerant asynchronous pipelines by identifying and formally analyzing the vulnerable paths using the nuXmv model checker. The proposed framework translates the design behavior and specification into a state-space model and expresses the potential vulnerabilities to soft errors in the pipeline as linear temporal logic (LTL) properties. These formally specified properties are then verified on the state-space model, and in case of failure counterexamples are obtained. These counterexamples can be further analyzed to obtain the soft error propagation paths, giving designers insight into soft error tolerant approaches. For illustration, this work provides an analysis and comparison of three state-of-the-art asynchronous pipelines. Formal modeling and analysis of all the pipelines show that the soft-error-hardened pipeline is superior against soft errors, but at the expense of almost twice the area overhead.
