Found 20 similar documents (search time: 421 ms)
1.
Software-implemented EDAC protection against SEUs (total citations: 1; self-citations: 0; citations by others: 1)
In many computer systems, the contents of memory are protected by an error detection and correction (EDAC) code. Bit-flips caused by single event upsets (SEUs) are a well-known problem in memory chips; EDAC codes have been an effective solution to this problem. These codes are usually implemented in hardware using extra memory bits and encoding/decoding circuitry. In systems where EDAC hardware is not available, the reliability of the system can be improved by providing protection through software. Codes and techniques that can be used for software implementation of EDAC are discussed and compared. The implementation requirements and issues are discussed, and some solutions are presented. The paper discusses in detail how system-level and chip-level structures relate to multiple error correction. A simple solution is presented to make the EDAC scheme independent of these structures. The technique in this paper was implemented and used effectively in an actual space experiment. We have observed that SEUs corrupt the operating system or programs of a computer system that does not have any EDAC for memory, forcing the system to be reset frequently. Protecting the entire memory (code and data) might not be practical in software. However, this paper demonstrates that software-implemented EDAC is a low-cost solution that provides protection for code segments and can appreciably enhance the system availability in a low-radiation space environment.
2.
3.
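Error detection and correction (EDAC) based on information redundancy is a common system-level fault-tolerance technique against single event upsets (SEUs). Software-implemented EDAC is an alternative to hardware EDAC: error-correcting codes (ECC) are added in software to existing memory segments to detect and correct memory errors. This paper analyzes how the correction capability and coding efficiency of the error-correcting code, the refresh interval, and the amount of code to be protected affect reliability in a software EDAC scheme. Analysis and simulation results show that, for random memory errors caused by single particles, raising the correction capability and coding efficiency of an individual codeword or increasing the refresh interval has little effect on reliability, whereas improving the refresh interval by reducing the amount of code each task executes, and compressing the total amount of code to be protected, improve reliability considerably. The conclusions can guide engineering practice in trading off implementation resources, real-time performance, and reliability.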
4.
A method for the design of built-in testable (BIT) error detection and correction (EDAC) circuits is presented that uses up to 65% less test hardware than customary BIT implementations. A 1-μm CMOS, 16-bit EDAC designed and fabricated with this technique exhibits >99% fault coverage in 10 μs at 25 MHz. Built-in test impacts the speed performance by only one gate delay regardless of the size of the EDAC. Various faults are injected into the chip to verify the effectiveness of built-in test.
5.
6.
An optimization method for error detection and correction (EDAC) circuit design is proposed. The method involves selecting or constructing EDAC codes with low hardware cost, combined with an operation-scheduling implementation based on a 2-input XOR gate structure and two actions for reducing hardware cells, which together effectively reduce the delay penalty and area cost of the EDAC circuit. A 32-bit EDAC circuit was implemented as a prototype in a 180 nm process, and its delay penalty and area cost were evaluated. Results show that, for EDAC codes with the same correction and detection capability, the time penalty and area cost of the EDAC circuitry depend on the parity-check matrix and on the hardware implementation. This method can be used as a guide for low-cost radiation-hardened microprocessor EDAC circuit design and for more advanced technologies.
7.
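In the space radiation environment, memory cells are increasingly sensitive to single event upsets. After comparing hardening techniques against single event upset effects in SRAM, this paper adds a small number of hardware modules to conventional EDAC and makes effective use of the port resources of dual-port SRAM to propose a new timed refresh mechanism with a controllable period, which performs periodic error detection and correction on the stored data. Analysis and simulation of the hardened SRAM cell show that, while normal access to the stored data is preserved, the timed refresh mechanism greatly reduces the error accumulation caused by single event upsets and effectively lowers the probability of soft errors in the SRAM.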
8.
M. Arévalo-Garbayo M. Portela-García M. García-Valderas C. López-Ongil L. Entrena 《Microelectronics Journal》2014
This paper proposes the use of an FPGA-based fault injection technique, AMUSE, to study the effect of malicious attacks on cryptographic circuits. Originally, AMUSE was devised to analyze soft error effects (SEU and SET) in digital circuits. However, many of the fault-based attacks used in cryptanalysis produce faults that can be modeled as bit-flips in memory elements or transient pulses in combinational logic, as in faults due to radiation effects. Experimental results provide information that allows the cryptographic circuit designer to detect the weakest areas in order to implement countermeasures at the design stage.
9.
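Shortened Hamming codes and their improved variants are widely used in the error detection and correction circuits of aerospace-grade high-reliability memories. Although the shortened Hamming code is a mature single-error-correcting code, its failure under multiple bit flips within a single byte has received little study. This paper analyzes the cases in which multiple bit flips within a single byte cause the shortened Hamming code to fail, examines the possible erroneous output patterns, and derives theoretical formulas for their probabilities. Computer simulations in Matlab show that the theoretical results largely agree with the experimental ones. Finally, the paper analyzes a scheme used by ISSI in its radiation-hardened SRAM design, in which a longer information word is split into two equal halves, each encoded and decoded separately with a shortened Hamming code. The analysis shows that this scheme reduces the probability of outputting 3-bit flips in the failure state.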
10.
11.
The authors propose a test algorithm for pattern-sensitive faults in large-size RAM with high circuit density. The algorithm tests an n-bit RAM in 195√n time to detect both static and dynamic pattern-sensitive faults over the 9-neighbourhood of every memory cell. A 4 Mb RAM can be tested by the proposed algorithm several thousand times faster than the conventional sequential algorithms for detecting pattern-sensitive faults. The test speedup has been achieved by writing test data simultaneously over many cells, and the stored data are tested simultaneously by a parallel comparator and error detector in a read operation. The existing RAM architecture has been modified very little so that the proposed technique can be implemented very easily even in switched-capacitor DRAM (dynamic random-access memory) with low intercell pitch width. The test procedure has also been applied to built-in self-testing (BIST) and is compared with other BIST implementations.
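As a rough worked example of the 195√n figure above, assuming that "4 Mb" means 2^22 bits and that the count is expressed in the paper's own basic test operations (the abstract does not state the unit):

```latex
\begin{aligned}
n &= 2^{22} = 4{,}194{,}304 \ \text{bits},\\
\sqrt{n} &= 2^{11} = 2048,\\
195\sqrt{n} &= 195 \times 2048 = 399{,}360 .
\end{aligned}
```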
12.
A fault-tolerant memory design uses modular bit swapping to achieve high system availability with minimum redundancy despite high memory-device failure rates. The design permits automatic repair of multiple faults without loss of error detection, thereby allowing deferral of manual repair. Although the design was directed toward use in a duplex system, the technique potentially applies to simplex systems. Double-bit swapping and 4096-word modules were chosen for this system. With a 64k memory, 18 memory device faults would occur in its 40-year life. The number of instances of manual repair will average 3; the number of faults in the system when manual repair is required will average 6. Similarly, with a 1024k memory, 288 memory faults would occur in its 40-year life. The number of instances of manual repair will average 20 and the number of faults in the system when manual repair is required will average 14.
13.
P. Reyes P. Reviriego J. A. Maestro O. Ruano 《Journal of Signal Processing Systems》2008,52(3):231-247
The study of multiple soft errors in memory modules caused by radiation effects is an active field of current research. The fault tolerance of these devices in radiation environments is traditionally analyzed and increased by means of soft error protection mechanisms such as EDAC codes or physical interleaving. As Communication System interleavers are mainly implemented using memories, a protection against soft errors similar to the one used for memory devices could be applied, as a conventional solution, when they are used in critical missions. In this paper, knowledge of the system is used to apply the communication interleaving pattern as physical interleaving, employing the inherent redundancy (added by previous modules of the Communication System) of the data processed by the interleaver as an error correction mechanism. A protection similar to that of the conventional solutions is therefore obtained, but at a reduced cost.
14.
15.
Ionizing radiation and electromagnetic interference (EMI) can cause single event upset (SEU) in memory elements. This threat is one of the major concerns when considering the design of electronic systems for critical applications. Single Error Correction - Double Error Detection (SEC-DED) codes can be used to avoid data corruption caused by soft errors, protecting the memory against single errors. However, the presence of multiple bit upsets is becoming more frequent as technology scales down. Hereafter, we present an Error Detection and Correction (EDAC) approach, namely Parity per Byte and Duplication (PBD), to protect data stored in memory. The technique was described in VHDL, coupled with the LEON3 softcore processor, and mapped into a commercial FPGA. The obtained results have shown that the proposed approach is very effective to detect and correct multiple bit-flips in memory arrays.
16.
The radiation sensitivity of integrated memory cells increases dramatically as the supply voltage decreases. Although there are some Error Correcting Code (ECC) studies to prevent faults on memories used in space applications, there is no consensus on choosing the best ECC product-type with two-dimensional Hamming to mitigate data faults in memory. This work introduces the Product Code for Space Applications (PCoSA), an ECC product based on Hamming and parity in both rows and columns for use in memory with space-application reliability requirements. The potentialities of PCoSA were evaluated by injecting (i) thirty-six error patterns already available in the literature and (ii) all possible combinations of up to seven bitflips. PCoSA has corrected all cases of the thirty-six error patterns, and it has a correction rate of 100% for any three bitflips, 82.67% for four bitflips, and 69.7% for five bitflips.
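The row-and-column idea behind product codes can be illustrated with a minimal sketch. This is not PCoSA itself: the abstract describes Hamming plus parity in both dimensions, whereas the sketch below uses plain parity only, which locates and corrects a single flipped data bit. The function names `add_parity` and `correct_single_flip` are hypothetical.

```python
def add_parity(block):
    """Return (row_parity, col_parity) for a list of equal-length bit rows."""
    row_parity = [sum(row) % 2 for row in block]
    col_parity = [sum(col) % 2 for col in zip(*block)]
    return row_parity, col_parity


def correct_single_flip(block, row_parity, col_parity):
    """Fix one flipped data bit in place; return its (row, col) or None.

    Assumption: the stored parity bits themselves are error-free.
    """
    new_rows, new_cols = add_parity(block)
    bad_rows = [i for i, (a, b) in enumerate(zip(new_rows, row_parity)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(new_cols, col_parity)) if a != b]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        r, c = bad_rows[0], bad_cols[0]
        block[r][c] ^= 1  # the failing row and column intersect at the bad bit
        return (r, c)
    return None  # no error, or a pattern this simple scheme cannot locate
```

With parity alone, two flips in the same row cancel in the row check and cannot be located, which is one reason a stronger row or column code such as Hamming is attractive in a product construction.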
17.
A processor operating in the space radiation environment tends to suffer single event effects on circuits and system failures due to cosmic rays and high-energy particle radiation. Therefore, the reliability of the processor has become an increasingly serious issue. The BCH-based error correction code can correct multi-bit errors, but it introduces a large latency overhead. This paper proposes a hybrid error correction approach that combines BCH and EDAC to correct both multi-bit and single-bit errors for caches at low cost. The proposed technique can correct errors of up to four bits, and can correct a single-bit error in one cycle. Evaluation results show that the proposed hybrid error-correction scheme can improve the performance of cache accesses by up to 20% compared to the pure BCH scheme.
18.
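For the space radiation environment, a radiation-hardened embedded system based on an FPGA platform was designed. The memory cells are hardened with triple modular redundancy and (12,8) Hamming-code EDAC encoding. Modules such as the MC8051 IP core, the I2C IP core, the voter, and the EDAC encoder/decoder are designed for partial dynamic reconfiguration. The ICAP interface is used for readback comparison and dynamic reconfiguration. After the system is configured, readback comparison is performed periodically. When a single event upset is detected in the FPGA, partial reconfiguration removes its effect and restores normal system operation.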
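A (12,8) Hamming code of the kind named above can be sketched as follows. The abstract does not give the bit layout used in that design, so this sketch assumes the textbook arrangement with parity bits at 1-indexed positions 1, 2, 4 and 8; the function names are hypothetical.

```python
# Textbook Hamming(12,8) layout (an assumption, not taken from the paper):
# parity bits sit at 1-indexed positions 1, 2, 4, 8; data bits fill the rest.
PARITY_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)


def hamming12_8_encode(byte):
    """Encode an 8-bit value into a 12-bit codeword (list index 0 is unused)."""
    code = [0] * 13
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (byte >> i) & 1
    for p in PARITY_POSITIONS:
        # Parity bit p covers every position whose index has bit p set.
        code[p] = sum(code[k] for k in range(1, 13) if k & p and k != p) % 2
    return code


def hamming12_8_decode(code):
    """Return (byte, syndrome); a syndrome of 0 means no error was detected."""
    code = list(code)
    syndrome = 0
    for p in PARITY_POSITIONS:
        if sum(code[k] for k in range(1, 13) if k & p) % 2:
            syndrome |= p
    if 1 <= syndrome <= 12:
        code[syndrome] ^= 1  # the syndrome is the position of a single flipped bit
    byte = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return byte, syndrome


# Usage: flip one bit of a stored codeword and recover the original byte.
codeword = hamming12_8_encode(0xA7)
codeword[6] ^= 1
print(hamming12_8_decode(codeword))  # (167, 6)
```

This variant corrects any single-bit error but cannot reliably tell a double-bit error from a single one; an extra overall parity bit would be needed for SEC-DED behaviour.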
19.
Mostafa Kishani Hamid R. Zarandi Hossein Pedram Alireza Tajary Mohsen Raji Behnam Ghavami 《Design Automation for Embedded Systems》2011,15(3-4):289-310
This paper presents a high-level error detection and correction method called HVD code to tolerate multiple bit upsets (MBUs) occurring in memory cells. The proposed method uses parity codes in four directions in a data part to assure the reliability of memories. The proposed method is very powerful in error detection, while its error correction coverage is also acceptable considering its low computing latency. HVD code is useful for applications in which high error detection coverage is very important, such as memory systems. This code can also be used in combination with other protection codes that have high correction coverage and low detection coverage. The proposed method is evaluated using more than one billion multiple fault injection experiments. Multiple bit flips were randomly injected in different segments of a memory system, and the fault detection and correction coverages were calculated. Results show that 100% of the injected faults can be detected. We prove that this method can correct up to three bit upsets. Some hardware implementation issues are investigated to show tradeoffs between different implementation parameters of the HVD method.
20.
Qingyu Chen Li Chen Haibin Wang Longsheng Wu Yuanqing Li Xing Zhao Mo Chen 《Journal of Electronic Testing》2016,32(6):695-703
Bit faults induced by single-event upsets in instructions may not cause a system to experience an error. In this paper, the instruction vulnerability factor (IVF) is first defined to quantify the effect of non-effective upsets on program reliability; the mean time to failure (MTTF) model of program memory is then derived based on IVF. Further analysis of the MTTF model concludes that the MTTF of program memory using error correcting code (ECC) and scrubbing is not always better than that of unhardened program memory. The constraints that should be met when utilizing ECC and scrubbing in program memory are presented for the first time, to the best of the authors’ knowledge. Additionally, the proposed models and conclusions are validated by Monte Carlo simulations in MATLAB. These results show that the proposed models have good accuracy, with a margin of error of less than 3% compared with the MATLAB simulation results. These conclusions may help in selecting the optimal fault-tolerant technique to harden the program memory.