首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到16条相似文献,搜索用时 156 毫秒
1.
面向软错误的寄存器活跃区间分析   总被引:1,自引:0,他引:1  
继性能和功耗问题之后,软错误导致的计算可信性已成为一个日益严峻的课题.由于寄存器访问频繁却未能被良好保护,发生在其中的软错误成为影响系统可靠性的关键因素之一.基于程序汇编代码,提出一种针对寄存器软错误的程序可靠性静态分析方法.首先通过数据流分析技术提取出可能影响程序执行的寄存器活跃区间,然后基于构成活跃区间的基本块集合计算其有效体系结构易感位数,在此基础上可定量计算寄存器软错误影响下的程序可靠性.基于MiBench基准程序的实验表明,该方法的分析结果与AVF分析法保持一致,同时还指出了寄存器相关活跃区间的关键程度,这为实现针对寄存器软错误的高效容错方法提供了依据.  相似文献   

2.
针对寄存器交换方法在降低寄存器软错误率过程中,未考虑寄存器分配过程对软错误所带来影响的问题,提出一种基于活跃变量对于软错误影响的静态寄存器重分配方法。首先,引入活跃变量权值来评估其对寄存器软错误的影响;然后,提出两条规则,在进行寄存器交换后对活跃变量进行寄存器的重新分配。该方法在更小粒度的活跃变量层次,进一步降低了寄存器软错误率。实验和分析表明,相对于寄存器交换方法,该策略能进一步降低30%的寄存器软错误率,增强了寄存器的可靠性。  相似文献   

3.
基于软件实现的软错误容错方法不需要硬件开销,被认为是一种高效的软错误容错方法,而动态的实现这种方法能覆盖更多种类型的程序,因而能覆盖更多的软错误,分析硬件软错误对程序执行时代码和数据的逻辑影响,并建立了硬件软错误条件下程序运行可靠性评估模型.本文的工作为基于软件动态软错误容错算法的提出提供了理论基础,也为程序可靠性的评估提供了一种方法.我们依据体系结构层硬件对指令执行的影响将硬件构件进行分类,并分析了不同的硬件构件对程序代码和数据的逻辑影响.基于软错误对程序代码和数据的影响模型,建立了软错误条件下程序运行可靠性评估模型.最后,在实验中,对软错误条件下程序影响模型和程序运行可靠性评估模型进行了验证,实验结果证明了本文的分析和评估结果.  相似文献   

4.
故障注入是研究软错误故障传播的传统手段,但随着程序复杂性不断增加,采用故障注入对大量软错误的故障传播进行研究将花费巨大的时间成本。提出一种基于程序动态指令进行分析和建模从而快速获取软错误结果的方法。将程序转化为动态指令序列,通过体系结构正确执行分析将所有可能的软错误划分为对程序运行结果有影响和没有影响两部分;基于动态依赖图建立软错误故障传播分析模型,并建立判断程序崩溃的标准,进而提出一个算法对任意制定的能够影响程序运行结果的软错误进行故障传播分析并重点预测程序崩溃的发生。实验显示,预测的漏报率和分析单个软错误的平均用时明显低于现有方法。  相似文献   

5.
随着集成电路的发展,逻辑电路对放射性粒子引起的软错误越来越敏感.现有的电路加固技术通常会带来较大的面积开销.综合考虑电路的软错误率和面积开销,提出一种新的电路加固评估指标FAP,并提出基于贪婪算法的寄存器替换技术,通过将电路的部分敏感寄存器替换为冗余寄存器来免疫电路中的软错误.针对贪婪算法有时不能达到可靠性和开销整体最优的局限,进一步提出可靠性-开销最优的启发式替换算法.实验结果表明,基于贪婪算法的寄存器替换技术只需50%的面积开销就可降低90%的电路软错误率;而可靠性-开销最优的启发式替换算法只需45%左右的面积开销,电路软错误率就降低达90%以上.与其他已有技术相比,电路软错误免疫技术在可靠性和面积开销间达到了更好的折中.  相似文献   

6.
在实时系统的应用中常常需要对系统的执行时间,尤其是最坏执行时间进行分析。而程序中的循环结构的迭代次数对程序执行时间的分析结果具有重要的影响。程序的循环边界分析目的在于给出较为接近程序真实运行情况下的循环结构迭代的上界和下界。提出了一种基于抽象解释理论的程序循环边界计算方法,该方法对原有的循环边界分析方法进行了改进。首先在程序切片阶段对原程序建立程序依赖图,并提出了对程序依赖图的约简方法。由约简后的依赖关系可以对变量的取值进行约束,得到更小的取值范围,因此基于该方法的循环边界分析结果更加接近程序的实际执行边界,对获取精确的程序执行时间具有重要意义。  相似文献   

7.
基于程序谱的错误定位技术由于其较高的定位效率已成为当前软件调试领域研究热点之一.这种技术通常根据测试覆盖信息计算程序语句发生错误的可疑度来进行错误定位.然而,这种技术会随着程序中错误数目的增多效率不断下降.鉴于此,提出了一种基于条件执行切片谱的多错误定位技术(conditioned execution slicing spectrum-based multiple fault localization,CESS-MFL),以提高多错误定位的效率.CESS-MFL技术首先根据输入变量的谓词条件构建错误相关条件执行切片的谱矩阵,然后依次计算错误相关条件执行切片中的元素(语句或语句块)的可疑度,并生成可疑度报告.实验验证了CESS-MFL技术比当前流行的基于程序谱的Tarantula技术、基于程序切片的Intersection技术、Union技术有更高的多错误定位效率,并且可在有效的时间和空间复杂度内完成.  相似文献   

8.
马骏驰  汪芸 《软件学报》2016,27(2):219-230
软错误是高辐照空间环境下影响计算可靠性的主要因素,结果错误(silent data corruption,简称SDC)是软错误造成的一种特殊的故障类型.针对SDC难以检测的问题,提出了一种基于不变量的检测方法.不变量是运行时刻保持不变的程序特征.在软错误发生后,由于程序受到影响,不变量一般不再满足.根据该原理,在源代码中插入以不变量为内容的断言,利用发生软错误后断言报错来检测软错误.首先,根据错误传播分析确定了检测位置,提取了检测位置的不变量;定义了表征不变量检测能力的渗透率,在同一检测位置依据渗透率将不变量转化为断言.通过错误注入实验,验证了该检测方法的有效性.实验结果表明:该检测方法具备较高的检出率和较低的检测代价,为星载系统的软错误防护提供了新的解决思路.  相似文献   

9.
为了达到更准确、更高效的程序时间复杂度,解决复杂度分析中的循环下的复杂情况,如多个跳出点、嵌套循环和非数值域循环等,提出了基于执行序列计算复杂度的方法。提取出程序方法的各条可能的执行序列及其各条执行序列的相关约束条件和执行效应,在此基础上分析出序列间的关系从而计算出最终的时间复杂度。基于这种方法开发出的工具,通过几个大型的实际程序,发现这种方法可以有效地计算出其中大于90%的方法的运行复杂度。  相似文献   

10.
王曙燕 《计算机应用研究》2021,38(5):1487-1490,1497
针对基于程序谱错误定位方法完全依赖于测试用例的语句覆盖信息导致错误定位效率低下的问题,提出了一种基于变异测试技术的程序谱错误定位方法。在原有语句怀疑度计算方法的基础上,增加了程序变异后执行结果与原程序执行结果不同的测试用例变化情况的分析。此外,为解决程序变异后产生的变异体数量巨大而导致执行代价过大的问题,提出了根据变异位置约简变异体的策略。实验结果表明,与几种基于程序谱的程序错误定位方法相比,该方法的错误定位代价最低,能有效提高错误定位的效率。  相似文献   

11.
Just-in-Time (JIT) compilation is frequently employed in order to speed-up the execution of platform-independent and dynamically extensible mobile code applications. Since the time required for dynamic compilation directly influences a program's execution time, JIT compilers usually utilize only simple and fast techniques for program analysis and optimization. To improve further the analysis and optimization process of such compilers program annotations can be used.However, mostly all current annotation approaches suffer from the fact that the verification of transmitted program information is time consuming and therefore will not be carried out on the consumer side of a mobile code system. In this paper, we present a verifiable annotation technique that is based on a well known iterative data flow algorithm and which can be used for the transmission of all program information that can be derived through data flow analysis. Preliminary measurements of compilation and verification time indicate that the presented technique seems to be implementable and therefore could be used as an all-purpose transportation technique for safe program annotations.  相似文献   

12.
A control flow checking scheme capable of detecting control flow errors of programs resulting from software coding errors, hardware malfunctions, or memory mutilation during the execution of the program is presented. In this approach, the program is partitioned into loop-free intervals and a database containing the path information in each of the loop-free intervals is derived from the detailed design. The path in each loop-free interval actually traversed at run time is recorded and then checked against the information provided in the database, and any discrepancy indicates an error. This approach is general, and can detect all uncompensated illegal branches. Any uncompensated error that occurs during the execution of a loop-free interval and manifests itself as a wrong branch within the loop-free interval or right after the completion of execution of the loop-free interval is also detectable. The approach can also be used to check the control flow in the testing phase of program development. The capabilities, limitations, implementation, and the overhead of using this approach are discussed.  相似文献   

13.
杨学军  高珑 《软件学报》2007,18(4):808-820
无论是可靠性工程还是软件可靠性中的可靠性模型,都难以描述硬件故障在程序中的传播问题.首先建立了计算数据流模型,并以无穷存储机器的指令集为例,说明可以为任意程序建立计算数据流图.在计算数据流模型的基础上,进一步建立了错误流模型.把计算过程中的错误分成物理错误和传播错误两种,通过分析这两种错误的本质和传播规律,给出了6条有关错误传播的规则和2条独立定律.根据这些规则和定律,能够计算出在程序运行过程中,任意时刻在任意位置上出现错误的概率.最后以一个简单的无穷存储机器程序为例,简要地展示了错误流模型描述硬件故障在  相似文献   

14.
Cache structures in a multicore system are highly vulnerable to soft errors. Enabling fault tolerance capabilities on all cache structures in a system is inefficient in terms of performance and power consumption. In this study, we propose an enhanced protection mechanism for code segments, which are critical in terms of reliability, by utilizing asymmetrically reliable cores under performance and power constraints. Our proposed system contains at least one high-reliability core, which has an ECC-protected L1 cache, and several low-reliability cores, which have no protection mechanisms. Reliability-based critical code regions are assumed to be high-priority functions, which are extracted by examining the execution time percentages and the program’s call graph in our framework, statically. Software threads that invoke one of the high-priority functions are bound to the high-reliability cores dynamically during the execution, while the threads that execute the remaining functions are bound to the low-reliability cores. As part of the experimental analysis, our proposed framework is compared with traditional fully protected and unprotected configurations with respect to performance, power and reliability metrics for various applications of the benchmarks. Our framework exploits the benefits of providing the reliability-based critical regions of the applications exclusively by offering notable power and cost savings with close performance and reliability values for the set of functions reported in the experimental results.  相似文献   

15.
随着工艺的进步,微处理器将面临越来越严重的软错误威胁.文中提出了两种片上多核处理器容软错误执行模型:双核冗余执行模型DCR和三核冗余执行模型TCR.DCR在两个冗余的内核上以一定的时间间距运行两份相同的线程,store指令只有在进行了结果比较以后才能提交.每个内核增加了硬件实现的现场保存与恢复机制,以实现对软错误的恢复.文中选择的现场保存点有利于隐藏现场保存带来的时间开销,并且采用了特殊的机制保证恢复执行和原始执行过程中load数据的一致性.TCR执行模型通过在3个不同的内核上运行相同的线程实现对软错误的屏蔽.在检测到软错误以后,TCR可以进行动态重构,屏蔽被软错误破坏的内核.实验结果表明,与传统的软错误恢复执行模型CRTR相比,DCR和TCR对核间通信带宽的需求分别降低了57.5%和54.2%.在检测到软错误的情况下,DCR的恢复执行带来5.2%的性能开销,而TCR的重构带来的性能开销为1.3%.错误注入实验表明,DCR能够恢复99.69%的软错误,而TCR实现了对SEU(Single Event Upset)型故障的全面屏蔽.  相似文献   

16.
In a typical distributed computing system (DCS), nodes consist of processing elements, memory units, shared resources, data files, and programs. For a distributed application, programs and data files are distributed among many processing elements that may exchange data and control information via communication link. The reliability of DCS can be expressed by the analysis of distributed program reliability (DPR) and distributed system reliability (DSR). In this paper, two reliability measures are introduced which are Markov-chain distributed program reliability (MDPR) and Markov-chain distributed system reliability (MDSR) to accurately model the reliability of DCS. A discrete time Markov chain with one absorbing state is constructed for this problem. The transition probability matrix is employed to represent the transition probability from one state to another state in a unit of time. In addition to mathematical method to evaluate the MDPR and MDSR, a simulation result is also presented to prove its correction.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号