Application-Level Fault Tolerance as a Complement to System-Level Fault Tolerance |
| |
Authors: | Haines, Joshua; Lakamraju, Vijay; Koren, Israel; Krishna, C. Mani |
| |
Affiliation: | (1)-(4) Electrical and Computer Engineering Dept., University of Massachusetts, Amherst, MA 01003 |
| |
Abstract: | As multiprocessor systems become more complex, their reliability will need to increase as well. In this paper we propose a novel technique that is applicable to a wide variety of distributed real-time systems, especially those exhibiting data parallelism. System-level fault tolerance involves reliability techniques incorporated within the system hardware and software, whereas application-level fault tolerance involves reliability techniques incorporated within the application software. We assert that, for high reliability, a combination of system-level fault tolerance and application-level fault tolerance works best. In many systems, application-level fault tolerance can be used to bridge the gap when system-level fault tolerance alone does not provide the required reliability. We exemplify this with the RTHT target tracking benchmark and the ABF beamforming benchmark. |
| |
Keywords: | distributed real-time systems; fault tolerance; checkpointing; imprecise computation; target tracking; beam forming |
This article is indexed in SpringerLink and other databases. |
|