Self-repairing processor modules期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

Self-repairing processor modules

Authors:	Kilmer WL

Affiliation:	Massachusetts Univ., Amherst, MA;

Abstract:	A processor is any self-contained computer of at least personal-computer capability. The paper explores how much the processor mean time-to-failure can be improved by replacing it with an N-processor module, where each processor in the module consists of a copy of the original processor augmented with a communication protocol unit. The copy of the original processor is faulty with probability, p_c, and the protocol unit is faulty with probability, p. The asynchronous N-processor module uses a Byzantine agreement (F-ID-P) algorithm to identify which of its processors disagreed with a module consensus. The identified processors are presumed faulty, and the module replaces them with duplicates from a set of standbys. The F-ID-P algorithm is a modification of Bracha's, which guarantees that in a module of 3t+1 processors, up to t faults can be identified by at least t+1 non-faulty processors. The module fails if faults in more than t of its processors prevent it from: 1) obtaining a correct consensus, or 2) executing the algorithm. The F-ID-P algorithm departs from Bracha's by using a random instead of an adversary scheduler of message delays. Simulation showed that almost always F-ID-P algorithm correctly identified all of a module's faulty processors if more than half of them were nonfaulty. Thus F-ID-P algorithm was about 3/2 more fault tolerant than guaranteed. Also, compared to a single processor's mean number of decisions to failure, the F-ID-P module was 841 times better when N=37, down to 5.1 times better when N=10

Keywords:

设为首页 | 免责声明 | 关于勤云 | 加入收藏