排序方式: 共有28条查询结果,搜索用时 15 毫秒
11.
12.
13.
GPGPUs are increasingly being used to as performance accelerators for HPC (High Performance Computing) applications in CPU/GPU heterogeneous computing systems, including TianHe-1A, the world’s fastest supercomputer in the TOP500 list, built at NUDT (National University of Defense Technology) last year. However, despite their performance advantages, GPGPUs do not provide built-in fault-tolerant mechanisms to offer reliability guarantees required by many HPC applications. By analyzing the SIMT (single-instruction, multiple-thread) characteristics of programs running on GPGPUs, we have developed PartialRC, a new checkpoint-based compiler-directed partial recomputing method, for achieving efficient fault recovery by leveraging the phenomenal computing power of GPGPUs. In this paper, we introduce our PartialRC method that recovers from errors detected in a code region by partially re-computing the region, describe a checkpoint-based faulttolerance framework developed on PartialRC, and discuss an implementation on the CUDA platform. Validation using a range of representative CUDA programs on NVIDIA GPGPUs against FullRC (a traditional full-recomputing Checkpoint-Rollback-Restart fault recovery method for CPUs) shows that PartialRC reduces significantly the fault recovery overheads incurred by FullRC, by 73.5% when errors occur earlier during execution and 74.6% when errors occur later on average. In addition, PartialRC also reduces error detection overheads incurred by FullRC during fault recovery while incurring negligible performance overheads when no fault happens. 相似文献
14.
近年来,为了缓解日益严重的功耗问题,异构并行体系结构已成为超级计算机发展的一个重要趋势.图形处理器(graphics processing unit,简称GPU)凭借其超高的计算性能和性能功耗比,作为一种高效的加速部件已被广泛应用于高性能计算领域.但是,GPU先天的可靠性缺陷势必加剧超级计算机的可靠性问题.目前,国际上关于CPU-GPU异构系统容错技术的研究工作主要将GPU从异构系统中独立出来,以每次调用为粒度对其进行容错处理.设计了一种面向CPU-GPU异构系统的Lazy容错方法,给出了基于编译指导命令的容错框架及其约束,并讨论了相关的编译实现和优化方法,最后通过实验验证了该方法的正确性.实验结果表明,与现有的容错方法相比,利用所设计的LazyFT容错方法对GPGPU(general purpose computation on graphics hardware)程序进行容错处理,可以明显降低容错代价. 相似文献
15.
在我国华南地区模具加工业发达,中高档高速高精度的铣床、加工中心应用广泛。然而在实际生产车间中,这些机床的数控系统大多数是西门子、FANUC、三菱等国外公司的产品,几乎看不到国内生产的数控系统。 相似文献
16.
17.
半导体工艺的发展使得芯片上集成的晶体管数目不断增加,图形处理器的存储和计算能力也越来越强大。目前,GPU的峰值运算能力已经远远超出主流的CPU,它在非图形计算领域,特别是高性能计算领域的潜力已经引起越来越多研究者的关注。本文介绍了GPU用于通用计算的原理以及目前学术界和产业界关于GPGPU体系结构和编程模型方面的最新研究成果。 相似文献
18.
19.
Jacobi和Laplace算法在GPU平台上的设计与实现 总被引:1,自引:1,他引:0
随着半导体工艺的发展,GPU的浮点计算能力迅速提高。如何把GPU处理技术应用到非图形计算领域成为体系结构以及高性能计算领域的热点研究问题。Jacobi和Laplace是科学计算领域常用的计算核心。本文基于AMD的流处理GPU平台设计并实现了这两个算法,相对于CPU平台取得了很好的加速效果。 相似文献
20.