1.

Parallel application performance on shared high performance reconfigurable computing resources

Melissa C. Gregory D. 《Performance Evaluation》2005,60(1-4):107-125

The use of a network of shared, heterogeneous workstations each harboring a reconfigurable computing (RC) system offers high performance users an inexpensive platform for a wide range of computationally demanding problems. However, effectively using the full potential of these systems can be challenging without the knowledge of the system's performance characteristics. While some performance models exist for shared, heterogeneous workstations, none thus far account for the addition of RC systems. Our analytic performance model includes the effects of the reconfigurable device, application load imbalance, background user load, basic message passing communication, and processor heterogeneity. The methodology proves to be accurate in characterizing these effects for applications running on shared, homogeneous, and heterogeneous HPRC resources. The model error in all cases was found to be less than 5% for application runtimes greater than 30 s, and less than 15% for runtimes less than 30 s.

2.

Towards highly available and scalable high performance clusters

Azzedine Boukerche Raed A. Al-Shaikh Mirela Sechi Moretti Annoni Notare 《Journal of Computer and System Sciences》2007,73(8):1240-1251

In recent years, we have witnessed a growing interest in high performance computing (HPC) using a cluster of workstations. This growth has made it affordable for individuals to have exclusive access to their own supercomputers. However, one of the challenges in a clustered environment is to keep system failures to a minimum and to achieve the highest possible level of system availability. High-Availability (HA) computing attempts to avoid the problems of unexpected failures through active redundancy and preemptive measures. Since the prices of hardware components are dropping significantly, we propose to combine HPC and HA concepts and lay out the design of an HA-HPC cluster, considering all possible measures. In particular, we explore the hardware and management layers of the HA-HPC cluster design, and present a more focused study of the parallel-applications layer (i.e. FT-MPI implementations). Our findings show that combining HPC and HA architectures is feasible, yielding an HA cluster that can be used for high performance computing.

3.

Research on cluster-based high performance computing systems

陈红梅张纪英《计算机时代》2015,(7)

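This paper examines the system structure and main advantages of clusters and the emergence of cluster-based high performance computing systems. It analyzes the architecture and construction of such systems; building a cluster involves network deployment, the storage system, compute nodes, management nodes, and login nodes. On this basis, a Linux-based cluster high performance computing system is built.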

4.

Astrocomp: web technologies for high performance computing on a network of supercomputers

A. Costa U. Becciani V. Antonuccio P. Di Matteo 《Computer Physics Communications》2005,166(1):17-25

Astrocomp is a project developed by the INAF-Astrophysical Observatory of Catania, University of Roma La Sapienza and Enea in collaboration with Oneiros s.r.l. The project has the goal of building a user-friendly web-based interface that allows the international community to run some parallel codes on a set of high-performance computing (HPC) resources, with no need for specific knowledge of Unix and operating system commands. Astrocomp makes CPU time on parallel systems available to authorized users. The portal makes the following astronomy codes available: FLY, a cosmological code for studying three-dimensional collisionless self-gravitating systems with periodic boundary conditions [Becciani, Antonuccio, Comput. Phys. Comm. 136 (2001) 54]; ATD treecode, a parallel tree-code for simulating the dynamics of self-gravitating systems [Miocchi, Capuzzo Dolcetta, A&A 382 (2002) 758]; and MARA, a code for stellar light curve analysis [Rodonò et al., A&A 371 (2001) 174]. Other codes will be added to the portal in the future.

5.

A survey of research on the energy efficiency of job scheduling for high performance computing

郑文旭潘晓东马迪汪浩《计算机工程与科学》2019,41(9):1526-1533

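As demand for high performance computing from scientific research and commercial applications grows, the performance and scale of HPC systems have developed rapidly. However, sharply rising power consumption severely constrains the design and use of HPC systems, making low-power techniques a key technology in the field. As a core component of the whole system, the job scheduling system allocates resources to user-submitted applications from a limited pool of system resources, and its energy efficiency plays a crucial role in controlling and regulating the energy consumption of the whole HPC system. This survey first introduces the main energy-efficiency techniques and commonly used job scheduling strategies, then analyzes the energy efficiency of current HPC job scheduling, and discusses the challenges it faces and future directions.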

6.

The review of state-of-the-art processor architectures for high performance computing

WANG Yao-hua GUO Yang 《计算机工程与科学》2021,42(10):1742

7.

Performance optimization of parallel programs for data-intensive applications on multi-core clusters

黄华林钟诚《计算机工程与应用》2012,48(30):73-77

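Using a dynamic scheduling strategy for data task blocks and a full-locking technique on heterogeneous multi-core cluster systems, this paper presents a multi-process-level and multi-thread-level parallel programming mechanism for data-intensive applications that dynamically schedules data blocks according to the size of each node's main memory and available shared L2 cache, together with strategies and techniques for optimizing the performance of data-intensive parallel programs. Experimental results for parallel multi-keyword search over random sequences on a heterogeneous cluster of multi-core computers show that the proposed multi-core parallel programming mechanism and performance optimization methods are feasible and efficient.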

8.

KernelHive: a new workflow‐based framework for multilevel high performance computing using clusters and workstations with CPUs and GPUs

Paweł Rościszewski Paweł Czarnul Rafał Lewandowski Marcel Schally‐Kacprzak 《Concurrency and Computation》2016,28(9):2586-2607

The paper presents a new open‐source framework called KernelHive for multilevel parallelization of computations among various clusters, cluster nodes, and finally, among both CPUs and GPUs for a particular application. An application is modeled as an acyclic directed graph with a possibility to run nodes in parallel and automatic expansion of nodes (called node unrolling) depending on the number of computation units available. A methodology is proposed for parallelization and mapping of an application to the environment that includes selection of devices using a chosen optimizer, selection of best grid configurations for compute devices, optimization of data partitioning and the execution. One of possibly many scheduling algorithms can be selected considering execution time, power consumption, and so on. An easy‐to‐use GUI is provided for modeling and monitoring with a repository of ready‐to‐use constructs and computational kernels. The methodology, execution times, and scalability have been demonstrated for a distributed and parallel password‐breaking example run in a heterogeneous environment with a cluster and servers with different numbers of nodes and both CPUs and GPUs. Additionally, performance of the framework has been compared with an MPI + OpenCL implementation using a parallel geospatial interpolation application employing up to 40 cluster nodes and 320 cores. Copyright © 2015 John Wiley & Sons, Ltd.
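
As an illustration of the workflow idea in this abstract (an application as a directed acyclic graph whose independent nodes may run in parallel), the following minimal Python sketch groups DAG nodes into stages using Kahn's algorithm. It is not KernelHive's API or scheduler; the function name `parallel_stages` and the node names `split`, `hash_a`, `hash_b`, and `merge` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def parallel_stages(nodes, edges):
    """Group DAG nodes into stages; nodes within one stage do not depend on each other.

    nodes: iterable of hashable node names (hypothetical workflow steps).
    edges: iterable of (src, dst) pairs meaning dst depends on src.
    """
    nodes = list(nodes)
    indegree = {n: 0 for n in nodes}
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Nodes with no unmet dependencies can all start in the first stage.
    ready = sorted(n for n in nodes if indegree[n] == 0)
    stages = []
    scheduled = 0
    while ready:
        stages.append(ready)
        scheduled += len(ready)
        next_ready = []
        for n in ready:
            for m in successors[n]:
                indegree[m] -= 1
                if indegree[m] == 0:  # all predecessors of m are scheduled
                    next_ready.append(m)
        ready = sorted(next_ready)

    if scheduled != len(nodes):
        raise ValueError("graph contains a cycle; it is not a DAG")
    return stages


# Hypothetical four-node workflow: one split, two independent workers, one merge.
nodes = ["split", "hash_a", "hash_b", "merge"]
edges = [("split", "hash_a"), ("split", "hash_b"),
         ("hash_a", "merge"), ("hash_b", "merge")]
stages = parallel_stages(nodes, edges)
print(stages)
```

Each inner list is a set of nodes that could be dispatched to separate compute units at the same time. A real framework would additionally weigh device selection, data partitioning, and cost criteria such as execution time or power, which this sketch ignores.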

9.

Research on a hybrid parallel computing method for the WKBZ normal mode model

范培勤刘晓妍过武宏崔宝龙《计算机工程与科学》2020,42(3):404-410

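Underwater acoustic propagation models are computationally expensive, which makes it hard to meet the demand for real-time, fine-grained underwater sound propagation information. To address this, a hybrid parallel computing method for the WKBZ normal mode model is developed using MPI+OpenMP hybrid parallel programming, achieving two-level hybrid parallel computation of the underwater sound field. By combining message passing between nodes with shared memory within a node, the method overcomes the high communication overhead of the MPI programming model and the poor scalability of the OpenMP programming environment, and largely solves the problem of fast computation of underwater sound propagation. Test results show that the method makes good use of the multi-level inter-node and intra-node parallelism of SMP clusters, exploits the respective strengths of the message-passing and shared-memory programming models, substantially reduces the time overhead of MPI inter-process communication, and effectively improves the program's scalability and parallel efficiency.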

10.

Parallel computing for finite element structural analysis on a cluster of workstations

付朝江《计算机工程与应用》2008,44(23):236-238

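Parallel computing is becoming a new trend in scientific and engineering computation. A parallel finite element method based on domain decomposition is applied to the distributed parallel environment of a workstation cluster. A parallel conjugate gradient algorithm based on element-wise domain decomposition is proposed. A dam structure is solved on the workstation cluster, and the parallel performance is analyzed.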

11.

Architecture and performance analysis of PC clusters (total citations: 1, self-citations: 2, citations by others: 1)

吴琢琼彭勤科许宏斌胡保生《计算机工程与设计》2002,23(7):63-67

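This paper discusses methods such as channel bonding and node optimization in PC clusters and their effect on cluster architecture and performance. Several BSPLib-based performance testing algorithms for PC clusters are designed and implemented. Performance evaluation of three PC clusters built by the authors verifies the effectiveness of the proposed methods and algorithms, which can help in designing low-cost PC clusters.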

12.

SmartGridRPC: The new RPC model for high performance Grid computing

Thomas Brady Jack Dongarra Michele Guidolin Alexey Lastovetsky Keith Seymour 《Concurrency and Computation》2010,22(18):2467-2487

The paper presents the SmartGridRPC model, an extension of the GridRPC model, which aims to achieve higher performance. The traditional GridRPC provides a programming model and API for mapping individual tasks of an application in a distributed Grid environment, which is based on the client‐server model characterized by the star network topology. SmartGridRPC provides a programming model and API for mapping a group of tasks of an application in a distributed Grid environment, which is based on the fully connected network topology. The SmartGridRPC programming model and API and its performance advantages over the GridRPC model are outlined in this paper. In addition, experimental results using a real‐world application are also presented. Copyright © 2010 John Wiley & Sons, Ltd.

13.

A parallel framework for SWAT parameter sensitivity analysis based on high performance computing

李强陆忠华王彦棡陈曦罗毅《计算机应用研究》2015,32(1)

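As demand for large-scale hydrological simulation keeps growing, meeting its computational requirements has become a focus of hydrological research. The SWAT (Soil and Water Assessment Tool) model adapts well to large-scale hydrological simulation and is accurate, but its sensitivity analysis module is so computationally expensive that a run often takes months. To speed up SWAT sensitivity analysis, an efficient MPI-based master-slave parallel computing framework tailored to the characteristics of the module is proposed. Building on this framework, the forward modeling process is parallelized and communication-subspace operations are introduced into the master-slave framework, combining the parallelized forward modeling with the outer master-slave framework to obtain a hybrid parallel framework for sensitivity analysis. This greatly speeds up sensitivity analysis of parameter sets and raises the number of processors used by the SWAT sensitivity analysis module from the original single-core serial execution to the order of a hundred cores. Finally, a simulation of the watershed on the northern slope of the Tianshan Mountains verifies the feasibility of the framework.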

14.

Research on parameter optimization of a chipset for high performance computing

方志斌胡鹏苗艳超安学军《计算机工程与设计》2008,29(7):1591-1595

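This paper introduces a chipset for high performance computing. Based on its design and implementation, the environment parameters of the channels and the crossbar switch are abstracted; the test model parameters are analyzed in light of the communication characteristics of HPC, and the parameters relevant to performance evaluation are given. A hardware FPGA test platform and a software simulation environment are built to test and analyze how each environment parameter of the chipset affects communication latency and bandwidth. The study concludes that a chipset for HPC should maximize the transfer granularity of each transaction, and determines its channel parameters.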

15.

Performance evaluation of open-source cloud platforms for high performance computing

李春艳张学杰《计算机应用》2013,33(12):3580-3585

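Cloud computing is a new model of using Internet resources to deliver a variety of IT services, and it is widely applied in many fields, including high performance computing. However, virtualization introduces performance overhead, and because cloud platforms implement virtualization differently, the performance of HPC services varies widely across them. Using the HPC Challenge (HPCC) Benchmark and the NAS Parallel Benchmark (NPB), CPU, memory, network, scalability, and real HPC workloads are evaluated, and the HPC performance of Nimbus, OpenNebula, and OpenStack is compared and analyzed. The experiments show that OpenStack performs better on compute-intensive HPC workloads, so OpenStack is a good choice of open-source cloud platform for HPC.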

16.

Cold start optimization of function computing for HPC

李哲谭郁松李宝余杰《计算机工程与科学》2020,42(11):1973-1980

High performance computing problems usually consist of subtasks that can be parallelized, and they consume a lot of computing resources during execution. Traditional cloud computing based on virtual machines has been shown to handle such problems, but managing the distributed environment and designing distributed solutions make the processing more complex. Function computing is a new serverless cloud computing paradigm whose automatic scaling and considerable computing resources combine well with HPC problems. However, cold start delay is an unavoidable problem on public cloud function computing platforms, especially for HPC tasks with highly concurrent jobs, where the delay is further magnified. In this paper, we first analyze the completion time of a simple HPC task under cold start and hot start conditions and the causes of the additional delay. Based on these analyses, we combine time series analysis tools with the platform's automatic scaling mechanism to propose a preheating method that effectively reduces the cold start delay of HPC tasks on the function computing platform.

17.

Parallel computing of the POP ocean model on a quad-core Xeon cluster (total citations: 1, self-citations: 0, citations by others: 1)

张理论赵军吴建平宋君强《计算机工程与应用》2009,45(5):189-192

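The principles and discretization methods of the POP ocean model are analyzed. On a quad-core Xeon cluster, the local computation block technique and balanced parallel data partitioning in POP, and their effect on model performance, are studied. To address the model's communication bottleneck, a communication aggregation optimization is applied. The results show that the local block technique and the data partitioning scheme significantly affect POP's parallel performance, and that communication aggregation markedly improves POP's performance on the quad-core cluster.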

18.

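Parallel solution of finite element linear systems on a cluster of workstations (total citations: 2, self-citations: 0, citations by others: 2)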

付朝江《计算机工程与设计》2008,29(24)

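With the development of high-speed computer networking, clusters of workstations are becoming a main platform for parallel computing. Finite element linear systems are among the most common problems in civil engineering structural analysis. The preconditioned conjugate gradient method (PCGM) is an iterative method for solving linear systems. The PCGM is parallelized and implemented on two different clusters, the storage scheme is analyzed in detail, and an optimized sparse matrix-vector multiplication is used in the implementation. Numerical results show that the parallel algorithm achieves good speedup and parallel efficiency, demonstrating that parallel computing can solve large-scale problems faster.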
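
For readers unfamiliar with the method named in this abstract, here is a minimal serial Python sketch of a preconditioned conjugate gradient solver with a sparse matrix-vector product. Several choices are assumptions made for illustration and are not taken from the paper: the Jacobi (diagonal) preconditioner, the CSR storage layout, the function names, and the 3x3 test system. The paper's parallelization and its specific storage scheme are not reproduced.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-stored sparse matrix by the dense vector x."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg_jacobi(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A given in CSR form.

    Uses the Jacobi preconditioner M = diag(A) (an assumption of this sketch).
    Returns (x, iterations_used).
    """
    n = len(b)
    # Extract the inverse of the diagonal once; applying M^-1 is then elementwise.
    inv_diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                inv_diag[i] = 1.0 / values[k]

    x = [0.0] * n
    r = list(b)                      # residual b - A x with x = 0
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0                  # zero right-hand side: x = 0 is exact
    z = [inv_diag[i] * r[i] for i in range(n)]
    p = list(z)
    rz = dot(r, z)

    for it in range(1, max_iter + 1):
        Ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Illustrative system: A = [[4,-1,0],[-1,4,-1],[0,-1,4]], b = [2,4,10].
values = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
b = [2.0, 4.0, 10.0]
x, iters = pcg_jacobi(values, col_idx, row_ptr, b)
print(x, iters)
```

The exact solution of the illustrative system is x = [1, 2, 3], so the printed vector should agree with that to within the tolerance. In a distributed version, the matrix-vector product and the dot products are the steps that require communication between processes, which is why the abstract singles out the sparse matrix-vector multiplication for optimization.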

19.

Formal methods applied to high‐performance computing software design: a case study of MPI one‐sided communication‐based locking

Salman Pervez Ganesh Gopalakrishnan Robert M. Kirby Rajeev Thakur William Gropp 《Software》2010,40(1):23-43

There is a growing need to address the complexity of verifying the numerous concurrent protocols employed in high‐performance computing software. Today's approaches for verification consist of testing detailed implementations of these protocols. Unfortunately, this approach can seldom show the absence of bugs, and often results in serious bugs escaping into the deployed software. An approach called Model Checking has been demonstrated to be eminently helpful in debugging these protocols early in the software life cycle by offering the ability to represent and exhaustively analyze simplified formal protocol models. The effectiveness of model checking has yet to be adequately demonstrated in high‐performance computing. This paper presents a case study of a concurrent protocol that was thought to be sufficiently well tested, but proved to contain two very non‐obvious deadlocks. These bugs were automatically detected through model checking. The protocol models in which these bugs were detected were also easy to create. Recent work in our group demonstrates that even this tedium of model creation can be eliminated by employing dynamic source‐code‐level analysis methods. Our case study comes from the important domain of Message Passing Interface (MPI)‐based programming, which is universally employed for simulating and predicting anything from the structural integrity of combustion chambers to the path of hurricanes. We argue that model checking must be taught as well as used widely within HPC, given this and similar success stories. Copyright © 2009 John Wiley & Sons, Ltd.

20.

Design and implementation of a LAMP-based management system for HPC user organizational structure

吴君楠欧洋李琰《计算机工程与科学》2021,43(2):235-241

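To address key problems of existing management systems for HPC user organizational structure, such as poor user experience, high network overhead, and low access efficiency, an implementation method for a LAMP-based management system is proposed. The method adopts a B/S (browser/server) architecture; combining Twig with HTML lightens the load on the server and improves the user experience, and a REST framework with a cache mechanism caches massive amounts of temporary data, reducing...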