Similar Literature
20 similar documents found (search time: 109 ms)
1.
Because parallel applications often run far below peak efficiency, helping programmers improve performance has become an important problem in high-performance computing. This paper introduces an MPI-based performance evaluation tool that collects system load information while the application runs, traces the program's execution flow, groups processors according to their hardware resources, and displays the load information and program flow together in graphical form. With it, programmers can monitor how a parallel application behaves, analyze the relationship between the algorithm's execution and the system load, locate performance bottlenecks, uncover the application's untapped potential, and ultimately improve its performance.
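As a hedged illustration (not the tool described in the paper), the sketch below shows the standard PMPI name-shifting mechanism on which such MPI performance tools are commonly built: each MPI call is wrapped, timed, and forwarded to its PMPI counterpart, and the per-rank statistics are reported at finalization. The specific counters and output format are assumptions made for the example.

/* Minimal sketch of MPI call interception via the standard PMPI
 * profiling interface. This is NOT the tool from the paper; it only
 * illustrates the mechanism such tools build on. */
#include <mpi.h>
#include <stdio.h>

static double send_time  = 0.0;   /* accumulated time spent in MPI_Send */
static long   send_calls = 0;     /* number of MPI_Send calls */

int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = MPI_Wtime();
    int rc = PMPI_Send(buf, count, type, dest, tag, comm);
    send_time += MPI_Wtime() - t0;
    send_calls++;
    return rc;
}

int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* Each rank reports its own communication load; a real tool would
     * gather this and render it graphically alongside the program flow. */
    printf("rank %d: %ld MPI_Send calls, %.3f s\n",
           rank, send_calls, send_time);
    return PMPI_Finalize();
}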

2.
Ruler
Today's microprocessors and PCs run ever more programs and process ever more data, and no processor performs identically across every kind of application. Benchmark programs exist precisely to evaluate how a processor or system performs when running integer, multimedia, and floating-point workloads, so a benchmark must itself be able to exercise each aspect of processor or system performance. Several sites on the Internet have already published preliminary evaluations of the Pentium III processor, concluding that it offers little improvement over the Pentium II. Intel's view is that only when a Pentium III processor is used and the hardware driver…

3.
As problem sizes grow and real-time requirements tighten, SIMD vector processors, especially processors with vector execution units, have come into wide industrial use. A program's run-time state on such a processor is normally managed by the compiler through the stack, but existing compiler stack designs severely degrade overall application performance on SIMD architectures. Based on the characteristics of the SIMD architecture, this paper proposes an efficient distributed stack design method, HEDSSA. Experimental results show that the HEDSSA stack lets applications access stack data more efficiently during local data accesses, function calls, interrupts, and dynamic data allocation.

4.
Zhang Yue. 《个人电脑》, 2003, 9(12): 45.
TOE stands for TCP/IP Offload Engine. The technique uses an additional chip to handle TCP/IP data transfers, reducing the load on the host processor. High-performance networks let us cope with ever-growing volumes of network traffic, but on the server side the sheer amount of network data transfer consumes too many server resources and degrades the performance of network applications. With TOE, IT administrators can, at low cost, offload the network transfer burden from the server so that it can concentrate on the computation of application programs, improving metrics such as the response time to user requests for network applications.

5.
A Hardware-Counter-Based Tool for Program Performance Measurement and Analysis   (total citations: 1; self-citations: 0; citations by others: 1)
On the Intel P6 family of processors under Microsoft Windows NT, we developed a tool called PTracker that uses the processor's hardware performance counters to gather program performance data and analyzes the data in combination with the machine's architectural parameters. It requires no user programming and is independent of the programming language used by the application, making it very convenient to use. Besides obtaining precise performance figures from the counters, it can analyze the measured data to reveal higher-level performance characteristics of the program, providing useful guidance for performance evaluation and optimization. This paper describes PTracker's technical background, design, and system implementation, and presents an application example.
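PTracker itself targets the P6 counters under Windows NT; purely as a hedged illustration of the same idea on a different platform, the sketch below reads the hardware cycle and instruction counters around a region of code on Linux via the perf_event_open interface. The counter set and the dummy workload are assumptions for the example, not part of the paper.

/* Illustrative only: reading hardware performance counters on Linux. */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <stdint.h>

static int open_counter(uint64_t config)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = 1;
    attr.exclude_kernel = 1;     /* measure user-space work only */
    /* pid = 0, cpu = -1: this process, any CPU */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

int main(void)
{
    int cycles = open_counter(PERF_COUNT_HW_CPU_CYCLES);
    int instrs = open_counter(PERF_COUNT_HW_INSTRUCTIONS);

    ioctl(cycles, PERF_EVENT_IOC_ENABLE, 0);
    ioctl(instrs, PERF_EVENT_IOC_ENABLE, 0);

    volatile double x = 0.0;                 /* dummy workload */
    for (long i = 0; i < 10000000; i++) x += i * 0.5;

    ioctl(cycles, PERF_EVENT_IOC_DISABLE, 0);
    ioctl(instrs, PERF_EVENT_IOC_DISABLE, 0);

    uint64_t c = 0, n = 0;
    read(cycles, &c, sizeof(c));
    read(instrs, &n, sizeof(n));
    printf("cycles=%llu instructions=%llu IPC=%.2f\n",
           (unsigned long long)c, (unsigned long long)n,
           c ? (double)n / (double)c : 0.0);
    return 0;
}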

6.
Research on an Embedded AUV Data-Fusion Processing System Based on a High-Performance DSP   (total citations: 1; self-citations: 0; citations by others: 1)
This paper analyzes the characteristics, requirements, and current state of AUV data fusion processing, proposes and implements an embedded high-performance data fusion processor architecture based on parallel processing, and carries out a parallelized design and experiments for AUV data fusion, demonstrating the efficiency and feasibility of the embedded high-performance AUV data fusion processor.

7.
Faced with increasingly complex processor designs and limited design cycles, every processor design team needs an effective way to evaluate performance quickly. A complete performance test suite takes a long time to run; during pre-silicon verification in particular, the time cost is so high that design teams cannot afford to run the full suite for performance evaluation and analysis. This paper presents Fast-Eval, a fast performance evaluation method for general-purpose processors. Fast-Eval builds on the SimPoint technique and combines a FastParallel-BBV generation method, optimal simulation-point selection, and hot migration of simulation points to sharply reduce both BBV generation time and performance testing time. Experimental results show that, compared with the performance data obtained by running the SPEC CPU 2006 benchmarks in full at the REF data scale, the method reduces BBV generation time on an ARM64 processor to 16.88% of the original and performance evaluation time to 1.26%, with an average relative error of 0.53% in the performance estimates; on an FPGA development board, the average relative error over the test suite reaches 0.40% with a run time of only 0.93% of the full run.
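As a hedged illustration of the SimPoint-style sampling that Fast-Eval builds on, the sketch below combines per-simulation-point CPI measurements with their interval weights to estimate whole-program performance. The weights and CPI values are invented for the example and are not taken from the paper.

/* Illustrative SimPoint-style estimate: whole-program CPI approximated
 * as the weighted average of the CPI measured on a few representative
 * simulation points (weight = fraction of dynamic execution covered
 * by each point's cluster). All values are invented. */
#include <stdio.h>

int main(void)
{
    double weight[] = {0.42, 0.31, 0.19, 0.08};  /* sums to 1.0 */
    double cpi[]    = {1.10, 0.85, 1.45, 2.02};  /* measured per point */
    int n = sizeof(weight) / sizeof(weight[0]);

    double estimate = 0.0;
    for (int i = 0; i < n; i++)
        estimate += weight[i] * cpi[i];

    printf("estimated whole-program CPI = %.3f\n", estimate);
    return 0;
}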

8.
A shared-memory operating system protects its shared data with carefully designed locks: any access to such data must first acquire the corresponding lock, and when several kernel control flows (system calls, kernel threads, interrupt handlers, and so on) try to acquire the same lock at the same time, they contend; the more flows involved, the fiercer the contention. As the number of processing units in a system grows, so does the number of these flows, and lock contention then degrades overall system performance and can even become the bottleneck. On the other hand, the operating system and applications run alternately on the same processor core, and because the hardware cache has limited capacity, the operating system's code and data frequently evict the application's code and data; when the application is scheduled again, it must fetch them back from slower cache levels or even from memory, lowering performance. Measurements on a 16-core AMD node quantify and confirm both problems, and a heterogeneous operating-system model is proposed to address them, in which applications and the operating system run on different processor cores. Experiments show that this model effectively reduces both lock contention and cache pollution.

9.
As the interface between the processor and applications, system software plays a critical role in exploiting the processor's features to keep both the processor and the applications stable and to improve application performance. This paper describes how the cache aliasing problem of the Loongson 2 processor is solved in the Linux kernel, how the performance loss from TLB misses is reduced by enlarging the page size, adding a software TLB, and using FAST_TLB_REFILL, and how Uncache Accelerate speeds up media player software. Experimental results show that adding support for these techniques in the system software brings considerable benefits to both the stability and the performance of the system.

10.
With advances in semiconductor technology, many-core processors that integrate a large number of cores on a single chip are widely used in high-performance computing. Compared with multi-core processors, many-core processors offer better computational density and energy efficiency, but they also face increasingly severe reliability challenges, so efficient fault-tolerance mechanisms are needed that preserve the efficiency of running workloads without incurring large chip power and area overheads. Building on DFMC (deeply fused and heterogeneous many-core), a home-grown many-core processor prototype, this paper proposes and implements two lightweight error-recovery techniques for many-core processors, independent recovery and coordinated recovery, chosen according to whether the applications running on the cores are correlated. In coordinated recovery, a central component manages the cores over a dedicated coordinated-recovery bus and, when an error occurs, quickly rolls the cores associated with the error back to a correct state. In both techniques, the save and restore steps are carried out by customized instructions and the information needed for recovery is kept inside the compute cores, minimizing the impact on workload performance. Experiments show that these techniques add only 1.257% chip area, can handle about 80% of the transient errors in the many-core processor, and have very little impact on workload performance, chip timing, and power, effectively improving the fault tolerance of many-core processors.

11.
In recent years, we have witnessed a growing interest in high performance computing (HPC) using clusters of workstations. This growth has made it affordable for individuals to have exclusive access to their own supercomputers. However, one of the challenges in a clustered environment is to keep system failures to a minimum and to achieve the highest possible level of system availability. High-Availability (HA) computing attempts to avoid the problems of unexpected failures through active redundancy and preemptive measures. Since the prices of hardware components are dropping significantly, we propose to combine the HPC and HA concepts and lay out the design of an HA-HPC cluster, considering all possible measures. In particular, we explore the hardware and management layers of the HA-HPC cluster design, as well as a more focused study of the parallel-applications layer (i.e. FT-MPI implementations). Our findings show that combining the HPC and HA architectures is feasible, yielding an HA cluster that can be used for high performance computing.

12.
A Brief Discussion of the Role and Applications of High-Performance Computing   (total citations: 4; self-citations: 0; citations by others: 4)
High-performance computing is now widely recognized, after theoretical science and experimental science, as the third major method of scientific research by which humanity understands and transforms the world. Supported by high-performance computing technology, HPC applications have made enormous contributions to China's scientific and technological innovation, and the applications and the technology continue to develop hand in hand. This paper summarizes, from the application perspective, the role and status of high-performance computing, lists several industry-related HPC applications, and introduces the Shanghai Supercomputer Center as a public service platform.

13.
Scientific workflow orchestration interoperating HTC and HPC resources   (total citations: 1; self-citations: 0; citations by others: 1)
In this work we describe our progress towards providing a unified access method, at the interoperation level, to different types of computing infrastructure. To that end, we have developed a middleware suite that bridges the otherwise non-interoperable middleware stacks used for building distributed computing infrastructures, UNICORE and gLite. Our solution allows HPC and HTC resources to be accessed and operated on transparently from a single interface. Using Kepler as the workflow manager, we provide users with the integration of codes needed to create scientific workflows that access both types of infrastructure.

14.
To meet the increasingly complex data filtering requirements of bank card business systems, this paper proposes an efficient data filtering method. Compared with existing methods, its highly abstracted application model and efficient lookup engine give it better extensibility and performance.

15.
Parallel and distributed simulation is a powerful tool for developing complex agent-based simulations. Complex simulations require parallel and distributed high performance computing solutions, because their sequential counterparts cannot deliver answers within a feasible total execution time. It is therefore important for the advance of computing science that High Performance Computing (HPC) techniques and solutions be proposed and studied. The literature offers several agent-based modeling and simulation tools that use HPC, but none of them is designed so that an HPC expert can propose new techniques and solutions without great effort. In this paper, we introduce Care High Performance Simulation (HPS), a scientific instrument that enables researchers to: (1) develop techniques and solutions for high performance distributed simulation of agent-based models; and (2) study, design and implement complex agent-based models that require HPC solutions. Care HPS was designed so that new agent-based models can be developed easily and quickly, and so that new solutions to the main issues of parallel and distributed simulation, such as synchronization, communication, load and computation balancing, and partitioning algorithms, can be implemented and extended. We conducted experiments to show the completeness and functionality of Care HPS. The results show that Care HPS can serve as a scientific instrument for advancing the field of parallel and distributed agent-based simulation.

16.
DVFS is a ubiquitous technique for CPU power management in modern computing systems. Reducing the processor frequency/voltage decreases CPU power consumption but increases execution time. In this paper, we analyze which application and platform characteristics are necessary for a successful energy-performance trade-off in large scale parallel applications. We present a model that gives an upper bound on the performance loss due to frequency scaling, based on the application's parallel efficiency. The model was validated against performance measurements of large scale parallel applications. We then track how application sensitivity to frequency scaling has evolved over the last decade across different cluster generations. Finally, we study how cluster power consumption characteristics, together with application sensitivity to frequency scaling, determine the energy effectiveness of the DVFS technique.
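As a hedged illustration (a generic first-order model, not necessarily the paper's exact formulation): if a fraction of the run time scales with the core clock while the rest (communication, waiting, memory stalls) does not, and that frequency-sensitive fraction is approximated by a factor e related to the application's parallel efficiency, then scaling the frequency from f_max down to f slows the run down by at most

% Generic first-order DVFS slowdown bound (illustrative only).
% T(f): execution time at frequency f; e: frequency-sensitive fraction, 0 <= e <= 1.
\[
  \frac{T(f)}{T(f_{\max})} \;\le\; 1 + e\left(\frac{f_{\max}}{f} - 1\right).
\]
% Example: e = 0.6 and f = 0.8 f_max give at most a 1 + 0.6(1.25 - 1) = 1.15x slowdown (15%).

The intuition is that applications with low parallel efficiency already spend much of their time waiting, so lowering the frequency costs them little performance while still reducing power.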

17.
Lagrangian dispersion models have been shown to be effective and reliable tools for simulating airborne pollutant dispersion. The main drawback to their use as regulatory models, however, is their high computational cost. This paper therefore develops a parallel version of a Lagrangian particle model, LAMBDA, using the MPI message passing library. Performance tests were run on a distributed memory parallel machine, a multicomputer based on the IA-32 architecture. Portions of the pollutant emitted from the source are simulated as fictitious particles whose trajectories evolve under stochastic forcing. This yields independent evolution equations for each particle of the model, which can therefore be computed by different processors in a parallel implementation. Speed-up results show that the parallel implementation is well suited to the architecture used.
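Because each particle's stochastic trajectory is independent, the parallelization is essentially a static partition of particles across MPI ranks. The sketch below shows that pattern with a toy random-walk update; the dynamics, particle counts, and constants are placeholders for illustration, not LAMBDA's actual equations.

/* Toy sketch of the embarrassingly parallel particle loop: each MPI rank
 * advances its own share of particles independently. Placeholder dynamics. */
#include <mpi.h>
#include <stdlib.h>
#include <stdio.h>

#define NPART 1000000   /* total particles across all ranks (assumed) */
#define NSTEP 1000      /* number of time steps (assumed) */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int local = NPART / size;            /* particles owned by this rank */
    double *x = malloc(local * sizeof(double));
    srand(1234 + rank);                  /* independent stream per rank */
    for (int i = 0; i < local; i++) x[i] = 0.0;

    for (int t = 0; t < NSTEP; t++)
        for (int i = 0; i < local; i++) {
            /* placeholder stochastic forcing: mean drift plus noise */
            double u = (double)rand() / RAND_MAX - 0.5;
            x[i] += 0.01 + 0.1 * u;
        }

    double local_mean = 0.0, global_mean;
    for (int i = 0; i < local; i++) local_mean += x[i];
    local_mean /= local;

    /* the only communication: combine per-rank statistics at the end */
    MPI_Reduce(&local_mean, &global_mean, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("mean displacement = %f\n", global_mean / size);

    free(x);
    MPI_Finalize();
    return 0;
}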

18.
Hardware monitoring through performance counters is available on almost all modern processors. Although these counters were originally designed for performance tuning, they have also been used to evaluate power consumption. We propose two approaches for modelling and understanding the behaviour of high performance computing (HPC) systems based on hardware monitoring counters. We evaluate the effectiveness of our system modelling approach with two target objectives: optimizing the energy usage of HPC systems and predicting the energy consumption of HPC applications. Although hardware monitoring counters are used to model the system, other methods, including partial phase recognition and cross-platform energy prediction, are used for energy optimization and prediction. Experimental results for energy prediction demonstrate that we can accurately predict the peak energy consumption of an application on a target platform, while results for energy optimization indicate that, with no a priori knowledge of the workloads sharing the platform, we can save up to 24% of the overall HPC system's energy consumption under benchmarks and real-life workloads.
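Counter-based power models of this kind are often linear combinations of event rates. The sketch below is a generic illustration of that idea, not the model from the paper; the coefficients and sample rates are invented placeholder values.

/* Generic illustration of a counter-based power model: power estimated
 * as a linear combination of per-second hardware event rates, with
 * coefficients fitted offline per platform. All values are invented. */
#include <stdio.h>

#define P_IDLE      45.0     /* W, baseline (assumed)                    */
#define W_INSTR      2.1e-9  /* W per (instruction / second) (assumed)   */
#define W_LLC_MISS   1.8e-7  /* W per (last-level miss / second) (assumed) */

static double estimate_power(double instr_per_s, double llc_miss_per_s)
{
    return P_IDLE + W_INSTR * instr_per_s + W_LLC_MISS * llc_miss_per_s;
}

int main(void)
{
    /* sample counter rates as might be read via a monitoring interface */
    double p = estimate_power(2.5e9, 4.0e6);
    printf("estimated node power: %.1f W\n", p);
    return 0;
}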

19.
From microarrays and next generation sequencing to clinical records, the amount of biomedical data is growing at an exponential rate. Handling and analyzing these large amounts of data demands that computing power and methodologies keep pace. The goal of this paper is to illustrate how high performance computing methods in SAS can be implemented easily, without extensive programming knowledge or access to supercomputing clusters, to help address the challenges posed by large biomedical datasets. We illustrate the utility of database connectivity, pipeline parallelism, multi-core parallel processing, and distributed processing across multiple machines. Simulation results are presented for parallel and distributed processing. Finally, the costs and benefits of such methods are discussed in comparison with traditional HPC supercomputing clusters.

20.
Clusters are an important direction in today's high-performance computing, and the high-speed interconnect is a key technology for building high-performance cluster systems and a key factor in their overall performance. This paper analyzes and compares several high-bandwidth, low-latency interconnects used for cluster interconnection, and concludes with an outlook on the future development of high-speed interconnects.
