
1.

姚渺裴巍单珊孟波杨愚鲁《计算机工程与应用》2005,41(17):156-159,196

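Communication performance is one of the main factors affecting cluster performance, and measuring it helps locate communication bottlenecks inside a cluster. The NetPIPE benchmark was used to measure the communication performance of a PC cluster and a Sun workstation cluster. The experimental results agree with theoretical analysis and show that, in terms of communication performance, the MPI environment outperforms PVM overall, and that merging unrelated short messages into long messages can optimize cluster applications. In addition, using performance simulation with the benchmark as a tool, the parameterized LogP communication models of the two cluster systems were quantitatively measured and calculated, fully characterizing the communication performance of the cluster communication subsystems.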
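Why merging short messages can help may be illustrated with a deliberately simplified cost model. The sketch below assumes each message costs a fixed startup latency plus size divided by bandwidth; this is not the parameterized LogP model measured in the paper, and the function names and numbers are hypothetical.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time for one message under an assumed linear model: latency + size / bandwidth."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def separate_vs_merged(sizes, latency_s, bandwidth_bytes_per_s):
    """Return (time when each message is sent separately, time when sent as one message)."""
    separate = sum(transfer_time(s, latency_s, bandwidth_bytes_per_s) for s in sizes)
    merged = transfer_time(sum(sizes), latency_s, bandwidth_bytes_per_s)
    return separate, merged


if __name__ == "__main__":
    # Hypothetical values: ten 100-byte messages, 50 microseconds latency, 100 MB/s.
    separate, merged = separate_vs_merged([100] * 10, 50e-6, 100e6)
    print(f"separate: {separate * 1e6:.1f} us, merged: {merged * 1e6:.1f} us")
```

Under this model the merged transfer pays the startup latency once rather than once per message, so the saving grows with the number of short messages.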

2.

Research on a shared-storage cluster system based on intelligent storage devices

黄蓉俞建新《计算机工程与设计》2007,28(16):3928-3931

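As more and more organizations adopt cluster computing for high-performance computing (HPC), both in scientific research such as earth, ocean, and atmospheric science and seismic data analysis, and in commercial applications such as drug research, automotive design modeling, and business risk analysis, cluster computing has evolved into the main approach to building HPC systems. The computations in all of these applications are widely recognized as complex. The need to manage the data sets closely tied to these applications effectively has driven the development of current cluster computing technology. Adopting intelligent storage devices (OSD) greatly reduces the workload of the metadata server and considerably improves system management and efficiency. The paper focuses on describing a new storage architecture: a shared-storage cluster computing system based on intelligent OSDs.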

3.

Sub-job assignment for homogeneous applications in computational grids

王庆江桂小林郑守洪《计算机工程》2005,31(3):32-34

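To improve the execution performance of homogeneous applications in computational grids, a sub-job assignment method is proposed. For compute-intensive applications, inter-task communication is negligible, so such a job is divided into several sub-jobs that are assigned to different clusters, with the division based on grid load balancing. Applications that are not compute-intensive rarely achieve satisfactory performance when computed across multiple sites, so such a job is assigned as a whole to a single cluster. To find the most suitable cluster, the processor performance and inter-processor communication performance of each cluster are measured, and the job's running time is predicted from an application performance model. Experiments show that the sub-job assignment method is effective in optimizing the execution performance of homogeneous applications.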

4.

Comparison of shared memory and message passing on PC clusters  Total citations: 7, self-citations: 0, citations by others: 7

章隆兵吴少刚蔡飞胡伟武《软件学报》2004,15(6):842-849

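Shared memory and message passing are currently the two mainstream parallel programming models. It is generally held that message passing is less programmer-friendly than shared memory. OpenMP is currently the de facto industry standard for shared-memory programming. Cluster OpenMP systems provide an OpenMP programming environment on clusters and are easy to program and scalable, but their performance has long been a focus of attention. Taking the cluster OpenMP system OpenMP/JIAJIA and typical message-passing ...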

5.

Performance Analysis of Round Robin Scheduling for Session-based Applications on Clusters

CHEN YAN XU GUOZHI 《微计算机信息》2007,(24):181-183

Session-based applications are now among the typical applications on the Internet, and they are often built on clusters for scalability. Scheduling in such a cluster is a key technology, since system performance depends on it. In this paper, we investigate the Round-Robin algorithm in the context of session-based applications. An analytical model for such systems is proposed. Through both theoretical analysis and simulation, we identify the main factor in system performance. The results also show that the algorithm performs significantly differently under various conditions.
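A minimal sketch of round-robin dispatch for session-based requests follows. It assumes that each new session is assigned to the next server in rotation and that later requests of the same session stay on that server; the abstract does not spell out the paper's exact model, so the class and method names here are hypothetical.

```python
from itertools import cycle


class SessionRoundRobin:
    """Assign new sessions to servers in rotation; keep each session on its server."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._next_server = cycle(servers)   # endless rotation over the servers
        self._session_to_server = {}         # session id -> server already chosen

    def route(self, session_id):
        """Return the server that should handle a request of the given session."""
        if session_id not in self._session_to_server:
            # First request of this session: take the next server in rotation.
            self._session_to_server[session_id] = next(self._next_server)
        return self._session_to_server[session_id]


if __name__ == "__main__":
    scheduler = SessionRoundRobin(["node-a", "node-b"])
    for sid in ["s1", "s2", "s1", "s3"]:
        print(sid, "->", scheduler.route(sid))
```

Because the rotation balances session counts rather than load, sessions of very different lengths can leave servers unevenly loaded, which is one way performance can vary with workload conditions.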

6.

Content-based image retrieval algorithm acceleration in a low-cost reconfigurable FPGA cluster

C. Pedraza E. Castillo J. Castillo J.L. Bosque J.I. Martinez O.D. Robles J. Cano P. Huerta 《Journal of Systems Architecture》2010,56(11):633-640

The main aim of the SMILE project is to build an efficient low-cost cluster based on FPGA boards in order to take advantage of their reconfigurable capabilities. This paper presents the cluster architecture, describing the SMILE nodes, the high-speed communication network for the nodes, and the software environment. Simulating complex applications can be very hard, so a SystemC model of the whole system has been designed to simplify this task and provide error-free downloading and execution of applications in the cluster. The hardware–software co-design process involved in the architecture and SystemC design is presented as well. The functionality of the SMILE cluster is tested by executing a real, complex Content-Based Information Retrieval (CBIR) parallel application, and the performance of the cluster is compared (in time, power, and cost) with a traditional cluster approach.

7.

Performance and usability tradeoff in a cluster display wall

《Computer Standards & Interfaces》2019

Cluster-based display walls provide cost-effective and scalable display infrastructures with high resolution and large display area, making them suitable for a wide range of high-resolution applications. As a consequence, a wide range of new cluster display-wall platforms, together with their software frameworks, has been proposed. Their performance and the satisfaction of their users have aroused the interest of some researchers. This work focuses on the Liquid Galaxy cluster display wall, originally built to run Google Earth and create an immersive experience for users. In this paper, the Liquid Galaxy is benchmarked running Google Earth, as a representative interactive application with high performance requirements, in different configurations and environments, to test satisfaction, effectiveness, and efficiency. The goal is to learn how users react to system performance. To do so, we use a performance metric defined in previous research to relate the performance of the system to the user's perception. Taking into account the trend of this metric in the experiments, we model the behavior of the system so that the performance of any given visualization cluster running Google Earth can be predicted from a reference system.

8.

Research on scheduling algorithms for stateful applications in cluster systems

陈研徐国治《微计算机信息》2007,23(24):179-180,97

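Stateful applications are a typical class of application on today's Internet, and cluster systems offer an effective way to scale them. The request scheduling algorithm is the key technology of such systems, since it determines whether the cluster's performance can be fully exploited. This paper studies the use of the round-robin algorithm for stateful applications. Through theoretical analysis and simulation experiments, it identifies the main factors affecting system performance. The study shows that the algorithm performs markedly differently under different application conditions.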

9.

Performance analysis of memory-intensive applications on SMP cluster systems

顾丽红吴少刚《小型微型计算机系统》2006,27(7):1258-1261

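Thanks to their good price/performance ratio and excellent scalability and availability, SMP clusters have gradually become the mainstream architecture in high-performance computing. This two-level hybrid structure, with shared memory within a node and message passing between nodes, is a current focus of parallel computing research. Within a single SMP node, whether bus and memory bandwidth meet the demands of the CPUs and I/O strongly affects the performance of memory-intensive applications. Based on the characteristics of memory-intensive applications, this paper tests and analyzes the impact of memory access contention on system performance in an SMP cluster. The results show that the authors' SMP nodes have a performance bottleneck; this quantitative analysis offers useful guidance for designing large-scale SMP-based cluster systems.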

10.

A parallel communication protocol supported by intelligent NICs

林基周小成孟丹《计算机研究与发展》2005,42(6):971-978

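The network communication system is an important part of a cluster and a key factor in its overall processing efficiency. As the computing power of individual nodes grows, network communication capability must improve accordingly. One way to improve it is to send messages over multiple network interface cards simultaneously, i.e., parallel communication. Parallel communication is usually implemented on top of the RMA mechanism; for messages smaller than 17 KB, the handshake required by RMA limits the gain from parallel communication. This paper proposes a parallel communication protocol supported by intelligent NICs. The protocol moves the handshake needed for message reassembly down onto the NIC, which reduces handshake overhead and extends the range of message sizes that benefit from parallel communication. Experimental data show that, compared with the RMA-based parallel protocol, the protocol improves communication performance for messages of 3 KB to 17 KB; for applications such as the FT program, it reduces execution time by 9.4%, whereas the RMA-based parallel protocol achieves only 7.8%. Finally, the main factors limiting further gains in parallel communication performance are analyzed.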

11.

Sockvia: an efficient socket for cluster environments

虞岩松霍志刚马捷孟丹《计算机工程与应用》2005,41(18):117-121

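With the vigorous development of cluster research and the emergence of high-performance networks, the performance of cluster communication systems has improved greatly. This paper studies how to port ordinary network applications efficiently to a cluster's high-performance communication system and proposes sockvia, an efficient socket for cluster environments. sockvia uses kernel-level VIA as the underlying protocol and provides, inside the operating system kernel, a socket programming interface and runtime environment fully compatible with TCP/IP-based sockets, so that network applications can be ported transparently to the cluster's high-performance communication system without source changes, recompilation, or relinking. sockvia also shows good communication performance: in standard netperf tests on an AMD 64-bit platform, its minimum latency is 9.71 usec and its peak bandwidth reaches 1974.85 Mbit/sec.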

12.

Performance evaluation of parallel iterative deepening A* on clusters of workstations

Abdel-Elah 《Performance Evaluation》2005,60(1-4):223-236

In this paper we investigate the performance of distributed heuristic search methods based on a well-known heuristic search algorithm, iterative deepening A* (IDA*). The contribution of this paper includes proposing and assessing a distributed algorithm for IDA*. The assessment is based on space, time, and solution quality, quantified in terms of several performance parameters such as generated search space and real execution time, among others. The experiments are conducted on a cluster computer system consisting of 16 hosts built around a general-purpose network. The objective of this research is to investigate the feasibility of cluster computing as an alternative for hosting applications requiring intensive graph search. The results reveal that cluster computing improves the performance of IDA* at a reasonable cost.
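For orientation, a minimal sketch of the standard sequential IDA* is given below. It is not the distributed algorithm proposed in the paper; the function and parameter names are hypothetical, and `successors(state)` is assumed to yield `(next_state, step_cost)` pairs.

```python
import math


def ida_star(start, is_goal, successors, heuristic):
    """Sequential IDA*: return (path, cost) for a solution, or None if none exists."""

    def search(path, g, bound):
        # Returns (smallest f-value that exceeded the bound, solution or None).
        node = path[-1]
        f = g + heuristic(node)
        if f > bound:
            return f, None
        if is_goal(node):
            return f, (list(path), g)
        smallest = math.inf
        for nxt, step_cost in successors(node):
            if nxt in path:          # avoid cycles along the current path
                continue
            path.append(nxt)
            exceeded, solution = search(path, g + step_cost, bound)
            path.pop()
            if solution is not None:
                return exceeded, solution
            smallest = min(smallest, exceeded)
        return smallest, None

    bound = heuristic(start)
    path = [start]
    while True:
        exceeded, solution = search(path, 0, bound)
        if solution is not None:
            return solution
        if exceeded == math.inf:     # nothing was cut off, so no solution exists
            return None
        bound = exceeded             # deepen to the next smallest f-value


if __name__ == "__main__":
    graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1)], "C": [("D", 1)], "D": []}
    print(ida_star("A", lambda n: n == "D", lambda n: graph[n], lambda n: 0))
```

Each iteration is a depth-first search bounded by an f-value threshold, so memory use stays linear in the depth of the search.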

13.

Scheduling real-time divisible loads with advance reservations

Anwar Mamat Ying Lu Jitender Deogun Steve Goddard 《Real-Time Systems》2012,48(3):264-293

Providing QoS and performance guarantees to arbitrarily divisible loads has become a significant problem for many cluster-based research computing facilities. While progress is being made in scheduling arbitrarily divisible loads, previous approaches have no support for advance reservations. However, with the emergence of Grid applications that require simultaneous access to multi-site resources, supporting advance reservations in a cluster has become increasingly important. In this paper we propose a new real-time divisible load scheduling algorithm that supports advance reservations in a cluster. The impact of advance reservations on system performance is systematically studied. Simulation results show that, with the proposed algorithm and appropriate advance reservations, system performance can be maintained at the same level as in the no-reservation case. Thus, our approach enforces the real-time agreement while addressing under-utilization concerns.

14.

A checkpoint system suited to large-scale cluster parallel computing  Total citations: 4, self-citations: 1, citations by others: 4

周恩强卢宇彤沈志宇《计算机研究与发展》2005,42(6):987-992

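Distributed checkpointing is an important means of fault tolerance for large-scale parallel computing systems. Protocol overhead and checkpoint image storage are the two main bottlenecks limiting the scalability of parallel checkpoint systems. Targeting the execution characteristics of parallel applications and the architecture of high-performance clusters, the C system uses a dynamic virtual connection technique and distributed storage of checkpoint images, respectively, to reduce the overhead of coordinated checkpointing and improve the scalability of the checkpoint system. Preliminary test results show that the design strategy of the C system is suitable for fault tolerance in large-scale parallel computing.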

15.

Implementation of a dynamic network load-balancing cluster system

MA Zhong-kuang 《数字社区&智能家居》2008,(15)

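Cluster systems have been used increasingly widely in computer networks in recent years, and the load distribution algorithm used to provide services has a large effect on cluster performance. Based on a study of load-balancing algorithms in cluster systems, this paper implements a dynamic network load-balancing algorithm for cluster systems under Linux. Analysis of the experimental results shows that the algorithm can improve the runtime performance of service programs on the cluster.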

16.

A memory-sharing grid system based on memory services  Total citations: 1, self-citations: 0, citations by others: 1

褚瑞肖侬卢锡城《计算机学报》2006,29(7):1225-1233

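Memory-intensive applications place strict demands on the physical memory of their runtime environment; when physical memory is insufficient, heavy disk I/O results and system performance degrades. Traditional network memory tries to solve this problem within a cluster by sharing the physical memory of idle nodes, but it is strongly affected by cluster load and the internal network. By combining network memory with service computing and grid computing, this paper proposes a memory-sharing grid system based on memory services, called the memory grid, and analyzes and discusses the key techniques and algorithms for implementing memory services. The memory grid makes up for the shortcomings of network memory and extends the application scope of grid computing. Simulation based on the runtime behavior of real applications shows that the memory grid improves performance compared with network memory.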

17.

Design and implementation of a deep-learning-based performance analysis framework

冯赟龙刘勇何王全《计算机工程与科学》2018,40(6):984-991

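The growing architectural complexity of high-performance computing systems, together with the limited intelligence of existing performance analysis tools, makes program performance analysis and optimization of HPC applications increasingly costly. Fortunately, artificial intelligence has made important progress recently, with deep learning playing a major role, and this creates an opportunity to make performance analysis tools more intelligent. This paper proposes a deep-learning-based intelligent program performance analysis framework. Its core idea is to abstract program performance analysis into a classification problem that can be described with machine learning techniques: the PMU supported by the processor is used to collect and normalize the performance data needed for classification; cluster evaluation techniques, combined with the actual meaning of the clusters, are used to determine the categories of performance problems; and sparse coding is used to learn features of the performance data automatically and build a classification model of performance problems. A prototype of the framework was implemented on the Sunway TaihuLight supercomputer. Experimental results show that the method can intuitively guide programmers to quickly identify the most prominent performance bottlenecks of an application, improving the efficiency of application optimization and lowering the cost of tuning code.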

18.

A new fragment re-allocation strategy for NoSQL database systems

Zhikun Chen Shuqiang Yang Shuang Tan Li He Hong Yin Ge Zhang 《Frontiers of Computer Science》2015,9(1):111-127

NoSQL databases are known for high scalability, high availability, and high fault tolerance, and are therefore used in many applications. The data partitioning strategy and the fragment allocation strategy directly affect the performance of NoSQL database systems. Large, global databases are partitioned horizontally, vertically, or by a combination of both. In general, the system scatters related fragments as widely as possible to increase the degree of parallelism of operations. In some applications, however, operations are usually not very complicated, and an operation may access more than one fragment. At the same time, the fragments that an operation has to access may interact with each other. The general allocation strategies therefore increase the system's communication cost when operations execute across sites. To improve the performance of such applications and enable NoSQL database systems to work efficiently, their fragments have to be allocated in a way that reduces the communication cost, i.e., minimizes the total volume of data transmitted when operations execute across sites. A hypergraph-based fragment clustering strategy is proposed, which places fragments that are accessed together in most operations in the same cluster. The method uses a weighted hypergraph to represent the fragment access patterns of operations, and a hypergraph partitioning algorithm is used to cluster the fragments. This reduces the number of sites that an operation has to span and thus the communication cost across sites. Experimental results confirm that the proposed technique contributes effectively to solving the fragment re-allocation problem in a specific application environment of a NoSQL database system.

19.

A configuration management model for distributed component cluster systems

周雯周斌《计算机工程与应用》2004,40(8):71-74,208

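Distributed component technology is a distributed computing technology for the business-logic middle tier of the three-tier computing architecture. Built on distributed components and aimed at enterprise computing needs, a distributed component cluster system provides better availability and higher performance. This paper studies the structure of distributed component cluster systems and what their configuration management involves. By presenting a configuration management model and defining the related configuration management relationships and data, it enables a distributed component cluster system, and the distributed components installed and deployed within it, to be managed effectively from a unified management view. The model is the basis on which a distributed component cluster system supports high availability and high performance at run time.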

20.

QSNET II: defining high-performance network design

Beecroft J. Addison D. Hewson D. McLaren M. Roweth D. Petrini F. Nieplocha J. 《Micro, IEEE》2005,25(4):34-47

QSNET II optimizes interprocessor communication in systems built from standard server building blocks. Its short-message processing unit permits fast injection of small messages, providing ultra-low latency and scalability to thousands of nodes. Thus, in a sense, the high-performance network in a cluster computer is the computer because it largely defines achievable performance, widening the range of the applications a cluster can efficiently execute, as well as defining its scalability, fault tolerance, system software, and overall usability.