Similar documents
20 similar documents found; search took 732 ms
1.
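Software Methods for Overcoming the Communication Bottleneck in Cluster Systems   Total citations: 4 (self-citations: 1; citations by others: 4)
Cluster systems are an emerging class of parallel computing systems that deliver high computing performance at relatively low cost, and therefore have broad application prospects. Taken as a whole, a cluster is a loosely coupled architecture, and communication is the main bottleneck limiting its performance. This paper first briefly analyzes the communication problem in cluster systems, then discusses the important role software plays in improving communication performance, focusing on software approaches to improving cluster performance. Experimental results show that, for many problems, software methods alone can improve solution performance on a cluster by a factor of two or more on the same hardware.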

2.
The main aim of the SMILE project is to build an efficient, low-cost cluster based on FPGA boards in order to take advantage of their reconfigurable capabilities. This paper presents the cluster architecture, describing the SMILE nodes, the high-speed communication network between the nodes, and the software environment. Simulating complex applications can be very hard; therefore, a SystemC model of the whole system has been designed to simplify this task and provide error-free downloading and execution of the applications in the cluster. The hardware–software co-design process involved in the architecture and SystemC design is presented as well. The functionality of the SMILE cluster is tested by executing a real, complex Content-Based Information Retrieval (CBIR) parallel application, and the performance of the cluster is compared (in time, power, and cost) with a traditional cluster approach.

3.
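Most modern high-performance digital signal processors adopt a very long instruction word (VLIW) architecture, issuing multiple instructions in the same clock cycle to exploit the instruction-level parallelism of the target machine and achieve higher computing performance. This paper introduces the features of the BW104x target architecture. BWDSP104X is a processor designed for high-performance computing, with a 16-issue, single-instruction, multiple-data (SIMD) architecture. To make full use of the multiple clusters and the hardware resources within each cluster, a back-end software pipelining optimization is proposed on top of the open64 compiler infrastructure. It comprises loop selection and the computation of resource and data dependences, uses the classic modulo scheduling method for software pipelining, and introduces a modulo variable expansion module to resolve conflicts between variables of different iterations. Experimental results show that performance improves markedly after software pipelining.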
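
Modulo scheduling, which the abstract above names as its software pipelining method, usually starts from a lower bound on the initiation interval (the number of cycles between the starts of successive loop iterations). The sketch below computes that textbook bound as the larger of a resource-constrained and a recurrence-constrained term. It is only an illustration under stated assumptions: the resource classes, unit counts, and dependence cycles are hypothetical and are not taken from the BWDSP104X or from the paper's open64 implementation.

```python
from math import ceil


def res_mii(op_counts, unit_counts):
    """Resource-constrained bound: for each resource class, the loop body
    needs at least ceil(uses / available units) cycles per iteration."""
    return max(ceil(op_counts[r] / unit_counts[r]) for r in op_counts)


def rec_mii(cycles):
    """Recurrence-constrained bound: each dependence cycle with total
    latency L spanning D iterations forces at least ceil(L / D) cycles."""
    return max((ceil(lat / dist) for lat, dist in cycles), default=1)


def min_initiation_interval(op_counts, unit_counts, cycles):
    """Lower bound on the initiation interval used to seed modulo scheduling."""
    return max(res_mii(op_counts, unit_counts), rec_mii(cycles))


# Hypothetical loop body and machine description (illustrative values only).
ops = {"alu": 12, "mul": 6, "mem": 5}    # operations per iteration, by class
units = {"alu": 8, "mul": 4, "mem": 2}   # functional units available, by class
dep_cycles = [(4, 2), (2, 1)]            # (total latency, iteration distance)

mii = min_initiation_interval(ops, units, dep_cycles)
```

A scheduler would try to place every operation at this interval and increase it by one whenever placement fails; modulo variable expansion is then applied when a value's lifetime exceeds the chosen interval.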

4.
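FMP: A Fast Message Passing Mechanism for Cluster Systems   Total citations: 8 (self-citations: 1; citations by others: 8)
The processing overhead of network communication software has become the bottleneck limiting cluster system performance. To improve the network performance of cluster systems, this paper designs FMP, a fast message passing mechanism for cluster systems, and implements it on a Myrinet network. Tests on Ultra2 show that FMP achieves a one-way latency of 11.2 μs for single-byte packets and a network bandwidth of 338 Mb/s for 8K packets, making good use of the Myrinet hardware and meeting the goals of reducing communication overhead and increasing network bandwidth.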

5.
Design of a Remote Measurement and Control System Based on LabVIEW   Total citations: 4 (self-citations: 0; citations by others: 4)
刘颖  孙先逵  秦岚 《测控技术》2005,24(9):43-45,64
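A remote measurement and control system is designed on a virtual-instrument hardware structure and a commonly used software development environment. The system comprises a data acquisition subsystem and a remote measurement and control subsystem. The data acquisition subsystem contains three modules: data acquisition, data processing, and instrument control. Preliminary tests indicate that the LabVIEW-based remote measurement and control system works as designed and can be used for remote measurement and control of parameters such as temperature and humidity in the field or in hazardous environments.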

6.
王芳 《软件》2011,32(6):57-59,66
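Many fields now show an increasingly urgent demand for large-scale computing, and both the theoretical study and the practical use of cluster technology in high-performance computing are attracting growing attention. High-performance computing is an effective way to carry out large-scale fusion simulation. This paper describes how to build a high-performance computing system for fusion research based on cluster technology, analyzes the basic architecture of the system, and examines in detail, from both the hardware and the software side, the techniques used to build it cost-effectively.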

7.
The core business of many companies depends on the timely analysis of large quantities of new data. MapReduce clusters that routinely process petabytes of data represent a new entity in the evolving landscape of clouds and data centers. During the lifetime of a data center, old hardware needs to be eventually replaced by new hardware. The hardware selection process needs to be driven by performance objectives of the existing production workloads. In this work, we present a general framework, called Ariel, that automates system administrators’ efforts for evaluating different hardware choices and predicting completion times of MapReduce applications for their migration to a Hadoop cluster based on the new hardware. The proposed framework consists of two key components: (i) a set of microbenchmarks to profile the MapReduce processing pipeline on a given platform, and (ii) a regression-based model that establishes a performance relationship between the source and target platforms. Benchmarking and model derivation can be done using a small test cluster based on new hardware. However, the designed model can be used for predicting the jobs’ completion time on a large Hadoop cluster and be applied for its sizing to achieve desirable service level objectives (SLOs). We validate the effectiveness of the proposed approach using a set of twelve realistic MapReduce applications and three different hardware platforms. The evaluation study justifies our design choices and shows that the derived model accurately predicts performance of the test applications. The predicted completion times of eleven applications (out of twelve) are within 10% of the measured completion times on the target platforms.
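
The idea of a regression-based model that relates performance on a source platform to performance on a target platform can be illustrated with a minimal sketch. The code below fits a single ordinary least-squares line to paired microbenchmark timings and then checks a prediction against the 10% error band mentioned in the abstract. This is an assumption-laden illustration, not Ariel's actual model: the function names and all timing values are hypothetical, and the real framework may use a different regression form and feature set.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, source_time):
    """Map a duration measured on the source platform to the target platform."""
    slope, intercept = model
    return slope * source_time + intercept


def within_tolerance(predicted, measured, tol=0.10):
    """True if the prediction is within tol (relative error) of the measurement."""
    return abs(predicted - measured) <= tol * measured


# Hypothetical microbenchmark durations in seconds (not from the paper).
source_times = [10.0, 20.0, 30.0, 40.0]   # old hardware
target_times = [6.0, 11.0, 16.0, 21.0]    # new hardware, small test cluster

model = fit_line(source_times, target_times)
estimate = predict(model, 50.0)           # predicted duration on the new hardware
```

With these made-up numbers the fitted line is exact, so the estimate for a 50-second source measurement is 26 seconds; real measurements would scatter around the line, which is why a tolerance check is included.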

8.
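High-performance computing is an effective way to carry out large-scale fusion simulation. This paper describes how to build a high-performance computing system for fusion research based on cluster technology, analyzes the basic architecture of the system, and examines in detail, from both the hardware and the software side, the techniques used to build it cost-effectively. Finally, the system is tested with benchmark programs (LinPACK and NPB), which demonstrate its efficient parallel computing performance.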

9.
VIA and Its Design and Implementation   Total citations: 1 (self-citations: 0; citations by others: 1)
谢军  焦振强  唐瑞春  都志辉 《计算机工程》2002,28(10):233-235,263
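VIA defines a low-latency, high-bandwidth data transfer model and has become an industry standard for cluster communication. This paper introduces the background and architectural features of VIA and describes in detail Tsinghua University's implementation of VIA (TH-VIA), including its hardware and software characteristics and the main techniques adopted. Finally, test results for several VIA implementations are presented and compared.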

10.
The designer of a numerical supercomputer is confronted with fundamental design decisions stemming from some basic dichotomies in supercomputer technology and architecture. On the side of the hardware technology there exists the dichotomy between the use of very high-speed circuitry or very large-scale integrated circuitry. On the side of the architecture there exists the dichotomy between the SIMD vector machine and the MIMD multiprocessor architecture. In the latter case, the ‘nodes’ of the system may communicate through shared memory, or each node has only private memory, and communication takes place through the exchange of messages. All these design decisions have implications with respect to performance, cost-effectiveness, software complexity, and fault-tolerance.

In the paper the various dichotomies are discussed and a rationale is provided for the decision to realize the SUPRENUM supercomputer, a large ‘number cruncher’ with 5 Gflops peak performance, in the form of a massively parallel MIMD/SIMD multicomputer architecture. In its present incorporation, SUPRENUM is configurable to up to 256 nodes, where each node is a pipeline vector machine with 20 Mflops peak performance, IEEE double precision. The crucial issues of such an architecture, which we consider the trendsetter for future numerical supercomputer architecture in general, are on the hardware side the need for a bottleneck-free interconnection structure as well as the highest possible node performance obtained with the highest possible packaging density, in order to accommodate a node on a single circuit board. On the side of the system software the design goal is to obtain an adequately high degree of operational safety and data security with minimum software overhead. On the side of the user an appropriate program development environment must be provided. Last but not least, the system must exhibit a high degree of fault tolerance, if for nothing else but for the sake of obtaining a sufficiently high MTBF.

In the paper a detailed discussion of the hardware and software architecture of the SUPRENUM supercomputer, whose design is based upon the considerations discussed, is presented. A largely bottleneck-free interconnection structure is accomplished in a hierarchical manner: the machine consists of up to 16 ‘clusters’, and each cluster consists of 16 working ‘nodes’ plus some organisational nodes. The node is accommodated on a single circuit board; its architecture is based on the principle of data structure architecture explained in the paper. SUPRENUM is strictly a message-based system; consequently, the local node operating system has been designed to handle a secured message exchange with a considerable degree of hardware support and with the lowest possible software overhead. SUPRENUM is organized as a distributed system, a prerequisite for the high degree of fault tolerance required; therefore, there exists no centralized global operating system. The paper concludes with an outlook on the performance limits of a future supercomputer architecture of the SUPRENUM type.
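
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above (up to 16 clusters, 16 working nodes per cluster, 20 Mflops peak per node) and shows how they yield the 256-node configuration and the roughly 5 Gflops peak figure. The function name is introduced here purely for illustration, and organisational nodes are assumed not to contribute to peak floating-point performance.

```python
CLUSTERS = 16             # maximum number of clusters, per the abstract
NODES_PER_CLUSTER = 16    # working nodes per cluster, per the abstract
NODE_PEAK_MFLOPS = 20     # peak performance of one node, per the abstract


def suprenum_peak(clusters=CLUSTERS,
                  nodes_per_cluster=NODES_PER_CLUSTER,
                  node_peak_mflops=NODE_PEAK_MFLOPS):
    """Return (working node count, aggregate peak in Gflops)."""
    nodes = clusters * nodes_per_cluster
    peak_gflops = nodes * node_peak_mflops / 1000
    return nodes, peak_gflops


nodes, peak_gflops = suprenum_peak()
```

The full configuration comes to 256 working nodes and 5.12 Gflops, consistent with the "5 Gflops peak performance" quoted in the abstract.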


11.
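To address the high error rate of asynchronous motor vector control based on the traditional dual-PWM converter, a new vector control system for asynchronous motors is designed, covering both hardware and software. The hardware design focuses on the motor structure, the DSP chip, and the programmer. A software workflow is built on the designed hardware; it consists of five steps (system initialization, interrupt control, PWM signal blocking, alarm protection, and data communication), each of which is described in detail. Simulation experiments verify the practical performance of the vector control system. The results show that the designed system is highly robust, that its control performance is far better than that of the traditional system, and that it has considerable market potential and offers some guidance for the development of asynchronous motors.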

12.
A Reduced User-Space Protocol Based on Myrinet   Total citations: 5 (self-citations: 0; citations by others: 5)
董春雷  郑纬民 《软件学报》1999,10(3):299-303
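The communication subsystem is the main factor affecting the overall performance of a cluster of workstations. After analyzing and comparing the performance of three commonly used networks, this paper points out that upper-layer protocol processing is the main bottleneck for cluster-of-workstations performance. A high-performance, user-level reduced communication protocol, RCP, is implemented on a cluster of eight Sun SPARC workstations connected by 640 Mbps Myrinet. RCP achieves low latency and high efficiency by trimming redundant protocol functions, reducing the number of data copies, and operating directly on hardware buffers. The round-trip latency of RCP is far lower than that of TCP/IP (200 μs vs. 1540 μs),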

13.
This paper discusses a multithreaded software architecture for message-passing interface (MPI) software specification. The architecture is thread-safe, allows for concurrent communication over several communications media (multifabric communication), efficiently utilizes available hardware concurrency over a wide range of target platforms, and allows for concurrent communication and computation within the limits imposed by the hardware. The architecture is developed in the framework of the MPICH software architecture, a well-known MPI implementation used worldwide. The proposed architecture adopts wide portability of the MPICH design and remedies some of its deficiencies such as inefficient multifabric communication and non-thread-safety. The paper also considers the issues concerning development of high-performance portable message-passing systems for general-purpose architectures. The contributions of the paper are improving architecture and addressing thread safety of modern reliable messaging software, as well as identifying and taking advantage of inherent concurrency in the message-passing software itself.

14.
We present the Smart Surface Network (SSN), a hardware and software platform designed for dense sensing. Sensor nodes connected to the SSN communicate using a serial bus integrated within a mountable physical surface. The hardware architecture and bus access and communication mechanisms are implemented in a self-stabilizing manner, providing robust handling of unannounced arrivals and departures of network devices. An associated API supports a peer-to-peer communication paradigm, providing access to the physical, data link, and application layers of the bus. In this paper, we describe the SSN hardware architecture and present the bus access and peer discovery algorithms. We also discuss the design of the API and describe experimental results characterizing the fairness of the bus algorithm, the efficiency of the peer discovery algorithm, and the performance of the SSN system under varying load conditions.

15.
Steenkiste  P.A. 《Computer》1994,27(3):47-57
Optical fiber has made it possible to build networks with link speeds of over a gigabit per second; however, these networks are pushing end-systems to their limits. For high-speed networks (100 Mbits per second and up), network throughput is typically limited by software overhead on the sending and receiving hosts. Minimizing this overhead improves application-level latency and throughput and reduces the number of cycles that applications lose to communication overhead. Several factors influence communication overhead: communication protocols, the application programming interface (API), and the network interface hardware architecture. The author describes how these factors influence communication performance and under what conditions hardware support on the network adapter can reduce overhead. He first describes the organization of a typical network interface and discusses performance considerations for interfaces to high-speed networks. He then discusses software optimizations that apply to simple network adapters and shows how more powerful adapters can improve performance on high-speed networks.

16.
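Design and Implementation of an IP Videophone Video Conferencing System   Total citations: 3 (self-citations: 0; citations by others: 3)
Starting from the H.323 protocol suite and the hardware and software structure of the MCU system, this paper discusses an IP videophone video conferencing system based on H.323. It is an embedded system designed for specific requirements. At the network layer it uses packet switching to improve the utilization of network resources. The hardware uses a two-board structure: supporting hardware for audio/video encoding and decoding, data encapsulation, and similar tasks is placed on the motherboard, while network communication and related functions are placed on the daughterboard. Inter-board communication uses a stream mechanism, and the software system is data-driven. This scheme gives the video conferencing system high reliability, low cost, and good portability, and serves as a useful reference for the research and development of similar small video conferencing systems.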

17.
18.
康炜  张翔  王金伟  苗艳超  马捷 《计算机工程》2008,34(10):256-258
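Cluster systems have become the mainstream architecture for high-performance computing, and a cluster simulation environment is an important tool for learning cluster operation. This paper proposes the Virtual Cluster System (VCS), a cluster simulation scheme based on a Loongson 2E (龙芯2E) multiprocessor hardware platform. The system runs multiple operating systems simultaneously on a shared-memory multiprocessor and uses memory operations to simulate network communication, thereby simulating a cluster environment.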

19.
Highly regular multi-processor architectures are suitable for inherently highly parallelizable applications such as most of the image processing domain. Systems embedded in a single programmable chip platform (SoPC) allow hardware designers to tailor every aspect of the architecture in order to match the specific application needs. These platforms are now large enough to embed an increasing number of cores, allowing implementation of a multi-processor architecture with an embedded communication network. In this paper we present the parallelization and the embedding of a real time image stabilization algorithm on a SoPC platform. Our overall hardware implementation method is based upon meeting algorithm processing power requirements and communication needs with refinement of a generic parallel architecture model. Actual implementation is done by the choice and parameterization of readily available reconfigurable hardware modules and customizable commercially available IPs (Intellectual Property). We present both software and hardware implementation with performance results on a Xilinx SoPC target.

20.
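A Study of the IXP2400 Network Processor and the Multithreading Implementation in Its Microengines   Total citations: 2 (self-citations: 0; citations by others: 2)
Network processors combine the high performance of ASICs with the programmable flexibility of RISC chips, meet the demands of rapidly developing data communications well, and have broad application prospects in future network equipment. The IXP2400 is Intel's second-generation network processor. It uses a high-performance parallel architecture to handle complex algorithms, packet content inspection, traffic management, and wire-speed forwarding. Multithreading is the key technology that enables high-speed data processing on the IXP2400. This paper introduces the hardware structure and software development of the IXP2400 and analyzes the techniques involved in implementing multithreading in its microengines.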

