Similar literature
 Found 20 similar documents; search took 15 ms
1.
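This paper analyzes the structure of Web systems and compares the performance of commonly used Web application servers. It then proposes solutions to Web system bottlenecks from the perspectives of scaling and caching.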

2.
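As data volumes and concurrent-access demands keep growing, scalability has become one of the key requirements of parallel file systems. This paper proposes a highly scalable resource management mechanism based on Redbud, an asymmetric parallel file system. According to the access characteristics of the data, the mechanism manages different types of data with different tree structures, meeting the concurrent lookup needs of file data and metadata. It also uses a file-level data distribution mechanism that lets users apply various policies to manage directories and files, meeting practical requirements such as file-level data access performance and directory-level data reliability. Results from several benchmarks and real applications show that exclusive file access reaches 95% of disk performance, and that concurrent access performance for data and metadata grows linearly as devices and application nodes are added.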

3.
4.
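EH*p is a highly available, scalable distributed data structure that uses parity coding to back up data. An EH*p file can gradually expand onto multiple servers as records are inserted, and can automatically recover lost data when a single server fails. EH*p splits a data bucket as soon as it becomes full, maps record keys directly to server addresses, and distributes bucket split and recovery operations among the servers in the system, overcoming the shortcomings of LH*-type data structures. Experiments show that the storage overhead of the backup data is small and that the number of messages per query is close to the theoretical minimum of 2.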

5.
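A parallel programming model with predictable scalable parallel performance   Cited by: 1 (self: 0, others: 1)
The BSP (Bulk-Synchronous Parallel) model is independent of parallel architecture and can serve both as a parallel computation model and as a parallel programming model. It lets programmers accurately analyze and predict the performance of parallel programs during algorithm design and during coding and debugging. BSP programs are highly portable and can be implemented on a variety of parallel systems such as PVM and MPI.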
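The kind of performance prediction that BSP permits can be sketched with the standard BSP cost formula, in which one superstep costs w + h·g + l: the maximum local work w, plus the largest per-processor message volume h scaled by the per-word communication cost g, plus the barrier latency l. The abstract gives neither the formula nor any parameter values, so the names `Superstep` and `bsp_cost` and all numbers below are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Superstep:
    max_work: float  # w: largest local computation time over all processors
    max_h: float     # h: most words sent or received by any one processor


def bsp_cost(supersteps, g, l):
    """Predicted run time: sum of (w + h*g + l) over all supersteps.

    g is the communication cost per word and l the barrier
    synchronization cost; both are machine parameters (assumed here).
    """
    return sum(s.max_work + s.max_h * g + l for s in supersteps)


# Hypothetical two-superstep program on a hypothetical machine.
steps = [Superstep(max_work=1000, max_h=50), Superstep(max_work=400, max_h=200)]
predicted = bsp_cost(steps, g=4.0, l=100.0)
print(predicted)
```

Because g and l are properties of the machine rather than of the program, the same superstep list can be re-evaluated with different parameters to compare target platforms.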

6.
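Research and implementation of TH-MSNS, a highly scalable storage network system   Cited by: 2 (self: 1, others: 2)
Networked storage systems are important for storing and processing massive amounts of information, for scalable data access and availability, and for data quality of service and storage security. Based on FCP, this paper designs and implements TH-MSNS, a scalable storage area network system. The system increases bandwidth and availability through dual HBA cards, increases reliability and availability through dual I/O node machines, and expands capacity to 260 TB through multiple I/O nodes. The paper presents the architecture of TH-MSNS and the design techniques and implementation methods of its SCSI target emulator, the embedded operating system EOS, and storage management. The SCSI target emulator has a layered design with standard interfaces, so it can be extended to different SCSI devices and different network connection protocols. The core software is implemented as kernel modules running in the kernel mode of the embedded operating system, which improves efficiency. The storage management software has a distributed structure that is independent of the operating system and provides management functions such as object management, automatic device discovery, access control, and logging. Compared with similar systems, it is efficient, easy to extend, easy to maintain, and highly compatible.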

7.
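To keep pace with massive seismic data volumes and the ever-growing scale of cluster parallelism, a multi-dimensional imaging-space decomposition algorithm is proposed. Exploiting the multiple levels of parallelism in large cluster systems, the algorithm first decomposes the imaging space along the offset direction; it then continues splitting along the in-line direction until the imaging space is smaller than the physical memory of a compute node; finally, it decomposes the imaging space on the two-dimensional surface in units of bins. In the implementation, each common-offset imaging space is mapped to a group of compute nodes, and the CPU cores within a compute node divide the bins evenly in round-robin fashion. Without increasing the data communication volume, the parallel algorithm lowers memory requirements, reduces communication overhead and synchronization time, and improves data locality. Tests on field data show that it outperforms and scales better than traditional output-parallel and input-parallel algorithms; experimental jobs scheduled on up to 497 nodes and 7552 threads still achieve good speedup.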
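The last level of this decomposition, dividing bins among the CPU cores of one node in round-robin order, can be sketched as follows. This is a minimal illustration only: the function name `round_robin_assign` and the use of plain integer bin indices are assumptions, not details from the paper.

```python
def round_robin_assign(num_bins, num_cores):
    """Return a list whose entry c holds the bin indices handled by core c.

    Bin b goes to core b % num_cores, so per-core counts differ by at most one.
    """
    assignment = [[] for _ in range(num_cores)]
    for b in range(num_bins):
        assignment[b % num_cores].append(b)
    return assignment


# Hypothetical example: 10 surface bins shared by 4 cores.
for core, bins in enumerate(round_robin_assign(10, 4)):
    print(core, bins)
```

Round-robin needs no communication between cores to agree on ownership, since each core can compute its own bin list from its index alone.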

8.
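High-speed interconnection networks are an important component of high-performance computing systems. As demand for network scale grows, how to build larger networks becomes the key question in designing interconnection topologies. This paper therefore proposes a new hierarchical topology, Paleyfly (PF), which combines the strongly regular property of Paley graphs with the ability of Random Regular (RR) graphs to support arbitrary network sizes. Compared with other recent high-speed interconnection topologies, and given limited router-chip port counts, Paleyfly effectively addresses the limited scalability of Dragonfly (DF), the high physical cost of Fat tree (Ft), and the difficult physical layout and large routing tables of RR. In addition, building on the load-balancing advantage that strong regularity offers for routing, four routing strategies are proposed to address network congestion. Finally, simulator experiments compare PF with other topologies and compare PF's different routing strategies, verifying that PF achieves lower network latency than RR at different scales and under different communication patterns.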

9.
Continuing improvements in CPU and GPU performance, as well as increasing multi-core processor and cluster-based parallelism, demand flexible and scalable parallel rendering solutions that can exploit multipipe hardware-accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop, and often only application-specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic enough to support various types of data and visualization applications and at the same time work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL, which provides an application programming interface (API) for developing scalable graphics applications for a wide range of systems, from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture and the basic API, discuss its advantages over previous approaches, and present example configurations and usage scenarios as well as scalability results.

10.
Architecture and current development of scalable parallel computers   Cited by: 5 (self: 0, others: 5)
Scalable parallel computers are becoming a trend in parallel computer development. Scalable computers are classified into three system models: the Symmetric Multiprocessor (SMP), the Massively Parallel Processor (MPP), and the Cluster of Workstations (COW). This paper discusses the three models and introduces the Dawn parallel computers, which belong to the MPP and COW models.

11.
12.
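To address the poor scalability of the membership protocols used in multicast communication protocols, a new randomized membership protocol (RMP) is proposed. By responding to group members' join requests at random, RMP builds a connection graph in which each node maintains information about only logcN other members, and it can provide a basis for reliable message dissemination. The RMP algorithm is analyzed mathematically and verified by simulation; the results show that RMP is a highly scalable membership protocol.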

13.
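To address the poor scalability of the membership protocols used in multicast communication protocols, a new randomized membership protocol (RMP) is proposed. By responding to group members' join requests at random, RMP builds a connection graph in which each node maintains information about only logN other members, and it can provide a basis for reliable message dissemination. The RMP algorithm is analyzed mathematically and verified by simulation; the results show that RMP is a highly scalable membership protocol.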

14.
In this paper we present a new parallel algorithm for the solution of the incompressible two- and three-dimensional Navier-Stokes equations. The parallelization is achieved via domain decomposition. The computational region is considered in the form of a 2-D or 3-D periodic box decomposed into parallel strips (slabs). For time discretization we use a third-order multistep method of [11]. The time discretization procedure results in solving global elliptic problems of (monotonic) Helmholtz and Poisson types in each time step. For the space discretization we employ the multidomain local Fourier (MDLF) method that was developed in [9, 10, 13]. The discretization in the periodic directions is performed by the standard Fourier method. In the direction across the strips we use the Local Fourier Basis technique, which involves the overlapping of neighboring subdomains and the smoothing of local functions across the interior boundaries (interfaces). The matching of the local solutions is performed by adding properly weighted interface Green's functions. Their amplitudes are found in terms of the jumps of the solution and its first derivatives at the interfaces. The present paper extends the results of our previous works [1, 9, 10, 13] on parallel use of the MDLF method in three respects: 1. In [1] a model Navier-Stokes type system was considered which does not include the pressure term. Correspondingly, in each time step only Helmholtz-type equations were solved. It was shown that the parallel solution of this equation can be accomplished using only local (neighbor-to-neighbor) communication, owing to the localization properties of the Helmholtz operator. We consider the complete Navier-Stokes system including the pressure term. The solution of the Poisson equation for pressure has the potential to degrade the performance and the achieved speedup of a parallel algorithm, because the global nature of this equation necessitates global communication among the processors. However, we show that only a few of the lowest harmonics require global data transfer, whereas the rest of the harmonics can be treated locally. Therefore, most of the communication required for parallelization of the Navier-Stokes solver using the MDLF method is local, between adjacent subdomains (processors). Moreover, the percentage of time spent in global communication decreases as the size of the problem increases. Thus, the present parallel algorithm is highly scalable. 2. In [1] we considered only 2-D equations. In this paper we extend the previous technique to 3-D problems. 3. Previously, the MDLF solver was implemented only on the MEIKO parallel machine. In this paper the 2-D and 3-D Navier-Stokes solvers are implemented on three MIMD message-passing multiprocessors (a 60-processor IBM SP2, a 20-processor MOSIX [3], and a network of 10 Alpha workstations) and achieve efficiencies ranging from more than 70% to 95%. The same code, written with the PVM (parallel virtual machine [7]) software package, was executed on all of the above distinct computational platforms. Detailed performance results, which include scalability analysis, are presented. This revised version was published online in June 2006 with corrections to the Cover Date.

15.
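RHiNET is an optical interconnection network for building high-performance distributed parallel computing systems. It consists of four parts (protocol, network interface, switch, and optical links), and three generations of experimental products have been released. After a comprehensive description of the structure and characteristics of each part, RHiNET is compared with other high-performance interconnection networks and standards.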

16.
17.
Tools for performance monitoring and analysis have become indispensable parts of programming environments for parallel computers. As the number of processors increases, conventional techniques for monitoring the performance of parallel programs produce large amounts of data in the form of event trace files. This wealth of information is a problem both for the programmer, who is forced to navigate through it, and for the tools that must store and process it. What makes the situation worse is that, most of the time, a large share of the data is irrelevant to understanding the performance of an application. In this paper, we present a new approach for collecting performance data. By tracing all events but storing only performance statistics, our approach can provide accurate and useful performance information while requiring far less data to be stored. In addition, the approach supports real-time performance monitoring.
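The idea of tracing every event while keeping only statistics can be sketched with constant-size running summaries per event type. This is not the paper's implementation: the class names `RunningStats` and `StatsTracer`, the choice of count/mean/variance/min/max as the stored statistics, and the use of Welford's online update are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class RunningStats:
    """Constant-size summary of a stream of durations (Welford's update)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)

    @property
    def variance(self):
        # Population variance; 0.0 when no samples have been seen.
        return self._m2 / self.count if self.count else 0.0


class StatsTracer:
    """Observes every event but stores one RunningStats per event type."""

    def __init__(self):
        self.stats = defaultdict(RunningStats)

    def record(self, event_type, duration):
        self.stats[event_type].add(duration)


tracer = StatsTracer()
for d in (1.0, 2.0, 3.0):  # hypothetical durations of a "send" event
    tracer.record("send", d)
print(tracer.stats["send"].count, tracer.stats["send"].mean)
```

Storage grows with the number of event types rather than the number of events, and the summaries can be read at any time while the program runs, which is what makes this style compatible with real-time monitoring.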

18.
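A highly parallel multi-task parallel rendering system architecture   Cited by: 2 (self: 0, others: 2)
As computer graphics technology becomes practical, more realistic and finer three-dimensional complex scenes must be constructed, and their data volume keeps swelling. Together with ever higher demands for real-time interaction with scenes and growing demand for multi-screen high-resolution display, this creates an urgent need for a multi-task parallel graphics rendering system for large-scale complex scenes. This paper presents the architecture of a highly parallel multi-task, multi-screen parallel graphics rendering system suited to large-scale complex scenes, supporting parallel processing of graphics tasks and multi-screen display. The architecture separates geometry computation tasks from graphics rendering tasks and parallelizes each separately: on compute nodes, tasks are classified by the type of rendered object to ease parallel computation and task assignment; on rendering nodes, the small screen tiles are composited in parallel. Experimental results show that the architecture achieves good parallel efficiency and scalability for multiple tasks, makes full use of the system's parallel computing resources, and produces good rendering results.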

19.
Addition is the vital operation underlying arithmetic operations such as multiplication, subtraction, and division. Binary addition is performed by a digital circuit known as an adder. In VLSI (Very Large Scale Integration), the full adder is a basic component and plays a major role in the design of integrated-circuit applications. Various adder designs have been implemented to minimize power, and each has its own drawbacks: such adders require high power when the driving capability is good, and require low power only when the delay is larger. To overcome these issues and obtain better performance, a novel parallel adder is proposed. The design starts at 1 bit and is extended up to 32 bits to verify its scalability. The proposed parallel adder is derived from the carry look-ahead adder. Its merits are better speed, power consumption, delay, and driving capability. The designed adders are verified with respect to supply, delay, power, and leakage, and their performance is found to be superior to that of the competing Manchester Carry Chain Adder (MCCA), Carry Look Ahead Adder (CLAA), Carry Select Adder (CSLA), Carry Select Adder (CSA), and other adders.
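For readers unfamiliar with the carry look-ahead adder from which the proposed design is derived, the sketch below simulates the textbook scheme at the bit level. It is not the paper's novel adder; the function names `lookahead_carry` and `cla_add` are illustrative. Each carry is computed directly from the generate (g) and propagate (p) signals and the carry-in, never from another carry, which is the property that lets hardware evaluate all carries in parallel.

```python
def lookahead_carry(g, p, c0, i):
    """Carry out of bit i, expanded so it depends only on g, p and c0:

    c[i+1] = g[i] | p[i]&g[i-1] | ... | (p[i]&...&p[0])&c0
    """
    carry = 0
    prefix = 1  # AND of p[i], p[i-1], ..., p[j+1]
    for j in range(i, -1, -1):
        carry |= prefix & g[j]
        prefix &= p[j]
    return carry | (prefix & c0)


def cla_add(a, b, width=32, carry_in=0):
    """Add two unsigned integers; returns (sum modulo 2**width, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate bits
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate bits
    carries = [carry_in] + [lookahead_carry(g, p, carry_in, i) for i in range(width)]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i  # sum bit = propagate XOR carry-in
    return total, carries[width]


print(cla_add(5, 3, width=4))
```

The software loop in `lookahead_carry` evaluates the expanded expression term by term; in a circuit the same expression is a two-level AND-OR network, at the price of gate fan-in that grows with bit position.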

20.
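蒋筱斌, 熊轶翔, 张珩, 武延军, 赵琛. Journal of Software (《软件学报》), 2023, 34(4): 1977-1996
As data keeps growing in scale and diversifying in structure, how to use heterogeneous multi-coprocessors connected by modern interconnect links to provide a real-time, reliable parallel runtime environment for large-scale data processing has become a research hotspot in the high-performance computing and database fields. Modern server architectures equipped with multiple coprocessor (GPU) devices (multi-GPU servers) have become the preferred high-performance platform for analyzing large-scale, irregular graph data. Graph computing systems and algorithms (such as breadth-first traversal and shortest path) designed in existing work for multi-GPU server architectures already significantly outperform multi-core CPU environments overall. In such systems, however, the transfer of graph partition data between GPU coprocessors is limited by PCI-E bus bandwidth and local latency, so adding GPU devices does not yield near-linear growth in overall system performance and can even cause severe latency jitter; these systems therefore cannot meet the high scalability requirements of large-scale parallel graph computing. A series of benchmark experiments reveals two classes of deficiencies in existing systems: (1) the hardware architecture of data paths between modern GPU devices is evolving rapidly (e.g., NVLink-V1, NVLink-V2), greatly improving link bandwidth and latency, yet existing systems are confined to the PCI-E bus for partitioned data communication and cannot fully exploit modern GPU link resources (including link topology, connectivity, and routing); (2) in ...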
