20 similar documents found (search time: 15 ms)
1.
2.
Distributed shared memory (DSM) systems provide a simple programming paradigm for networks of workstations, which are gaining popularity due to their cost-effective high computing power. However, DSM systems usually exhibit poor performance due to the large communication delay between the nodes, and many different memory consistency models have been proposed to mask the network delay. In this paper, we propose an asynchronous protocol for the release consistent memory model, which we call an Asynchronous Release Consistency (ARC) protocol. Unlike other protocols where the communication adheres to the synchronous request/receive paradigm, the ARC protocol is asynchronous, such that the necessary pages are broadcast before they are requested. Hence, the network delay can be reduced by proper prefetching of necessary pages. We have also compared the performance of the ARC protocol with the lazy release protocol by running standard benchmark programs, and the experimental results showed that the ARC protocol achieves a performance improvement of up to 29%.
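For illustration only (not the paper's actual protocol code), the sketch below contrasts the usual pull-on-fault fetch with an ARC-style push at release time, where the releasing node broadcasts its dirty pages so peers already hold fresh copies before they request them; the classes and the in-process "network" are assumptions made for the example.

```python
# Illustrative sketch: pull-on-fault vs. ARC-style push-at-release prefetch.
# All classes and the in-process "network" are assumptions for demonstration.

class Node:
    def __init__(self, name):
        self.name = name
        self.pages = {}        # local (possibly cached) copies: page_id -> data
        self.dirty = set()     # pages written since the last release

    def write(self, page_id, data):
        self.pages[page_id] = data
        self.dirty.add(page_id)

    def release(self, peers, asynchronous=True):
        """On release, an ARC-style node broadcasts its dirty pages to all peers."""
        if asynchronous:
            for peer in peers:
                for page_id in self.dirty:
                    peer.pages[page_id] = self.pages[page_id]   # prefetch push
        self.dirty.clear()

    def read(self, page_id, home):
        # With the push above this hits locally; otherwise it "faults"
        # and must fetch from the home node (a round trip in a real system).
        if page_id not in self.pages:
            self.pages[page_id] = home.pages[page_id]            # demand fetch
        return self.pages[page_id]


if __name__ == "__main__":
    a, b = Node("A"), Node("B")
    a.write("p0", "v1")
    a.release(peers=[b])                 # broadcast before B ever asks
    print(b.read("p0", home=a))          # local hit: no request/reply delay
```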
3.
Distributed shared memory systems build a logically shared memory model on top of physically distributed memory. This paper proposes a system framework for implementing distributed shared memory at the operating system level and describes its implementation on the Linux operating system. The system provides a simple calling interface and is tightly integrated with the Linux memory management framework, and overall performance is improved by adopting an appropriate DSM consistency protocol.
4.
Building on a dynamic load balancing model based on M/M/1 queueing theory, this paper proposes a load balancing scheme that combines Nash-equilibrium-based dynamic load balancing with static load balancing. Comparing the improved method against the original load balancing model shows that the new scheme performs better when the system's communication overhead is high, and achieves a better expected response time when the system load exceeds 45%.
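For context, a standard queueing-theory result (not taken from the abstract itself): the expected response time of an M/M/1 server is T = 1/(μ − λ) for λ < μ. The toy calculation below, with an arbitrarily assumed service rate, shows how response time grows with utilization.

```python
# Standard M/M/1 expected response time T = 1 / (mu - lambda), valid for lam < mu.
# The service rate below is an arbitrary assumption for illustration.

def mm1_response_time(lam: float, mu: float) -> float:
    if lam >= mu:
        raise ValueError("unstable system: arrival rate must be below service rate")
    return 1.0 / (mu - lam)

mu = 10.0                                  # assumed service rate (tasks/second)
for utilization in (0.30, 0.45, 0.60, 0.90):
    lam = utilization * mu
    print(f"rho={utilization:.2f}  T={mm1_response_time(lam, mu):.3f} s")
```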
5.
顾治华 (Gu Zhihua). Computer Engineering and Science, 1995, 17(4): 9-17
This paper addresses the load balancing problem in distributed computer systems. For the two cases in which the load can be balanced exactly and in which it can only be balanced approximately, load balancing algorithms with O(n) time complexity are proposed. Finally, the load balancing problem for processors with different speeds, and algorithms for computing the balance, are also discussed.
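As a generic illustration of how an exact balance can be computed in a single O(n) pass (this is not the paper's algorithm, and the load values are made up), each node's surplus or deficit relative to the average load can be determined as follows:

```python
# Generic O(n) sketch: compute each node's surplus/deficit w.r.t. the average load.
# Example loads are made up; this is not the paper's algorithm.

def balance_plan(loads):
    n = len(loads)
    avg = sum(loads) / n                       # one O(n) pass
    return [load - avg for load in loads]      # >0: send work, <0: receive work

loads = [12, 3, 7, 10]                         # assumed per-node load units
print(balance_plan(loads))                     # [4.0, -5.0, -1.0, 2.0]
```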
6.
The lack of a parallel debugging environment for software DSM systems has become an important factor limiting their wide adoption. Replay methods allow users to apply cyclic debugging techniques to software DSM programs whose execution is nondeterministic. This paper defines the happen-before-1 relation for software DSM program executions and, based on it, proposes a replay method implemented on the software DSM system JIAJIA. Practical tests show that the method incurs very small space and time overheads.
7.
Although DSM systems differ greatly from one another, they share a common feature: they all provide a shared-memory abstraction. This paper analyzes how the shared-memory abstraction is implemented in DSM systems, summarizes the different implementation approaches, their details, and their respective advantages and disadvantages, and points out trends in DSM development as well as some problems that urgently need to be solved.
8.
A Distributed Shared Memory (DSM) system provides a distributed application with a shared virtual address space. This article proposes a design for implementing the DSM communication layer on top of the Virtual Interface Architecture (VIA), an industry standard for user-level networking protocols on high-speed clusters. User-level communication protocols operate in user mode, thus removing the operating system kernel's overhead from the critical communication path, and significantly diminishing communication overhead as a result. We analyze VIA's facilities and limitations in order to ascertain which implementation trade-offs can be best applied to our development of an efficient communication substrate optimized for DSM requirements. We then implement a multithreaded version of the Home-based Lazy Release Consistency (HLRC) protocol on top of this substrate. In addition, we compare the performance of this HLRC protocol with that of the Sequential Consistency (SC) protocol in which a Multi View (MV) memory mapping technique was used. This technique enables fine-grained access to shared memory, while still relying on the virtual memory hardware to track memory accesses. We perform an 'apple-to-apple' comparison on the same testbed environment and benchmark suite, and investigate the effectiveness and scalability of both protocols.
9.
Due to the significant communication overhead of sending and receiving data, loop partitioning approaches on distributed memory systems must guarantee not just computation load balance but computation+communication load balance. Previous approaches to loop partitioning have achieved a communication-free, computation load balanced iteration space partitioning solution for a limited subset of DOALL loops. But a large category of DOALL loops inevitably results in communication, and the trade-offs between computation and communication must be carefully analyzed for these loops in order to balance out the combined computation time and communication overheads. In this work, we describe a partitioning approach based on the above motivation for the general case of DOALL loops. Our goal is to achieve a computation+communication load balanced partitioning through static data and iteration space distribution. Our approach first performs partitioning of the iteration and data spaces of a loop nest by analyzing communication and parallelism; it then performs architecture-dependent analysis to adjust the granularity of partitions, load balance each partition with respect to total computation+communication, and finally map the partitions onto the available number of processors. This multiphase partitioning method works as follows. First, the code partitioning phase analyzes the references in the body of the DOALL loop nest and determines a set of directions for reducing a larger degree of communication by trading a lesser degree of parallelism. The partitioning is carried out in the iteration space of the loop by cyclically following a set of direction vectors such that the data references are maximally localized and reused, eliminating a larger communication volume than parallelism. We then perform data space partitioning based on a new "larger partition owns" rule to minimize the communication overhead of a compute-intensive partition by localizing its references relatively more than those of a smaller, non-compute-intensive partition. A partition interaction graph is then constructed, which is used by the architecture-dependent analysis phase to merge the partitions to achieve granularity adjustment, computation+communication load balance, and mapping onto the actual number of available processors. Relevant theory and algorithms are developed along with a performance evaluation on the Cray T3D.
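To make the computation+communication balance objective concrete, here is a tiny, generic cost-model sketch (the constants and partition data are assumptions, not from the paper): each partition's cost is its computation time plus its communication volume times a per-byte transfer cost, and the goal is to minimize the maximum per-partition total.

```python
# Toy cost model for computation + communication load balance.
# Machine rates and the example partitions are assumed values for illustration.

def partition_costs(partitions, bytes_per_second=1e8, flops_per_second=1e9):
    totals = []
    for comp_flops, comm_bytes in partitions:
        total = comp_flops / flops_per_second + comm_bytes / bytes_per_second
        totals.append(total)
    return totals

# (computation in flops, communication in bytes) per partition -- made-up numbers
parts = [(2e8, 1e6), (1.5e8, 8e6), (2.5e8, 2e5)]
costs = partition_costs(parts)
print(costs, "imbalance =", max(costs) / (sum(costs) / len(costs)))
```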
10.
Min Seung-Jai, Basumallik Ayon, Eigenmann Rudolf. International Journal of Parallel Programming, 2003, 31(3): 225-249
This paper describes compiler techniques that can translate standard OpenMP applications into code for distributed computer systems. OpenMP has emerged as an important model and language extension for shared-memory parallel programming. However, despite OpenMP's success on these platforms, it is not currently being used on distributed systems. The long-term goal of our project is to quantify the degree to which such use is possible and to develop supporting compiler techniques. Our present compiler techniques translate OpenMP programs into a form suitable for execution on a Software DSM system. We have implemented a compiler that performs this basic translation, and we have studied a number of hand optimizations that improve the baseline performance. Our approach complements related efforts that have proposed language extensions for efficient execution of OpenMP programs on distributed systems. Our results show that, while kernel benchmarks can show high efficiency of OpenMP programs on distributed systems, full applications need careful consideration of shared data access patterns. A naive translation (similar to OpenMP compilers for SMPs) leads to acceptable performance in very few applications only. However, additional optimizations, including access privatization, selective touch, and dynamic scheduling, result in a 31% average improvement on our benchmarks.
11.
This paper presents an efficient, writer-based logging scheme for recoverable distributed shared memory systems, in which logging of a data item is performed by its writer process rather than by every process that accesses the item. Since the writer process maintains the log of data items, volatile storage can be used for logging. Only the readers' access information needs to be logged into the stable storage of the writer process to tolerate multiple failures. Moreover, to reduce the frequency of stable logging, only the data items accessed by multiple processes are logged with their access information when the items are invalidated, and semantics-based optimization of the logging is also considered. Compared with earlier schemes in which stable logging was performed whenever a new data item was accessed or written by a process, the size of the log and the logging frequency can be significantly reduced in the proposed scheme.
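A minimal sketch of the writer-based idea described above, under assumed data structures (not the paper's implementation): the writer keeps data values in a volatile in-memory log and pushes only reader-access records to stable storage, and only when an item shared by several processes is invalidated.

```python
# Sketch of writer-based logging: data values stay in the writer's volatile log;
# only readers' access information is flushed to stable storage on invalidation.
# All structures here are assumptions for illustration.

class WriterLog:
    def __init__(self):
        self.volatile_log = {}        # item -> value, kept in memory only
        self.readers = {}             # item -> set of reader process ids
        self.stable_log = []          # stands in for a write to stable storage

    def on_write(self, item, value):
        self.volatile_log[item] = value

    def on_read(self, item, reader_id):
        self.readers.setdefault(item, set()).add(reader_id)

    def on_invalidate(self, item):
        # Stable logging only for items accessed by more than one process.
        readers = self.readers.get(item, set())
        if len(readers) > 1:
            self.stable_log.append((item, sorted(readers)))
        self.readers.pop(item, None)

log = WriterLog()
log.on_write("x", 42)
log.on_read("x", reader_id=1)
log.on_read("x", reader_id=2)
log.on_invalidate("x")
print(log.stable_log)   # [('x', [1, 2])]
```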
12.
A transparent distributed shared memory (DSM) system must achieve complete transparency in data distribution, workload distribution, and reconfiguration. Transparency of data distribution allows programmers to access and allocate shared data using the same user interface as in shared-memory systems. Transparency of workload distribution and reconfiguration can optimize parallelism at both the user level and the kernel level, and also improve the efficiency of run-time reconfiguration. In this paper, a transparent DSM system referred to as Teamster is proposed and implemented for clustered symmetric multiprocessors. With the transparency provided by Teamster, programmers can exploit all the computing power of the clustered SMP nodes as transparently as they would on a single SMP computer. Compared with the results of previous research, Teamster realizes the transparency of cluster computing and obtains satisfactory system performance.
13.
14.
Data races are a class of errors in shared-memory programs that are hard to debug. On JIAJIA, a software DSM system supporting the scope consistency memory model, a lockset-based dynamic data race detection algorithm is implemented, using assembly-code instrumentation to obtain the sets of shared variables read and written by the program. Using this method, data races were found in the TSP and Barnes programs, and based on the races found, an error in Barnes was corrected. Practical experience shows that the method is easy for users to apply and has reached a practical level.
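For reference, the classic lockset refinement idea is shown below generically in Python (this is not the JIAJIA implementation): each shared variable's candidate lockset is intersected with the locks held at every access, and an empty result flags a potential race.

```python
# Generic lockset (Eraser-style) refinement sketch -- not the JIAJIA implementation.

candidate_locks = {}   # shared variable -> set of locks held at every access so far

def on_access(var, locks_held):
    if var not in candidate_locks:
        candidate_locks[var] = set(locks_held)
    else:
        candidate_locks[var] &= set(locks_held)     # refine by intersection
    if not candidate_locks[var]:
        print(f"potential data race on {var!r}: no common lock protects it")

on_access("counter", {"L1"})
on_access("counter", {"L1", "L2"})   # still protected by L1
on_access("counter", {"L2"})         # intersection becomes empty -> warning
```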
15.
Balancing request load is a core problem that metadata management in distributed file systems must address. With the goal of maximizing the throughput of the metadata server cluster, a distributed cache framework is designed and implemented on top of the existing metadata management layer; it manages hot metadata specifically and balances the constantly changing load. Compared with existing metadata load balancing architectures, this two-level architecture is more flexible and more aware of the load, and it avoids damage to the metadata namespace structure caused by redistributing and migrating hot metadata. Observation and analysis show that metadata items are small and numerous, so the cost of prefetching the wrong metadata is far smaller than the cost of prefetching the wrong data. Based on these distinctive characteristics of metadata, a metadata prefetching strategy and a prefetch-based metadata cache replacement algorithm are proposed, strengthening the performance of the distributed cache layer; the two-level metadata load balancing framework also takes cache consistency into account. Finally, the effectiveness of the framework and the methods is verified on a real distributed file system.
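A minimal sketch of the prefetch-plus-replacement idea under assumed policies (not the paper's algorithms): on a metadata miss the cache also prefetches sibling entries, and replacement prefers to evict prefetched-but-never-hit entries before entries that have actually been accessed.

```python
from collections import OrderedDict

# Sketch only: an LRU-style metadata cache that prefetches sibling entries and,
# on eviction, discards never-used prefetched entries first. Policies are assumptions.

class MetadataCache:
    def __init__(self, capacity, backend):
        self.capacity = capacity
        self.backend = backend                   # path -> metadata, e.g. the MDS cluster
        self.entries = OrderedDict()             # path -> (metadata, was_prefetched)

    def _evict(self):
        for path, (_, prefetched) in self.entries.items():
            if prefetched:                       # cheap to be wrong about metadata
                del self.entries[path]
                return
        self.entries.popitem(last=False)         # otherwise evict the LRU entry

    def _insert(self, path, meta, prefetched):
        if len(self.entries) >= self.capacity:
            self._evict()
        self.entries[path] = (meta, prefetched)

    def get(self, path, siblings=()):
        if path in self.entries:
            meta, _ = self.entries.pop(path)
            self.entries[path] = (meta, False)   # mark as actually used, move to MRU
            return meta
        meta = self.backend[path]                # miss: fetch from the metadata servers
        self._insert(path, meta, prefetched=False)
        for sib in siblings:                     # speculatively prefetch siblings
            if sib not in self.entries and sib in self.backend:
                self._insert(sib, self.backend[sib], prefetched=True)
        return meta

backend = {"/a/f1": "inode1", "/a/f2": "inode2", "/a/f3": "inode3"}
cache = MetadataCache(capacity=2, backend=backend)
print(cache.get("/a/f1", siblings=["/a/f2", "/a/f3"]))
```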
16.
GSM: A Dynamic Load Balancing Scheduling Model for Multi-Server Systems
Consider a multi-server system consisting of n server nodes and one central task scheduling node, where all servers have the same processing capacity, tasks arrive as a Poisson stream with rate nλ, the service time of a task is exponentially distributed with mean 1/μ (λ < μ), and the time for one round of load-information collection and for task scheduling is, in the ideal case, negligible. Based on these assumptions, a Generalized Supermarket Model (GSM) for scheduling in multi-server systems is proposed, and the solution properties of this scheduling model are analyzed and proved in detail. The conclusions show that, for large-scale multi-server systems, the generalized supermarket model improves the expected task completion time exponentially compared with a random-selection scheduling policy and, compared with the results of [1, 2, 3], achieves their minimized expected task completion time.
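To illustrate the family of policies the GSM generalizes, here is a generic balls-into-bins demonstration of the classic supermarket model ("place each task on the least-loaded of d randomly probed servers"); this is not the GSM itself, and the parameters are made up.

```python
import random

# Generic balls-into-bins illustration of the supermarket / power-of-d-choices idea
# (not the GSM model itself); parameters are arbitrary.

def max_load(n_bins=1000, n_balls=1000, d=1, seed=0):
    rng = random.Random(seed)
    bins = [0] * n_bins
    for _ in range(n_balls):
        if d == 1:
            choice = rng.randrange(n_bins)                   # place at random
        else:
            candidates = rng.sample(range(n_bins), d)        # probe d bins
            choice = min(candidates, key=lambda b: bins[b])  # pick the least loaded
        bins[choice] += 1
    return max(bins)

print("random placement, max load:", max_load(d=1))   # typically noticeably higher
print("best of two,      max load:", max_load(d=2))   # typically 2-3
```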
17.
Dynamic Load Balancing Techniques for Distributed Data Stream Processing Systems
A new architecture for large-scale distributed data stream processing systems is designed. The system consists of a set of heterogeneous server clusters; within each cluster, the load is balanced across multiple homogeneous servers, so that the whole system becomes load balanced. One of the main goals of the cluster design is to trade resources for performance: the maximum number of servers in a cluster is sufficient to guarantee that the system no longer becomes overloaded, so load-shedding techniques, which would degrade performance, are no longer needed. Moreover, the number of servers put into operation is determined by the actual system load; when the load is light, some servers can enter a sleep state to reduce energy consumption. Exploiting the system's ability to dynamically add and remove servers, brand-new initialization and dynamic load balancing algorithms are designed. Compared with previous distributed data stream processing systems, because the number of servers in a single cluster is greatly reduced, the algorithms have lower complexity, run faster, and leave more room for optimization.
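A minimal sketch of the "servers in operation follow the load" idea (the per-server capacity and head-room factor are assumed values, not from the paper): the cluster keeps just enough servers awake to carry the current input rate with some slack, and lets the rest sleep.

```python
import math

# Sketch: decide how many servers of a cluster should be awake for the current load.
# per_server_capacity and the head-room factor are assumed values.

def servers_needed(load_tuples_per_s, per_server_capacity=50_000,
                   headroom=1.3, max_servers=64):
    wanted = math.ceil(load_tuples_per_s * headroom / per_server_capacity)
    return max(1, min(max_servers, wanted))

for load in (10_000, 120_000, 900_000):
    print(load, "tuples/s ->", servers_needed(load), "active servers")
```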
18.
Load balancing in distributed and parallel systems is an important factor affecting system performance. This paper proposes a prediction-based dynamic load balancing algorithm. Based on local load information, the algorithm predicts when a node will become idle and issues a task request before the node actually reaches the idle state, thereby keeping every node in the system busy, improving resource utilization, and improving system performance.
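A minimal sketch of the prediction step under assumed local statistics (not the paper's algorithm): estimate the time until the node drains its queue from the queue length and the average task service time, and issue a work request early enough to cover the request round trip.

```python
# Sketch: predict when this node runs out of work and request tasks ahead of time.
# avg_service_time and request_rtt are assumed, locally measured values.

def should_request_work(queue_length, avg_service_time, request_rtt):
    predicted_idle_in = queue_length * avg_service_time   # rough time-to-idle estimate
    return predicted_idle_in <= request_rtt               # ask before we actually idle

print(should_request_work(queue_length=2, avg_service_time=0.05, request_rtt=0.2))   # True
print(should_request_work(queue_length=50, avg_service_time=0.05, request_rtt=0.2))  # False
```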
19.
As an emerging NoSQL database, MongoDB is favored by markets at home and abroad for its schema-free design, document-oriented storage, automatic failure recovery, good horizontal scalability, and automatic load balancing. MongoDB's built-in load balancing policy can equalize the amount of data on each node. In real production environments, however, differences in data access heat between nodes can also cause load imbalance, and in particular overheated nodes. To address this problem, a Markov stochastic process is introduced and a load balancing strategy based on a Markov prediction model is proposed: the load of each shard is predicted from the steady-state probability vector of the Markov model, and data migration is performed accordingly. Experiments verify that, when some nodes among the shards become overheated, the proposed load balancing strategy can effectively balance the load across shards according to access hotspots.
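For illustration, a generic way to obtain the steady-state probability vector π of a Markov transition matrix P (satisfying πP = π) is power iteration; the sketch below then ranks shards by predicted access load. The transition matrix is made up, and this is neither MongoDB's nor the paper's code.

```python
import numpy as np

# Generic sketch: steady-state vector of a Markov chain by power iteration,
# used to rank shards by predicted access heat. The matrix P is made up.

def steady_state(P, iterations=1000, tol=1e-12):
    pi = np.full(P.shape[0], 1.0 / P.shape[0])     # start from the uniform distribution
    for _ in range(iterations):
        nxt = pi @ P                               # one step of pi * P
        if np.abs(nxt - pi).sum() < tol:
            break
        pi = nxt
    return pi

# Rows: currently hot shard; columns: probability the next access hits each shard.
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
pi = steady_state(P)
print("predicted load share per shard:", np.round(pi, 3))
print("hottest shard:", int(np.argmax(pi)))
```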