Similar Documents
20 similar documents found (search time: 31 ms)
1.
Coupling multiple computing nodes for transaction processing has become increasingly attractive for reasons of capacity, cost, and availability. This paper compares the robustness (in terms of performance) of three architectures for transaction processing. In the shared nothing (SN) architecture, neither disks nor memories are shared. In the shared disk (SD) architecture, all disks are accessible from all nodes, whereas in the shared intermediate memory (SIM) architecture, a shared intermediate level of memory is introduced. Coupling multiple nodes inevitably introduces interference and overhead, which take on different forms and magnitudes under the different architectures. Affinity clustering, which attempts to partition transactions into affinity clusters according to their database reference patterns, can be employed to reduce the coupling degradation under each architecture, though in different ways. However, the workload may not be partitionable into N affinity clusters of equal size, where N is the number of nodes in the coupled system, so the load may not spread evenly over all nodes. In addition to balancing the load, we need to keep a large fraction of data references within the database affiliated with each affinity cluster. Both goals become increasingly hard to achieve for large values of N. In this paper, we examine the impact of affinity on the performance of these three coupling architectures.

2.
A shared disk (SD) cluster couples multiple computing nodes for high-performance transaction processing, with all nodes sharing a common database at the disk level. In an SD cluster, a front-end router selects the node on which each incoming transaction is executed. Affinity-based routing can increase the buffer hit ratio of each node by clustering transactions that reference similar data onto the same node. However, affinity-based routing does not adapt to changes in system load: a specific node can become overloaded if its transactions rush into the system. In this paper, we propose a new transaction routing algorithm, named Dynamic Affinity Cluster Allocation (DACA). DACA strikes an optimal balance between affinity-based routing and indiscriminate load sharing in the SD cluster. As a result, DACA can increase the buffer hit ratio and reduce the frequency of inter-node buffer invalidations while achieving dynamic load balancing.
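The trade-off this abstract describes — prefer the affinity node, but fall back to plain load sharing when that node is saturated — can be sketched in a few lines. This is an invented illustration, not DACA itself; the `Router` class, the `max_queue` threshold, and hashing the affinity class to pick a preferred node are all assumptions.

```python
# Hypothetical sketch of affinity-based routing with a load cap.
# Not the DACA algorithm from the paper; names and thresholds are invented.

class Router:
    def __init__(self, num_nodes, max_queue=10):
        self.num_nodes = num_nodes
        self.max_queue = max_queue          # beyond this, ignore affinity
        self.queue = [0] * num_nodes        # outstanding transactions per node

    def route(self, affinity_class):
        # Preferred node: stable hash of the transaction's affinity class,
        # so transactions touching similar data reuse the same buffer pool.
        preferred = hash(affinity_class) % self.num_nodes
        if self.queue[preferred] < self.max_queue:
            target = preferred
        else:
            # Fall back to indiscriminate load sharing: least-loaded node.
            target = min(range(self.num_nodes), key=lambda n: self.queue[n])
        self.queue[target] += 1
        return target

    def complete(self, node):
        self.queue[node] -= 1
```

While a node stays under its cap, transactions of one class keep landing on it (high buffer hit ratio, few cross-node invalidations); once it saturates, new work spills to the least-loaded node.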

3.
A disk cluster environment (DCE) is a distributed architecture for high-performance transaction processing in which computing nodes are locally coupled via a high-speed network and share a common database at the disk level. In a DCE, it is crucial to determine at which node each incoming transaction is processed; this is called transaction routing. The aim of disk sharing in a DCE is not only to achieve high performance by distributing the workload among the processing nodes, but also to obtain fault tolerance against system failures, such as a single node failure. Although a number of transaction routing schemes have been proposed for the DCE, most are not sufficiently resilient against system dynamics, which inevitably require changes to the routing information. In this paper, we propose a new dynamic transaction routing scheme for the DCE, called the multicast transaction routing scheme (MTR for short), that can change the routing information in the presence of critical events without imposing excessive overhead on the transaction processing system. In our scheme, when the routing information must change dynamically, the routing algorithm sends multiple clones of a transaction to a group of candidate processing nodes and selects the node that completes the multicast transaction first as the new processing node for the re-routed transaction. The selected node is expected to be the best-affinity node when the system load is evenly distributed, or a relatively unloaded node that is idle enough to process the transaction faster than the other nodes. The novel aspect of MTR is that it automatically achieves an optimal balance between affinity-based routing and load balancing. A simulation study shows that MTR rapidly stabilizes the system and produces optimal routing information, ultimately guaranteeing faster response times.

4.
Energy management for cluster architectures has become an important design issue. In this paper, we propose a dynamic reconfiguration algorithm, named DRA-SD, to reduce the energy consumption of a real-time shared disk (SD) cluster. DRA-SD consolidates the cluster load on a subset of nodes as long as the quality of service (QoS) is met; the remaining nodes are deactivated so that they can stay in a low-power state. When the load increases again, DRA-SD dynamically activates additional nodes. Unlike previous algorithms proposed for web server clusters, DRA-SD exploits the inherent characteristics of the SD cluster to reduce inter-node interference and to improve the processing capacity of a given cluster configuration. This enables DRA-SD to meet the QoS constraint while consuming minimal energy. Experimental results show that DRA-SD saves significant energy under a wide variety of transaction workloads and node characteristics.
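The consolidation idea in this abstract — keep just enough nodes active to carry the offered load with some headroom for the QoS target, and park the rest — reduces to a small sizing function. This is a minimal sketch under invented assumptions (uniform per-node capacity, a `headroom` factor standing in for the QoS constraint); the real DRA-SD policy is surely more involved.

```python
import math

def active_nodes_needed(load, capacity_per_node, max_nodes, headroom=1.2):
    """How many nodes to keep active for the current load.
    `headroom` over-provisions capacity so QoS survives small load spikes;
    the factor 1.2 is illustrative, not taken from the paper."""
    needed = max(1, math.ceil(load * headroom / capacity_per_node))
    return min(needed, max_nodes)   # never exceed the physical cluster size
```

A controller would call this periodically, deactivating nodes when the result shrinks and waking nodes up when it grows.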

5.
The development of database systems with hierarchical hardware architectures is currently a promising trend in the field of parallel database machines. Hierarchical architectures have been suggested with the aim of combining the advantages of shared-nothing architectures and architectures with shared memory and disks. A commonly accepted way of constructing hierarchical systems is to combine shared-memory (shared-everything) clusters into a single system without shared resources. However, such architectures cannot ensure data accessibility under hardware failures at the processor-cluster level, which limits their use in systems with high fault-tolerance requirements. In this paper, an alternative approach to constructing hierarchical systems is suggested: the system is assembled from processor clusters with shared disks, each cluster being a two-level multiprocessor structure with a standard strongly connected topology of interprocessor connections. A stream model for organizing parallel query processing in systems with the suggested hierarchical architecture is described. This model has been implemented in Omega, a prototype parallel database management system designed for the Russian multiprocessor computational systems MBC-100/1000. Our experiments show that the total performance of the processor clusters in the Omega system is comparable to that of processor clusters with shared resources, even under severe data skew. At the same time, the clusters of the Omega system can ensure a higher degree of data availability than clusters with shared-memory architectures.

6.
We present a highly available system for environments such as stock trading, where high request rates and low latency requirements dictate that even seconds of service disruption can be unacceptable. After a node failure, our system avoids processing delays due to detecting the failure or transferring control to a back-up node. We achieve this by using multiple primary nodes which process transactions concurrently as peers. If a primary node fails, the remaining primaries continue executing without being delayed at all by the failed primary. Nodes agree on a total ordering for processing requests with a novel low-overhead wait-free algorithm that utilizes a small amount of shared memory accessible to the nodes and a simple compare-and-swap-like protocol which allows the system to progress at the speed of the fastest node. We have implemented our system on an IBM z990 zSeries eServer mainframe and show experimentally that our system performs well and can transparently handle node failures without causing delays to transaction processing. The efficient implementation of our algorithm for ordering transactions is a critically important factor in achieving good performance.
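The core agreement mechanism this abstract describes — peer primaries claiming total-order positions through compare-and-swap on a small shared word — can be illustrated with a toy sequencer. This is not the paper's z990 protocol (the abstract gives no details); the lock simulating hardware atomicity and the retry loop (which is lock-free rather than strictly wait-free) are simplifications.

```python
import threading

class SharedCounter:
    """Stands in for the small shared-memory word the primaries CAS on.
    A Python lock simulates the hardware's atomic compare-and-swap."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        with self._lock:                 # atomic: compare and update together
            if self._value == expected:
                self._value = new
                return True
            return False

    def read(self):
        with self._lock:
            return self._value

def next_sequence(counter):
    """Claim the next position in the total order.
    Retry CAS until this node wins the slot; no node ever blocks another,
    so the system progresses at the speed of the fastest node."""
    while True:
        current = counter.read()
        if counter.compare_and_swap(current, current + 1):
            return current               # this node's total-order position
```

Because each CAS success hands out a distinct, gapless sequence number, every primary processes requests in the same agreed order, and a crashed primary simply stops claiming slots without delaying its peers.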

7.
To support a global virtual memory space, an architecture must translate virtual addresses dynamically. In current processors, the translation is done in a TLB (translation lookaside buffer), before or in parallel with the first-level cache access. As processor technology improves at a rapid pace and the working sets of new applications grow insatiably, the latency and bandwidth demands on the TLB are difficult to meet, especially in multiprocessor systems, which run larger applications and are plagued by the TLB consistency problem. We describe and compare five options for virtual address translation in the context of distributed shared memory (DSM) multiprocessors, including CC-NUMAs (cache-coherent non-uniform memory access architectures) and COMAs (cache-only memory access architectures). In CC-NUMAs, moving the TLB to shared memory is a bad idea because page placement, migration, and replication are all constrained by the virtual page address, which greatly affects processor node access locality. In COMAs, the allocation of pages to processor nodes is not as critical because memory blocks can dynamically migrate and replicate freely among nodes. As the address translation is done deeper in the memory hierarchy, the frequency of translations drops because of the filtering effect. We also observe that the TLB is very effective when merged with the shared memory, because of the sharing and prefetching effects and because there is no need to maintain TLB consistency. Even though the effectiveness of the TLB merged with the shared memory is very high, we also show that the TLB can be removed in a system where address translation is done in memory, because the frequency of translations is very low.

8.
持久性内存(persistmemory,PM)具有非易失、字节寻址、低时延和大容量等特性,打破了传统内外存之间的界限,对现有软件体系结构带来颠覆性影响.但是,当前PM硬件还存在着磨损不均衡、读写不对称等问题,特别是当跨NUMA(nonuniformmemoryaccess)节点访问PM时,存在着严重的I/O性能衰减问题.提出了一种NUMA感知的PM存储引擎优化设计,并应用到中兴新一代数据库系统GoldenX中,显著降低了数据库系统跨NUMA节点访问持久内存的开销.主要创新点包括:提出了一种DRAM+PM混合内存架构下跨NUMA节点的数据空间分布策略和分布式存取模型,实现了PM数据空间的高效使用;针对跨NUMA访问PM的高开销问题,提出了I/O代理例程访问方法,将跨NUMA访问PM开销转化为一次远程DRAM内存拷贝和本地访问PM的开销,设计了Cache Line Area (CLA)缓存页机制,缓解了I/O写放大问题,提升了本地访问PM的效率;扩展了传统表空间概念,让每个表空间既拥有独立的表数据存储,也拥有专门的WAL (write-ahead logging)日志存储,针对该分布式WA...  相似文献   

9.
This paper presents a new distributed disk-array architecture for achieving high I/O performance in scalable cluster computing. In a serverless cluster of computers, all distributed local disks can be integrated as a distributed-software redundant array of independent disks (ds-RAID) with a single I/O space. We report the new RAID-x design and its benchmark performance results. The advantage of RAID-x comes mainly from its orthogonal striping and mirroring (OSM) architecture. Bandwidth is enhanced by distributed striping across local and remote disks, while reliability comes from orthogonal mirroring on local disks in the background. Our RAID-x design is experimentally compared with RAID-5, RAID-10, and chained-declustering RAID through benchmarking on a research Linux cluster at USC. Andrew and Bonnie benchmark results are reported for all four disk-array architectures. Cooperative disk drivers and Linux extensions are developed to enable not only the single I/O space but also shared virtual memory and a global file hierarchy. We reveal the effects of traffic rate and stripe unit size on I/O performance. Through scalability and overhead analysis, we find the strength of RAID-x in three areas: 1) improved aggregate I/O bandwidth, especially for parallel writes; 2) orthogonal mirroring with low software overhead; and 3) enhanced scalability in cluster I/O processing. Architectural strengths and weaknesses of all four ds-RAID architectures are evaluated comparatively. The optimal choice among them depends on the parallel read/write performance desired, the level of fault tolerance required, and cost-effectiveness in specific I/O processing applications.
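The "orthogonal" part of OSM can be pictured with a simple placement function: data units of a stripe spread across all disks but one, while their mirror images are clustered on the remaining disk, which rotates from stripe to stripe. This is only an illustration of the orthogonality idea, assuming a rotation rule I invented; the paper's actual RAID-x layout may differ in detail.

```python
def osm_layout(num_disks, num_stripes):
    """Simplified sketch of orthogonal striping and mirroring.
    Returns {(stripe, unit): (data_disk, mirror_disk)}.
    Invariant: no block ever shares a disk with its own mirror,
    and mirror writes are sequential background writes to one disk."""
    layout = {}
    for s in range(num_stripes):
        mirror_disk = s % num_disks        # rotates to balance mirror traffic
        data_disks = [d for d in range(num_disks) if d != mirror_disk]
        for unit, d in enumerate(data_disks):
            layout[(s, unit)] = (d, mirror_disk)
    return layout
```

Parallel writes of one stripe hit `num_disks - 1` different disks at once, while all its mirrors land on a single disk as low-overhead background writes, which is where the paper's aggregate-bandwidth advantage for parallel writes comes from.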

10.
Designing multiprocessors based on a distributed shared memory (DSM) architecture considerably increases their scalability. But as the number of nodes in a multiprocessor increases, the probability of encountering failures in one or more nodes of the system emerges as a serious problem. Thus, every large-scale multiprocessor should be equipped with mechanisms that tolerate node failures. Backward error recovery (BER) is one of the most feasible strategies for building fault-tolerant multiprocessors, and it can be shown that among the various DSM-based architectures, cache-only memory architecture (COMA) is the most suitable for implementing BER. The main reason is the existence of built-in mechanisms for data replication in the COMA memory system. BER is applicable to COMA multiprocessors with minor hardware redundancy, but it obviously causes other kinds of overhead. The most important overhead induced by BER is the time required to produce and store recovery data. This paper introduces an analytical model for predicting this time overhead and verifies the model's correctness by comparing its predictions with previously published simulation results. Both the analytical model and the simulation results show that the overhead is nearly independent of the number of nodes. The immediate result is that BER is a cost-effective strategy for tolerating node failures in large-scale COMA multiprocessors with large numbers of nodes.

11.
《Real》1996,2(6):383-392
Image processing applications require both computing and communication power. The aim of the GFLOPS project was to study all aspects of the design of such computers: to develop a parallel architecture as well as its software environment to implement those applications efficiently. The proposed architecture supports up to 512 processor nodes, connected over a scalable and cost-effective network at a constant cost per node. The first prototype implementation, running since the beginning of 1995, has shown that a parallel system can be both scalable and programmable through the use of a virtually shared memory paradigm, physically implemented with atomic message passing. GFLOPS-2 is a single-user machine designed to be used as a low-cost parallel co-processor board in a desktop workstation. In this paper we discuss the design of the GFLOPS-2 machine and evaluate the effectiveness of the mechanisms incorporated. The analysis of the architecture's behaviour was conducted with image processing and general-purpose algorithms, written in C and assembly language, through execution-driven simulations. A development environment, notably a C data-parallel language, has been built for this purpose.

12.
In large-scale data-intensive systems based on event streams, data fall into two categories: event-stream data and event-configuration data, where the configuration data express the rules applied to the event streams. Under a shared-nothing architecture, configuration data are generally fully replicated across all database nodes for joint queries with event-stream data. With fully replicated configuration data, every update must be performed on all nodes, so data consistency control and multi-node transaction processing become the key problems in managing such data. This paper analyzes the characteristics of configuration data and strategies for managing it, and successfully implements configuration-data consistency control in the DBroker system.

13.
Distributed shared memory for roaming large volumes (total citations: 1; self-citations: 0; citations by others: 1)
We present a cluster-based volume rendering system for roaming very large volumes. The system allows a gigabyte-sized probe to be moved inside a total volume of several tens or hundreds of gigabytes in real time. While the size of the probe is limited by the total amount of texture memory on the cluster, the size of the total data set has no theoretical limit. The cluster is used as a distributed graphics processing unit that aggregates both graphics power and graphics memory. A hardware-accelerated volume renderer runs in parallel on the cluster nodes, and the final image compositing is implemented using a pipelined sort-last rendering algorithm. Meanwhile, volume bricking and volume paging allow efficient data caching. On each rendering node, a distributed hierarchical cache system implements a global software-based distributed shared memory on the cluster. In case of a cache miss, this system first checks page residency on the other cluster nodes instead of directly accessing local disks. Using two Gigabit Ethernet network interfaces per node, we accelerate data fetching by a factor of 4 compared to direct local-disk access. The system also implements asynchronous disk access and texture loading, which makes it possible to overlap data loading, volume slicing, and rendering for optimal volume roaming.

14.
A node-load-based dynamic data partitioning system is presented, which mainly considers the CPU, memory, and bandwidth load of each node. It first predicts node load with second-order exponential smoothing, then combines AHP with an entropy-based index weighting method to derive each node's processing capacity, and finally adjusts the system's load balance dynamically for different application scenarios to improve application response time. The system comprises modules for load monitoring and collection, load prediction, data pre-partitioning, and data migration. Because node resources are heterogeneous in a distributed environment, reducing inter-node data transfer during analysis and computation and making full use of each node's compute resources through load balancing speeds up parallel analysis. To this end, this paper proposes a node-load-based dynamic data partitioning mechanism and policy to improve system load balance and application response time, helping practitioners reach decisions. The approach is evaluated on a data-analysis application scenario integrating Spark and Elasticsearch.
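The load predictor named in this abstract, second-order (double) exponential smoothing, is a standard forecasting method and can be written down directly. The code below is a generic textbook implementation, not code from the system; the smoothing constant `alpha` is illustrative.

```python
def double_exp_smooth_forecast(series, alpha=0.3, horizon=1):
    """Brown's second-order exponential smoothing forecast.
    `series` is the observed node load (e.g. CPU utilization samples);
    returns the predicted load `horizon` steps ahead."""
    s1 = s2 = series[0]                    # initialize both smoothed series
    for x in series[1:]:
        s1 = alpha * x + (1 - alpha) * s1  # first smoothing
        s2 = alpha * s1 + (1 - alpha) * s2 # second smoothing
    a_t = 2 * s1 - s2                      # estimated level
    b_t = alpha / (1 - alpha) * (s1 - s2)  # estimated trend per step
    return a_t + horizon * b_t
```

Unlike a plain moving average, the second smoothing pass captures the trend term `b_t`, so a node whose load is climbing is flagged as heavily loaded before it actually saturates, which is what makes the prediction useful for proactive repartitioning.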

15.
Content delivery networks (CDN) and peer-to-peer (P2P) networks are analyzed and compared, and their respective strengths and weaknesses identified. Exploiting the features of P4P (proactive network provider participation in P2P), a design is given for a hybrid system combining P4P, P2P, and CDN, together with a selection algorithm for the nodes that assist CDN nodes in content distribution ("pseudo-CDN" nodes). The algorithm uses the network information that P4P obtains from the operator to choose suitable edge nodes, which contribute their capacity and bandwidth to serve other nodes, thereby reducing the number of edge proxy servers the system needs, increasing system capacity, and relieving load on the network backbone. Simulations analyze the improvement in link cost and time cost once the underlying network is taken into account; the results show that the algorithm reduces cross-ISP traffic and improves system performance.

16.
Encryption schemes for wireless sensor networks are studied. First, a sensor-node deployment strategy based on hexagonal grid groups is proposed to obtain an optimal distribution of sensor nodes. Second, a key space is defined as the set of all keys generated from a bivariate polynomial f(x, y) of degree t under the Blundo model. Key material is then distributed to each node in a key pre-distribution phase, and in the direct key establishment phase each sensor node finds the key spaces it shares with its neighboring nodes. If two neighboring nodes share no key space, one or more intermediate nodes establish a path key in the indirect key establishment phase, completing the establishment of shared keys. Simulation results show that the proposed symmetric key pre-distribution model not only performs well cryptographically but also outperforms other key schemes in memory overhead, running time, and network resilience under node-capture attacks.
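The Blundo construction the abstract builds on rests on one algebraic fact: if the bivariate polynomial is symmetric, f(x, y) = f(y, x), then node u holding the share f(u, ·) and node v holding f(v, ·) independently compute the same pairwise key f(u, v) = f(v, u). A minimal sketch over a prime field (the field size, degree, and seed below are illustrative, not from the paper):

```python
import random

def make_symmetric_poly(t, p, rng=random.Random(42)):
    """Coefficient matrix c with c[i][j] == c[j][i] over GF(p),
    so f(x, y) = sum c[i][j] * x^i * y^j satisfies f(x, y) == f(y, x)."""
    c = [[0] * (t + 1) for _ in range(t + 1)]
    for i in range(t + 1):
        for j in range(i, t + 1):
            c[i][j] = c[j][i] = rng.randrange(p)
    return c

def f(coeffs, x, y, p):
    """Evaluate the bivariate polynomial mod p."""
    return sum(cij * pow(x, i, p) * pow(y, j, p)
               for i, row in enumerate(coeffs)
               for j, cij in enumerate(row)) % p

def node_key(coeffs, my_id, peer_id, p):
    # In a real deployment a node stores only its univariate share f(u, .),
    # never the full coefficient matrix; evaluating the share at the peer's
    # id yields the pairwise key, identical on both sides by symmetry.
    return f(coeffs, my_id, peer_id, p)
```

The degree t is also the resilience threshold of the scheme: an adversary must capture more than t nodes' shares before any uncompromised pairwise key can be reconstructed.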

17.
As warehouse data volumes expand, single-node solutions can no longer analyze the immense volume of data, so shared-nothing architectures such as MapReduce become necessary. Inter-node data segmentation in MapReduce creates node-connectivity issues, network congestion, improper use of node memory capacity, and inefficient use of processing power. In addition, dimensions and measures cannot be changed without changing previously stored data, and big dimensions are hard to manage. In this paper, a method called Atrak is proposed, which uses a unified data format to make Mapper nodes independent, solving the data management problems mentioned earlier. The proposed method applies to star-schema data warehouse models with distributive measures. Atrak increases query execution speed by exploiting node independence and the proper use of MapReduce. The proposed method was compared to established methods such as Hive, Spark-SQL, HadoopDB, and Flink; simulation results confirm its improved query execution speed. Data unification in MapReduce could also be applied in other fields, such as data mining and graph processing.

18.
Distributed Shared-Memory (DSM) systems are shared-memory multiprocessor architectures in which each processor node contains a partition of the shared memory. In hybrid DSM systems, coherence among caches is maintained by a software-implemented coherence protocol relying on some hardware support: hardware satisfies every node hit (the common case), and software is invoked only for accesses to remote nodes. In this paper we compare the design and performance of four hybrid DSM organizations by detailed simulation of the same hardware platform. We have implemented the software protocol handlers for the four architectures; the handlers are written in C and assembly code, and coherence transactions execute in trap and interrupt handlers. Together with the application, the handlers are executed in full detail in execution-driven simulations of six complete benchmarks with coarse-grain and fine-grain sharing. We relate our experience implementing and simulating the software protocols for the four architectures. Because the overhead of remote accesses is very high in hybrid systems, the system of choice differs from that of purely hardware-based systems.

19.
Processor-embedded disks, or smart disks, with their network interface controllers, can in effect be viewed as processing elements with on-disk memory and secondary storage. The data sizes and access patterns of today's large I/O-intensive workloads require architectures whose processing power scales with increased storage capacity. To address this concern, we propose and evaluate disk-based distributed smart storage architectures. Based on analytically derived performance models, our evaluation with representative workloads shows that offloading processing and performing point-to-point data communication improve performance over centralized architectures. Our results also demonstrate that distributed smart disk systems exhibit desirable scalability and can efficiently handle I/O-intensive workloads, such as commercial decision support database (TPC-H) queries, association rules mining, data clustering, and two-dimensional fast Fourier transform, among others.

20.
BW-VSDS, a large-capacity, scalable, high-performance, and highly reliable network virtual storage system, is designed and implemented. Compared with other network storage systems, it has the following features: 1) a virtual storage management architecture with in-band metadata management and out-of-band data access, making storage management more flexible and the system more scalable; 2) within a single node, storage virtualization over multiple virtual volumes, multiple virtual pools, and multiple network storage devices reconstructs network virtual storage devices for a variety of storage applications, realizing a three-level hierarchical storage virtualization model that shares device capacity and bandwidth internally while exporting virtual disks with different properties; 3) allocate-on-write improves storage space utilization, and data-block reorganization improves I/O read/write performance; 4) layered virtual snapshots, implemented with device lists and bitmaps, support incremental snapshots, copy-on-write, and redirect-on-write, enabling data sharing between source volumes and snapshot volumes; 5) a back-end centralized out-of-band redundancy management structure combined with out-of-band storage virtualization management: reads and writes go directly to the storage nodes, while the redundancy management node logs data mirror-written from the storage nodes to disk and performs RAID5 redundancy computation in the background, improving the reliability of active data and reducing the impact of redundancy computation on write performance.
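The bitmap-tracked copy-on-write snapshot in feature 4 has a simple core: on the first write to a block after a snapshot, preserve the old block and flip its bitmap bit; later reads of the snapshot consult the bitmap to decide between the preserved copy and the live volume. The toy class below illustrates only that mechanism; the class name and in-memory representation are invented, and BW-VSDS's layered, incremental snapshots are far richer.

```python
class CowSnapshot:
    """Toy copy-on-write snapshot over a block volume.
    `bitmap[i]` is True once block i has been preserved, so each block
    pays the copy cost at most once per snapshot."""
    def __init__(self, volume):
        self.volume = volume                    # live volume: list of blocks
        self.saved = {}                         # block index -> original data
        self.bitmap = [False] * len(volume)

    def write(self, idx, data):
        if not self.bitmap[idx]:                # first write since snapshot:
            self.saved[idx] = self.volume[idx]  # copy the old block aside
            self.bitmap[idx] = True
        self.volume[idx] = data                 # then update in place

    def read_snapshot(self, idx):
        # Snapshot view: preserved copy if the block changed, else live data.
        return self.saved[idx] if self.bitmap[idx] else self.volume[idx]
```

Because unchanged blocks are read straight from the live volume, source and snapshot volumes share data, which is the space saving the abstract attributes to its snapshot design; redirect-on-write differs only in sending the *new* data elsewhere instead of copying the old.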
