首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
Geographically replicating popular objects in the Internet speeds up content distribution at the cost of keeping the replicas consistent and up-to-date. The overall effectiveness of replication can be measured by the total communication cost consisting of client accesses and consistency management, both of which depend on the locations of the replicas. This paper investigates the problem of placing replicas under the widely used TTL-based consistency scheme. A polynomial-time algorithm is proposed to compute the optimal placement of a given number of replicas in a network. The new replica placement scheme is compared, using real Internet topologies and Web traces, against two existing approaches which do not consider consistency management or assume invalidation-based consistency scheme. The factors affecting their performance are identified and discussed  相似文献   

2.
The expiration-based scheme is widely used to manage the consistency of cached and replicated contents such as Web objects. In this approach, each replica is associated with an expiration time beyond which the replica has to be validated. While the expiration-based scheme has been investigated in the context of a single replica, not much work has been done on its behaviors with respect to multiple replicas. To allow for efficient consistency management, it is desirable to organize the replicas into a distribution tree where a lower level replica seeks validation with a higher level replica when its lifetime expires. This paper investigates the construction of a distribution tree for a given set of replicas with the objective of minimizing the total communication cost of consistency management. This is formulated as an optimization problem and is proven to be NP-complete. The optimal distribution tree is identified in some special cases and several heuristic algorithms are proposed for the general problem. The performance of the heuristic algorithms is experimentally evaluated against two classical graph-theoretic algorithms of tree construction: the shortest-paths tree and the minimum spanning tree.  相似文献   

3.
Due to the large difference between seek time and transfer time in current disk technology, it is advantageous to perform large I/O using a single sequential access rather than multiple small random I/O accesses. However, prior optimal cost and data placement approaches for processing range queries over two-dimensional datasets do not consider this property. In particular, these techniques do not consider the issue of sequential data placement when multiple I/O blocks need to be retrieved from a single device. In this paper, we reevaluate the optimal cost of range queries by declustering two-dimensional datasets over multiple devices, and prove that, in general, it is impossible to achieve the new optimal cost. This is because disks cannot facilitate two-dimensional sequential access which is required by the new optimal cost. Then we revisit the existing data allocation schemes under the new optimal cost, and show that none of them can achieve the new optimal cost. Fortunately, MEMS-based storage is being developed to reduce I/O cost. We first show that the two-dimensional sequential access requirement can not be satisfied by simply modeling MEMS-based storage as conventional disks. Then we propose a new placement scheme that exploits the physical properties of MEMS-based storage to solve this problem. Our theoretical analysis and experimental results show that the new scheme achieves almost optimal I/O costs. Recommended by: Sunil Prabhakar This research is supported by the NSF grants under IIS-0220152 and CNF-0423336.  相似文献   

4.
结构化P2P系统通常使用数据复制来提高数据可用性,但P2P环境中的节点搅动、多节点并发更新以及恶意节点的存在也为副本的一致性管理带来了新的挑战.基于协商的算法要求节点间以全交换的方式通讯,在P2P环境中其可伸缩性不够理想.本文针对结构化P2P系统提出一种基于Quorum的副本管理算法:使用混合失效模型降低容错开销,利用DHT服务处理节点搅动,将数据存储与其元信息管理分离,使数据可靠性和数据可用性得以独立调整.模拟实验表明该算法可以明显改善系统的可伸缩性,减少系统的容错开销.  相似文献   

5.
Secure Data Objects Replication in Data Grid   总被引:1,自引:0,他引:1  
Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data fragmentation approaches can be combined with dynamic replication. In this paper, we consider data partitioning (both secret sharing and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. More specifically, we investigate the problem of optimal allocation of sensitive data objects that are partitioned by using secret sharing scheme or erasure coding scheme and/or replicated. The grid topology we consider consists of two layers. In the upper layer, multiple clusters form a network topology that can be represented by a general graph. The topology within each cluster is represented by a tree graph. We decompose the share replica allocation problem into two subproblems: the Optimal Intercluster Resident Set Problem (OIRSP) that determines which clusters need share replicas and the Optimal Intracluster Share Allocation Problem (OISAP) that determines the number of share replicas needed in a cluster and their placements. We develop two heuristic algorithms for the two subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and are close to optimal solutions.  相似文献   

6.
We propose and analyze an adaptive per-user per-object cache consistency management (APPCCM) scheme for mobile data access in wireless mesh networks. APPCCM supports strong data consistency semantics through integrated cache consistency and mobility management. The objective of APPCCM is to minimize the overall network cost incurred due to data query/update processing, cache consistency management, and mobility management. In APPCCM, data objects can be adaptively cached at the mesh clients directly or at mesh routers dynamically selected by APPCCM. APPCCM is adaptive, per-user and per-object as the decision regarding where to cache a data object accessed by a mesh client is made dynamically, depending on the mesh client’s mobility and data query/update characteristics, and the network’s conditions. We develop analytical models for evaluating the performance of APPCCM and devise a computational procedure for dynamically calculating the overall network cost incurred. We demonstrate via both model-based analysis and simulation validation that APPCCM outperforms non-adaptive cache consistency management schemes that always cache data objects at the mesh client, or at the mesh client’s current serving mesh router for mobile data access in wireless mesh networks.  相似文献   

7.
This paper studies a challenging problem of cache placement in wireless multi-hop ad hoc networks. More specifically, we study how to achieve an optimal tradeoff between total access delay and caching overheads, by properly selecting a subset of wireless nodes as cache nodes when the network topology changes. We assume a data source updates a data item to be accessed by other client nodes. Most of the existing cache placement algorithms use hop counts to measure the total cost of a caching system, but hop delay in wireless networks varies much due to the contentions among these nodes and the traffic load on each link. Therefore, we evaluate the per-hop delay for each link according to the contentions detected by a wireless node from the MAC layer. We propose two heuristic cache placement algorithms, named Centralized Contention-aware Caching Algorithm (CCCA) and Distributed Contention-aware Caching Algorithm (DCCA), both of which detect the variation of contentions and the change of the traffic flows, in order to evaluate the benefit of selecting a node as a cache node. We also apply a TTL-based cache consistency strategy to maintain the delta consistency among all the cache nodes. Simulation results show that the proposed algorithms achieve better performance than other alternative ones in terms of average query delay, caching overheads, and query success ratio.  相似文献   

8.
在Device-to-Device (D2D)缓存网络中,缓存文件的副本数量是制约系统缓存效率的重要因素,过多的副本会导致缓存资源不能得到充分利用,副本数过低又将使流行文件难以被有效获取。针对D2D缓存网络副本布设问题,以系统缓存命中率最大化为目标,利用凸规划理论,提出了一种缓存文件副本数布设算法(CRP)。仿真结果显示,与现有副本数量布设算法相比,该算法可以有效提升D2D缓存网络总体缓存命中率。  相似文献   

9.
To meet the challenges of consistent performance, low communication latency, and a high degree of user mobility, cloud and Telecom infrastructure vendors and operators foresee a Mobile Cloud Network that incorporates public cloud infrastructures with cloud augmented Telecom nodes in forthcoming mobile access networks. A Mobile Cloud Network is composed of distributed cost- and capacity-heterogeneous resources that host applications that in turn are subject to a spatially and quantitatively rapidly changing demand. Such an infrastructure requires a holistic management approach that ensures that the resident applications’ performance requirements are met while sustainably supported by the underlying infrastructure. The contribution of this paper is three-fold. Firstly, this paper contributes with a model that captures the cost- and capacity-heterogeneity of a Mobile Cloud Network infrastructure. The model bridges the Mobile Edge Computing and Distributed Cloud paradigms by modelling multiple tiers of resources across the network and serves not just mobile devices but any client beyond and within the network. A set of resource management challenges is presented based on this model. Secondly, an algorithm that holistically and optimally solves these challenges is proposed. The algorithm is formulated as an application placement method that incorporates aspects of network link capacity, desired user latency and user mobility, as well as data centre resource utilisation and server provisioning costs. Thirdly, to address scalability, a tractable locally optimal algorithm is presented. The evaluation demonstrates that the placement algorithm significantly improves latency, resource utilisation skewness while minimising the operational cost of the system. Additionally, the proposed model and evaluation method demonstrate the viability of dynamic resource management of the Mobile Cloud Network and the need for accommodating rapidly mobile demand in a holistic manner.  相似文献   

10.
Scalable shared-memory multiprocessor systems are typically NUMA (nonuniform memory access) machines, where the exploitation of the memory hierarchy is critical to achieving high performance. Iterative data parallel loops with near-neighbor communication account for many important numerical applications. In such loops, the communication of partial results stresses the memory system performance. In this paper, we develop data placement schemes that minimize communication time where the near-neighbor interaction is determined by a stencil. Under a given loop partition, our compile-time algorithm partitions global data into four classes for each processor, with each class requiring specific consistency maintenance requirements. The ADAPT (Automatic Data Allocation and Partitioning Tool) system was implemented to automatically partition parallel code segments for the BBN TC2000, a scalable shared-memory multiprocessor. ADAPT caches global arrays and maintains data consistency in software through instructions that flush data from private caches. Restructuring of a fluid flow code segment by ADAPT improved performance by a factor of more than 3 on the BBN TC2000. Features in current generation pipelined processors with multiple functional units permit the overlap of memory accesses with computation. Our experiments on the BBN TC2000 show that the degree of overlap is limited by architectural parameters, such as the number of CPU registers.  相似文献   

11.
Replica Placement Strategies in Data Grid   总被引:1,自引:0,他引:1  
Replication is a technique used in Data Grid environments that helps to reduce access latency and network bandwidth utilization. Replication also increases data availability thereby enhancing system reliability. The research addresses the problem of replication in Data Grid environment by investigating a set of highly decentralized dynamic replica placement algorithms. Replica placement algorithms are based on heuristics that consider both network latency and user requests to select the best candidate sites to place replicas. Due to dynamic nature of Grid, the candidate site holds replicas currently may not be the best sites to fetch replicas in subsequent periods. Therefore, a replica maintenance algorithm is proposed to relocate replicas to different sites if the performance metric degrades significantly. The study of our replica placement algorithms is carried out using a model of the EU Data Grid Testbed 1 [Bell et al. Comput. Appl., 17(4), 2003] sites and their associated network geometry. We validate our replica placement algorithms with total file transfer times, the number of local file accesses, and the number of remote file accesses.  相似文献   

12.
数据副本管理是云计算系统管理的重要组成部分,在云计算系统的海量数据处理过程中,针对目前已知的数据存放与资源调度算法存在考虑副本动态性和可靠性的不足,提出了一种动态的副本放置机制。该机制基于区域结构,考虑数据处理时其副本的数量和放置位置,以及副本的产生对于内存和带宽等系统资源的开销:首先根据云存储中的副本信息,对被访问频率高且访问平均响应时间长的数据信息进行复制,并给出副本数量的计算方法;考虑缩小副本分布的节点选择范围,提出动态的副本放置算法——DRA,将一定范围内的节点根据提出的域的划分,进行放置筛选,以存放数据副本。实验结果表明,提出的动态放置机制不仅减少了低访问率副本对系统存储空间的浪费;同时也减少了高访问率副本所需跨节点的传输延迟,有效提高了云存储系统中的数据文件的访问效率、负载的均衡水平,以及云存储系统的可靠性和可用性。  相似文献   

13.
在网络中定位最优复制以最小化通讯代价。假定网络采用read-one-write-all策略来保证网络数据一致性,那么存在一个决定复制定位的最优化问题。提出了研究复制问题中读、写比率以确定最优化通讯代价。问题可转换成一个0- 1线性规划问题,并将此问题扩展为一个P中值问题,可以证明这个问题是NP-complete的问题,并提出了一种多项式时间内的此问题求解算法。  相似文献   

14.
VOD服务器集群中的改进SLF存储调度策略   总被引:2,自引:0,他引:2  
在VOD服务器集群中,存储调度策略是影响整个系统存储容量和总并发数的关键技术之一.针对现有存储调度策略中最小负载优先(SLF)副本放置算法调整代价过高的问题,提出了一种改进SLF算法.算法以最小化负载不平衡度和最小化副本调整代价为目标,在放置过程中充分利用当前已经存储的副本,降低副本调整的代价.仿真实验表明,基于改进SLF算法的存储调度策略可以最小化负载不平衡度,降低了存储调度的调整代价,同时提高了系统的用户请求接受概率.  相似文献   

15.
In a distributed database system, data replicas are placed at different locations to achieve high data availability in the presence of link failures. With a majority voting protocol, a location survives for read/write operations if and only if it is accessible to more than half of the replicas. The problem is to find out the optimal placements for a given number of data replicas in a ring network. When the number of replicas is odd, it was conjectured by Hu et al. [X.-D. Hu, X.-H. Jia, D.-Z. Du, D.-Y. Li, H.-J. Huang, Placement of data replicas for optimal data availability in ring networks, J. Parallel Distrib. Comput., 61 (2001) 1412–1424] that every uniform placement is optimal, which is proved by Shekhar and Wu later. However, when the number of replicas is even, it was pointed out by Hu et al. that uniform placements are not optimal and the optimal placement problem may be very complicated. In this paper, we study the optimal placement problem in a ring network with majority voting protocol and an even number of replicas, and give a complete characterization of optimal placements when the number of replicas is not too large compared with the number of locations.  相似文献   

16.
Minimizing delivery cost in scalable streaming content distribution systems   总被引:1,自引:0,他引:1  
Recent scalable multicast streaming protocols for on-demand delivery of media content offer the promise of greatly reduced server and network bandwidth. However, a key unresolved issue is how to design scalable content distribution systems that place replica servers closer to various client populations and route client requests and response streams so as to minimize the total server and network delivery cost. This issue is significantly more complex than the design of distribution systems for traditional Web files or unicast on-demand streaming, for two reasons. First, closest server and shortest path routing does not minimize network bandwidth usage; instead, the optimal routing of client requests and server multicasts is complex and interdependent. Second, the server bandwidth usage increases with the number of replicas. Nevertheless, this paper shows that the complex replica placement and routing optimization problem, in its essential form, can be expressed fairly simply, and can be solved for example client populations and realistic network topologies. The solutions show that the optimal scalable system can differ significantly from the optimal system for conventional delivery. Furthermore, simple canonical networks are analyzed to develop insights into effective heuristics for near-optimal placement and routing. The proposed new heuristics can be used for designing large and heterogeneous systems that are of practical interest. For a number of example networks, the best heuristics produce systems with total delivery cost that is within 16% of optimality.  相似文献   

17.
Dynamic Web contents are generated by running application programs on base data which often change frequently. Geographically replicating the applications that construct these contents (including the programs and the related data they access) is an effective approach to improve their access latency. To maintain the freshness of an object replica, the new version of the object either has to be fetched from remote servers or be reconstructed locally when the origin copy is updated. We present a theoretical study on geographical replication of dynamic Web contents with the objective of minimizing the consistency management costs in terms of update transfers and object reconstruction. The dependencies among dynamic objects and base data are modeled as a directed acyclic graph. We formulate the minimum cost replication problem under a flat framework of update delivery. The problem is solved by first transforming it into a minimum cut problem in a flow network. A polynomial-time algorithm is then proposed to compute the optimal replication strategy which designates where each object should be replicated and how to keep the replicas up-to-date.  相似文献   

18.
基于集群服务器的容灾系统的副本管理研究   总被引:4,自引:0,他引:4  
提出一种基于集群服务器的容灾系统副本管理方案,提出多个副本的一致性维护和副本选择的算法以及副本数量和分布方式的数学模型。通过容灾系统的性能测试实验,证明它能够实现数据的快速自动恢复,有效地管理副本,并保持副本可靠性和集群服务器性能之间的平衡。  相似文献   

19.
Distributing multiple replicas in geographically-dispersed clouds is a popular approach to reduce latency to users. It is important to ensure that each replica should have availability and data integrity features; that is, the same as the original data without any corruption and tampering. Remote data possession checking is a valid method to verify the replicas?s availability and integrity. Since remotely checking the entire data is time-consuming due to both the large data volume and the limited bandwidth, efficient data-possession-verifying methods generally sample and check a small hash (or random blocks) of the data to greatly reduce the I/O cost. Most recent research on data possession checking considers only single replica. However, multiple replicas data possession checking is much more challenging, since it is difficult to optimize the remote communication cost among multiple geographically-dispersed clouds. In this paper, we provide a novel efficient Distributed Multiple Replicas Data Possession Checking (DMRDPC) scheme to tackle new challenges. Our goal is to improve efficiency by finding an optimal spanning tree to define the partial order of scheduling multiple replicas data possession checking. But since the bandwidths have geographical diversity on the different replica links and the bandwidths between two replicas are asymmetric, we must resolve the problem of Finding an Optimal Spanning Tree in a Complete Bidirectional Directed Graph, which we call the FOSTCBDG problem. Particularly, we provide theories for resolving the FOSTCBDG problem through counting all the available paths that viruses attack in clouds network environment. Also, we help the cloud users to achieve efficient multiple replicas data possession checking by an approximate algorithm for tackling the FOSTCBDG problem, and the effectiveness is demonstrated by an experimental study.  相似文献   

20.
王春凯  孟小峰 《软件学报》2018,29(3):869-882
并行环境下的分布式连接处理要求制定划分策略以减少状态迁移和通信开销。相对于数据库管理系统而言,分布式数据流管理系统中的在线θ连接操作需要更高的计算成本和内存资源。基于完全二部图的连接模型可支持分布式数据流的连接操作。因为连接操作的每个关系仅存放于二部图模型的一侧处理单元,无需复制数据,且处理单元相互独立,因此该模型具有内存高效、易伸缩和可扩展等特性。然而,由于数据流速的不稳定性和属性值分布的不均衡性,导致倾斜数据流的连接操作易出现集群负载不均衡的现象。针对倾斜数据流的连接操作,模型无法动态分配查询节点,并需要人工干预数据分组的参数设置。尤其是应对全部历史数据的连接查询,模型效率更低。基于上述问题,提出了管理倾斜数据流连接的框架,使用基于键值和元组混合的划分样式有效应对二部图模型的各侧倾斜数据。并设计了重新动态分配查询节点的策略和状态迁移算法,以支持全历史数据的连接查询和自适应的资源管理。针对合成数据和真实数据的实验表明,该方案可有效应对倾斜数据的连接操作并进一步提升分布式数据流管理系统的吞吐率,特别是降低云环境中的计算成本。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号