首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 281 毫秒
1.
One of the important features of database fragmentation and allocation techniques is the fact that they depend not only on the entries of a database relation, but also on their empirical frequencies of use. Distributed processing is an effective way to improve performance of database systems. However, for a Distributed Database System (DDBS) to function efficiently, fragments of the database need to be allocated carefully at various sites across the relevant communications network. Therefore, fragmentation and proper allocation of fragments across network sites is considered as a key research area in distributed database environment. However, fragments allocation to the most appropriate sites is not an easy task to perform. This paper proposes a synchronized horizontal fragmentation, replication and allocation model that adopts a new approach to horizontally fragment a database relation based on attribute retrieval and update frequency to find an optimal solution for the allocation problem. A heuristic technique to satisfy horizontal fragmentation and allocation using a cost model to minimize the total cost of distribution is developed. Experimental results are consistent with the hypothesis and confirm that the proposed model can efficiently solve dynamic fragmentation and allocation problem in a distributed relational database environment.  相似文献   

2.
社交网络的庞大数据需求分布式存储,多个用户的数据分散存储在各个存储和计算节点上可以保持并行性和冗余性。如何在有限的分布式存储空间内高性能存储和访问用户数据具有现实意义。在当前的社交网络系统中,用户数据之间的读写操作会导致大量跨存储节点的远程访问。减少节点间的远程访问可以降低网络负载和访问延迟,提高用户体验。提出一种基于用户交互行为的动态划分复制算法,利用用户之间的朋友关系和评论行为描述社交网络的结构,周期性划分复制用户数据,从而提高本地访问率,降低网络负载。通过真实数据集验证,该算法相比随机划分和复制算法能够大大提升本地访问率,降低访问延迟。  相似文献   

3.
With the growth of information technology and computer networks, there is a vital need for optimal design of distributed databases with the aim of performance improvement in terms of minimizing the round-trip response time and query transmission and processing costs. To address this issue, new fragmentation, data allocation, and replication techniques are required. In this paper, we propose enhanced vertical fragmentation, allocation, and replication schemes to improve the performance of distributed database systems. The proposed fragmentation scheme clusters highly-bonded attributes (i.e., normally accessed together) into a single fragment in order to minimize the query processing cost. The allocation scheme is proposed to find an optimized allocation to minimize the round-trip response time. The replication scheme partially replicates the fragments to increase the local execution of queries in a way that minimizes the cost of transmitting replicas to the sites. Experimental results show that, on average, the proposed schemes reduce the round-trip response time of queries by 23% and query processing cost by 15%, as compared to the related work.  相似文献   

4.
Distributed database design requires decisions on closely related issues such as fragmentation, allocation, degree of replication, concurrency control, and query processing. We develop an integrated methodology for fragmentation and allocation that is simple and practical and can be applied to real-life problems. The methodology also incorporates replication and concurrency control costs. At the same time, it is theoretically sound and comprehensive enough to achieve the objectives of efficiency and effectiveness. It distributes data across multiple sites such that design objectives in terms of response time and availability for transactions, and constraints on storage space, are adequately addressed. This methodology has been used successfully in designing a distributed database system for a large geographically distributed organization  相似文献   

5.
Secure Data Objects Replication in Data Grid   总被引:1,自引:0,他引:1  
Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data fragmentation approaches can be combined with dynamic replication. In this paper, we consider data partitioning (both secret sharing and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. More specifically, we investigate the problem of optimal allocation of sensitive data objects that are partitioned by using secret sharing scheme or erasure coding scheme and/or replicated. The grid topology we consider consists of two layers. In the upper layer, multiple clusters form a network topology that can be represented by a general graph. The topology within each cluster is represented by a tree graph. We decompose the share replica allocation problem into two subproblems: the Optimal Intercluster Resident Set Problem (OIRSP) that determines which clusters need share replicas and the Optimal Intracluster Share Allocation Problem (OISAP) that determines the number of share replicas needed in a cluster and their placements. We develop two heuristic algorithms for the two subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and are close to optimal solutions.  相似文献   

6.
In distributed systems, data may be correlated due to accesses from clients and the correlation has some impact on date placement, and existing research works focus on independent data objects. In this paper, we address both the scalability and the stability of the data placement solutions in internet environment. We first show that replica allocation decisions can be made locally for each replica site in a tree network, with data access knowledge of its neighbors. We then develop a new replication cost model for correlated data objects in Internet environment. Based on the cost model and the algorithms in previous research, we develop a distributed optimal replica allocation algorithm (DOPR) for correlated data in internet environment. A distributed heuristic algorithm (DHPR) is then developed to efficiently make replica placement decisions. The algorithm obtains sub-optimal solutions for the correlated data model and yields significant performance gains. Experimental studies show that the distributed heuristic allocation algorithm significantly outperforms the general frequency-based replication schemes (in which the replication decision of each data object is made based on the number of accesses on that data object).  相似文献   

7.
Data object replication onto distributed servers can potentially alleviate bottlenecks, reduce network traffic, increase scalability, add robustness, and decrease user perceived access time. The decision of selecting data object and server pairs requires solving a constraint optimization problem that in general is NP-complete. In this paper, we abstract the distributed database system as an agent-based model, wherein agents continuously compete for allocation and reallocation of data objects. Each agent aims to replicate objects onto its server such that the communication cost is minimized. However, these agents do not have a global view of the system. Thereby, the optimization process becomes highly localized. Such localized optimization may severely affect the overall system performance. To cope with such localized optimization, we propose a “semi-distributed” axiomatic game theoretical mechanism. The mechanism’s control is unique in its decision making process, wherein all the heavy processing is done on the servers of the distributed system and the central body is only required to take a binary decision: (0) not to replicate or (1) to replicate. The cost model used by the agents in the mechanism for the purpose of identifying beneficial data objects is tailored made so that even though the agents take decisions based on their local knowledge domain, the effect is translated into a system-wide performance enhancement. The proposed mechanism is extensively compared against seven well-known conventional and three game theoretical replica allocation methods, namely, branch and bound, greedy, genetic, data-aware replication, tree inspired bottom-up procedure, tree inspired min-max procedure, Benders’ decomposition based procedure, game theoretical English auction, game theoretical Dutch auction, and game theoretical selfish replication procedure. The experimental setup incorporates GT-ITM, Inet network topology generators, Soccer World Cup 1998 access logs, and NASA Kennedy Space Center access logs to closely mimic the Web in its infrastructure and user access patterns. The experimental results reveal that the proposed technique despite its non-cooperative nature improves the solution quality and reduces the execution time compared to other techniques.  相似文献   

8.
Fragmentation of base relations in distributed database management systems increases the level of concurrency and therefore system throughput for query processing. Algorithms for horizontal and vertical fragmentation of relations in relational, object-oriented and deductive databases exist; however, hybrid fragmentation techniques based on variable bindings appearing in user queries and query-access-rule dependency are lacking for deductive database systems. In this paper, we propose a hybrid fragmentation approach for distributed deductive database systems. Our approach first considers the horizontal partition of base relations according to the bindings imposed on user queries, and then generates vertical fragments of the horizontally partitioned relations and clusters rules using affinity of attributes and access frequency of queries and rules. The proposed fragmentation technique facilitates the design of distributed deductive database systems. Received 4 August 1999 / Revised 30 March 2000 / Accepted in revised form 6 October 2000  相似文献   

9.
The analytic prediction of buffer hit probability, based on the characterization of database accesses from real reference traces, is extremely useful for workload management and system capacity planning. The knowledge can be helpful for proper allocation of buffer space to various database relations, as well as for the management of buffer space for a mixed transaction and query environment. Access characterization can also be used to predict the buffer invalidation effect in a multi-node environment which, in turn, can influence transaction routing strategies. However, it is a challenge to characterize the database access pattern of a real workload reference trace in a simple manner that can easily be used to compute buffer hit probability. In this article, we use a characterization method that distinguishes three types of access patterns from a trace: (1) locality within a transaction, (2) random accesses by transactions, and (3) sequential accesses by long queries. We then propose a concise way to characterize the access skew across randomly accessed pages by logically grouping the large number of data pages into a small number of partitions such that the frequency of accessing each page within a partition can be treated as equal. Based on this approach, we present a recursive binary partitioning algorithm that can infer the access skew characterization from the buffer hit probabilities for a subset of the buffer sizes. We validate the buffer hit predictions for single and multiple node systems using production database traces. We further show that the proposed approach can predict the buffer hit probability of a composite workload from those of its component files.  相似文献   

10.
R. J. Whiddett 《Software》1983,13(4):355-371
This paper introduces a new methodology for building flexible and programmable multiprocessor systems. It is based on a distributed implementation of monitors which is supported by a system-wide address relocation mechanism. This provides the dynamic reallocation of processes and monitors within the system. Methods of implementing the communications subsystem based on a number of common network architectures are presented. An algorithm which effects dynamic resource allocation is presented. Its effect is that groups of processes which are interacting through a monitor will tend to accumulate at various sites in the system in a manner reflecting the dynamic interaction patterns of the program. This is accomplished without there being any centralized control, or even knowledge of the overall program interactions. The feasibility of the system has been investigated using a model system running on a loosely connected dual processor system.  相似文献   

11.
In many application areas, the access pattern is navigational and a large fraction of the accesses are perfect match accesses/queries on one or more words in text strings in the objects. One example of such an application area is XML data stored in object database systems. Such systems will frequently store large amounts of data, and in order to provide the necessary computing power and data bandwidth, a parallel system based on a shared-nothing architecture can be necessary. In this paper, we describe how the signature cache approach can significantly reduce the average object access cost in parallel object database systems.  相似文献   

12.
Considering the existing massive volumes of data processed nowadays and the distributed nature of many organizations, there is no doubt how vital the need is for distributed database systems. In such systems, the response time to a transaction or a query is highly affected by the distribution design of the database system, particularly its methods for fragmentation, replication, and allocation data. According to the relevant literature, from the two approaches to fragmentation, namely horizontal and vertical fragmentation, the latter requires the use of heuristic methods due to it being NP-Hard. Currently, there are a number of different methods of providing vertical fragmentation, which normally introduce a relatively high computational complexity or do not yield optimal results, particularly for large-scale problems. In this paper, because of their distributed and scalable nature, we apply swarm intelligence algorithms to present an algorithm for finding a solution to vertical fragmentation problem, which is optimal in most cases. In our proposed algorithm, the relations are tried to be fragmented in such a way so as not only to make transaction processing at each site as much localized as possible, but also to reduce the costs of operations. Moreover, we report on the experimental results of comparing our algorithm with several other similar algorithms to show that ours outperforms the other algorithms and is able to generate a better solution in terms of the optimality of results and computational complexity.  相似文献   

13.
该文通过对集中式数据库和分布式数据库进行比较,指出集中式数据库在存储能力,访问速度等方面存在的问题,对分布式数据库中的数据复制和分片的概念,原则,方法进行详细的论述,最后,对数据复制和分片要解决的关键问题进行分析。  相似文献   

14.
该文通过对集中式数据库和分布式数据库进行比较,指出集中式数据库在存储能力,访问速度等方面存在的问题,对分布式数据库中的数据复制和分片的概念,原则,方法进行详细的论述,最后,对数据复制和分片要解决的关键问题进行分析。  相似文献   

15.
通过对集中式数据库和分布式数据库进行比较,指出集中式数据库在存储能力、访问速度等方面存在的问题;对分布式数据库中的数据复制和分片的概念、原则、方法进行了详细的论述;最后对数据复制和分片要解决的关键问题进行了分析。  相似文献   

16.
Clustering network sites is a vital issue in parallel and distributed database systems DDBS. Grouping distributed database network sites into clusters is considered an efficient way to minimize the communication time required for query processing. However, clustering network sites is still an open research problem since its optimal solution is NP-complete. The main contribution in this field is to find a near optimal solution that groups distributed database network sites into disjoint clusters in order to minimize the communication time required for data allocation. Grouping a large number of network sites into a small number of clusters effectively increases the transaction response time, results in better data distribution, and improves the distributed database system performance. We present a novel algorithm for clustering distributed database network sites based on the communication time as database query processing is time dependent. Extensive experimental tests and simulations are conducted on this clustering algorithm. The experimental and simulation results show that a better network distribution is achieved with significant network servers load balance and network delay, a minor communication time between network sites is realized, and a higher distributed database system performance is recognized.  相似文献   

17.
Data dependability is an important issue in data Grids. Replication schemes have been widely used in distributed systems to ensure availability and improve access performance. Alternatively, data partitioning schemes (secret sharing, erasure coding with encryption) can be used to provide availability and, in addition, to offer confidentiality protection. In peer-to-peer data Grids, such confidentiality protection is essential since the nodes hosting the data shares may not be trustworthy or may be compromised. However, difficulties in generating new shares and potential security concerns for share reallocation make a pure data partitioning scheme not easily adaptable to dynamic user access patterns. In this paper, we consider combining replication and data partitioning to assure data availability, confidentiality, load balance, and efficient access for data Grid applications. Data are partitioned and shares are dispersed. The shares may be replicated to achieve better performance, load balance, and availability. Models for assessing confidentiality, availability, load balance, and communication cost are developed and used as the metrics to guide placement decisions. Due to the nature of contradicting goals, we model the placement decision problem as a multi-objective problem and use a genetic algorithm to determine solutions that are approximate to the Pareto optimal placement solutions.  相似文献   

18.
Proper data allocation is a key performance factor for an efficient functionality of Distributed Database Systems (DDBSs). Therefore, if data allocation across sites is performed accurately while preserving the issues of communication and site constraints, an optimal solution for DDBS performance in a dynamic distributed environment will be achieved.In this paper, a new dynamic data allocation algorithm for non-replicated DDBS is presented, the proposed Performance Optimality Enhancement Algorithm (POEA) explores and improves some concepts used in previously developed algorithms to reallocates fragments to different sites given the changing data access patterns, time and sites constraints of the DDBS. Moreover, the POEA adopts the shortest path between the old location and the new anticipated location for the transferred fragments when migration decision is made. Experimental results show that POEA has efficiently reduced the transmission cost subsequently minimizing the frequency and the time spent on fragment migration over the network sites resulting to a great improvement in the overall DDBS performance.  相似文献   

19.
This paper makes two contributions. First, we introduce a model for evaluating the performance of data allocation and replication algorithms in distributed databases. The model is comprehensive in the sense that it accounts for I/O cost, for communication cost, and, because of reliability considerations, for limits on the minimum number of copies of the object. The model captures existing replica-management algorithms, such as read-one-write-all, quorum-consensus, etc. These algorithms are static in the sense that, in the absence of failures, the copies of each object are allocated to a fixed set of processors. In modern distributed databases, particularly in mobile computing environments, processors will dynamically store objects in their local database and will relinquish them. Therefore, as a second contribution of this paper, we introduce an algorithm for automatic dynamic allocation of replicas to processors. Then, using the new model, we compare the performance of the traditional read-one-write-all static allocation algorithm to the performance of the dynamic allocation algorithm. As a result, we obtain the relationship between the communication cost and I/O cost for which static allocation is superior to dynamic allocation, and the relationships for which dynamic allocation is superior  相似文献   

20.
NoSQL databases are famed for the characteristics of high scalability, high availability, and high fault-tolerance. So NoSQL databases are used in a lot of applications. The data partitioning strategy and fragment allocation strategy directly affect NoSQL database systems’ performance. The data partition strategy of large, global databases is performed by horizontally, vertically partitioning or combination of both. In the general way the system scatters the related fragments as possible to improve operations’ parallel degree. But the operations are usually not very complicated in some applications, and an operation may access to more than one fragment. At the same time, those fragments which have to be accessed by an operation may interact with each other. The general allocation strategies will increase system’s communication cost during operations execution over sites. In order to improve those applications’ performance and enable NoSQL database systems to work efficiently, these applications’ fragments have to be allocated in a reasonable way that can reduce the communication cost i.e., to minimize the total volume of data transmitted during operations execution over sites. A strategy of clustering fragments based on hypergraph is proposed, which can cluster fragments which were accessed together in most operations to the same cluster. Themethod uses a weighted hypergraph to represent the fragments’ access pattern of operations. A hypergraph partitioning algorithmis used to cluster fragments in our strategy. This method can reduce the amount of sites that an operation has to span. So it can reduce the communication cost over sites. Experimental results confirm that the proposed technique will effectively contribute in solving fragments re-allocation problem in a specific application environment of NoSQL database system.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号