Similar Documents
 20 similar documents found.
1.
We consider a cluster-based multimedia Web server that dynamically generates video units to satisfy the bit rate and bandwidth requirements of a variety of clients. The media server partitions the job into several tasks and schedules them on the back-end computing nodes for processing. For stream-based applications, the main design criteria of the scheduling are to minimize the total processing time and to maintain the order of media units for each outgoing stream. In this paper, we first design, implement, and evaluate three scheduling algorithms, first fit (FF), stream-based mapping (SM), and adaptive load sharing (ALS), for multimedia transcoding in a cluster environment. Because of the variability of the individual jobs/tasks, we determined that it is necessary to predict the CPU load of each multimedia task and schedule tasks accordingly. We therefore propose an online prediction algorithm that can dynamically predict the processing time per individual task (media unit). We then propose two new load scheduling algorithms, namely prediction-based least load first (P-LLF) and prediction-based adaptive partitioning (P-AP), which use prediction to improve performance. The performance of the system is evaluated in terms of system throughput, out-of-order rate of outgoing media streams, and load balancing overhead through real measurements on a cluster of computers. The new load balancing algorithms are compared with the other load balancing schemes to show that P-AP greatly reduces delay jitter and achieves high throughput for a variety of workloads in a heterogeneous cluster. It strikes a good balance between the throughput and the output order of the processed media units.
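A minimal sketch of the prediction-based least-load-first idea, assuming an exponentially weighted moving average as the per-stream processing-time predictor; the class name, smoothing factor, and default prediction are illustrative, not taken from the paper:

```python
from collections import defaultdict

class PLLFScheduler:
    """Sketch of prediction-based least-load-first (P-LLF) dispatching."""

    def __init__(self, num_nodes, alpha=0.3):
        self.alpha = alpha                      # EWMA smoothing factor (assumed)
        self.pred = defaultdict(lambda: 0.05)   # predicted seconds per media unit, per stream
        self.load = [0.0] * num_nodes           # predicted outstanding work per node

    def dispatch(self, stream_id):
        """Send the next media unit of `stream_id` to the node with least predicted load."""
        node = min(range(len(self.load)), key=self.load.__getitem__)
        charge = self.pred[stream_id]
        self.load[node] += charge
        return node, charge

    def complete(self, stream_id, node, charge, measured_time):
        """On completion, release the charged load and refine the online prediction."""
        self.load[node] -= charge
        self.pred[stream_id] = (1 - self.alpha) * self.pred[stream_id] + self.alpha * measured_time
```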

2.
Polling-based load distribution (LD) algorithms suffer from two weaknesses: (i) load information exchanged during a polling session is confined to the two negotiating nodes only; and (ii) as the distributed system grows in size (in terms of the number of constituent nodes), a larger number of polling sessions, and thus a higher amount of network bandwidth consumption and CPU overhead, are needed. We propose a new LD algorithm, based on anti-tasks and load state vectors, which avoids these weaknesses of polling-based LD algorithms. Anti-tasks are composite agents which travel around a distributed system to facilitate the pairing up of task senders and receivers, as well as the collection and dissemination of load information. Time-stamped load information of processing nodes is stored in load state vectors which, when used together with anti-tasks, encourage mutual sharing of load information among processing nodes. Anti-tasks, which make use of load state vectors to decide their traveling paths, are spontaneously directed towards processing nodes having high transient workload, thus allowing their surplus workload to be relocated quickly. Using simulations, we evaluate the performance of our new algorithm by comparing it with a number of well-known polling-based load distribution algorithms. We found that our algorithm provides a significant reduction in mean task response time over a large range of system sizes. The cost of achieving this performance gain, in terms of CPU overhead and channel bandwidth consumption, is generally comparable to that of the other algorithms we studied. © 1998 John Wiley & Sons, Ltd.
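A rough sketch of how time-stamped load state vectors could be merged and used to steer an anti-task, assuming a vector is a dict mapping node ids to (timestamp, load) pairs; the function names and the threshold test are illustrative, not the paper's definitions:

```python
def merge_vectors(local, incoming):
    """Merge two load state vectors; the fresher (larger timestamp) entry per node wins,
    so load information spreads as anti-tasks travel between nodes."""
    for node, (ts, load) in incoming.items():
        if node not in local or local[node][0] < ts:
            local[node] = (ts, load)
    return local

def next_hop(vector, current_node, overload_threshold):
    """Pick the anti-task's next destination: the node with the highest known load above
    the threshold, so its surplus tasks can be paired with receivers quickly."""
    candidates = {n: load for n, (ts, load) in vector.items()
                  if n != current_node and load > overload_threshold}
    return max(candidates, key=candidates.get) if candidates else None
```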

3.
The prevalence of dynamic-content web services, exemplified by search and online social networking, has motivated an increasingly wide web-facing front end. Horizontal scaling in the Cloud is favored for its elasticity, and a distributed design of load balancers is highly desirable. Existing algorithms with a centralized design, such as Join-the-Shortest-Queue (JSQ), incur high communication overhead for distributed dispatchers. We propose a novel class of algorithms called Join-Idle-Queue (JIQ) for distributed load balancing in large systems. Unlike algorithms such as Power-of-Two, the JIQ algorithm incurs no communication overhead between the dispatchers and processors at job arrivals. We analyze the JIQ algorithm in the large-system limit and find that it effectively results in a reduced system load, which produces a 30-fold reduction in queueing overhead compared to Power-of-Two at medium to high load. An extension of the basic JIQ algorithm deals with very high loads using only local information on server load.
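A minimal sketch of the basic Join-Idle-Queue mechanism described above: idle servers register at a dispatcher's I-queue ahead of time, and dispatchers never probe servers at arrival time. The class structure, the random fallback, and the wiring are assumptions for illustration only:

```python
import random
from collections import deque

class Server:
    def __init__(self):
        self.queue = deque()
        self.dispatchers = []            # linked after construction (see wiring below)

    def enqueue(self, job):
        self.queue.append(job)

    def finish_one(self):
        """Called when the job in service completes."""
        self.queue.popleft()
        if not self.queue:
            # The only communication in basic JIQ: an idle server registers itself
            # at one dispatcher's I-queue, ahead of any future job arrival.
            random.choice(self.dispatchers).idle.append(self)

class Dispatcher:
    def __init__(self, servers):
        self.idle = deque()              # I-queue of servers that reported idle
        self.servers = servers

    def assign(self, job):
        # No probing at arrival time: use a registered idle server if available,
        # otherwise fall back to a uniformly random server.
        target = self.idle.popleft() if self.idle else random.choice(self.servers)
        target.enqueue(job)

# Wiring (illustrative): servers start out idle and register once.
servers = [Server() for _ in range(100)]
dispatchers = [Dispatcher(servers) for _ in range(10)]
for s in servers:
    s.dispatchers = dispatchers
    random.choice(dispatchers).idle.append(s)
```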

4.
We investigate the problem of maximizing multicast throughput under a fairness constraint. Multiple server nodes wish to communicate with their intended sets of client nodes over a shared network infrastructure. Our goal is to devise distributed algorithms that construct multicast sessions, one for each server node, such that (a) the network infrastructure is optimally utilized and (b) the network resources are fairly distributed between multicast sessions, i.e., no individual session claims more than a prescribed share of the network bandwidth resources. We are particularly interested in multi-tree multicast strategies, in which every multicast session may contain many multicast trees. We show how the use of multiple trees increases network throughput and improves load distribution in the network. We propose a class of round-robin algorithms based on the successive selection of multicast trees for each multicast session, in a loosely cooperative yet distributed fashion. Our best algorithm, the Cooperative Shortest Path Tree Packing (CSPTP) algorithm, performs well in a variety of scenarios, ranging from very sparse to dense applications. Through extensive simulations on random networks, we compare the performance of our algorithms with those commonly used in IP multicast as well as with theoretical upper bounds derived from network coding formulations. We show that CSPTP improves throughput and often achieves about 90% of the theoretical upper bound.

5.
Recently, many applications have used Peer-to-Peer (P2P) systems to overcome the problems of client/server systems, such as poor scalability, high bandwidth requirements, and a single point of failure. In this paper, we propose an efficient scheme to support range query processing over structured P2P systems while balancing both the storage load and the access load. The paper proposes a rotating token scheme that balances the storage load by placing joining nodes at appropriate locations in the identifier space to share load with already overloaded nodes. Then, to support range queries, we utilize an order-preserving mapping function that maps keys to nodes in an order-preserving way, without hashing. This may result in an access load imbalance due to the non-uniform distribution of keys in the identifier space. Thus, we propose an adaptive replication scheme that relieves overloaded nodes by shedding some of their load onto other nodes to balance the access load. We derive a formula for estimating the overhead of the proposed adaptive replication scheme. In this study, we carry out simulation experiments with synthetic data to measure the performance of the proposed schemes. Our simulation experiments show significant gains in both storage load balancing and access load balancing.
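A small sketch of an order-preserving key-to-node mapping of the kind described above, so that a range query only visits a contiguous run of nodes; the boundary-array representation is an assumption, not the paper's data structure:

```python
import bisect

class OrderPreservingMap:
    """Sketch: keys are mapped to nodes without hashing, so consecutive keys land on
    consecutive nodes and a range query touches a contiguous set of nodes."""

    def __init__(self, boundaries):
        # boundaries[i] = largest key owned by node i; sorted, and boundaries[-1]
        # is assumed to cover the top of the key space.
        self.boundaries = boundaries

    def node_for(self, key):
        return bisect.bisect_left(self.boundaries, key)

    def nodes_for_range(self, lo, hi):
        """All nodes that may hold keys in [lo, hi]."""
        return list(range(self.node_for(lo), self.node_for(hi) + 1))

# Usage: four nodes owning key ranges (..25], (25..50], (50..75], (75..100]
m = OrderPreservingMap([25, 50, 75, 100])
assert m.nodes_for_range(30, 60) == [1, 2]
```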

6.
Distributed virtual environments, and massively multiplayer online games in particular, have been on the rise for several years. They offer huge digital environments characterized by tens of thousands of users interacting with each other. Efficiently managing these online worlds requires scalable architectures to distribute the load over multiple servers and maintain a high Quality of Experience (QoE). This need will only increase as online virtual worlds become more and more popular. A traditional approach to improving the scalability of this type of system is to statically partition the virtual world into smaller segments called cells, each assigned to a dedicated server. In this paper, a novel approach of dividing the virtual world into even smaller parts called microcells is introduced. Critical in this approach are the algorithms that manage the allocation of microcells over the available servers. These algorithms must face a number of challenges, and their central goal is to keep the load experienced by the servers below a given threshold. On the one hand, clustering interacting microcells on one server limits the overall load by minimizing the communication overhead. On the other hand, locating too many microcells on one server may cause the load to exceed the threshold value, resulting in an overload situation. In this paper we present a number of algorithms that determine the microcell allocation and adapt it at runtime to optimize the deployment. We evaluate the microcell approach by studying the impact of the microcell size and the number of servers. The efficiency of the algorithms, in terms of their ability to decrease the maximum server load and their capability to maintain an ideal deployment in dynamic environments, is also studied.
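The following greedy sketch illustrates one way a microcell-allocation algorithm of this kind could keep every server below a load threshold while clustering interacting microcells on the same server; the heuristic and its tie-breaking are assumptions, not the paper's algorithms:

```python
def allocate_microcells(microcells, loads, neighbors, num_servers, threshold):
    """Greedy sketch: keep every server below `threshold`, preferring the server that
    already hosts the most neighbours of the microcell being placed (to cut
    inter-server interaction traffic). `loads[c]` is the load of microcell c,
    `neighbors[c]` its adjacent microcells."""
    assignment = {}
    server_load = [0.0] * num_servers

    # Place heavy microcells first so they do not get stuck at the end.
    for cell in sorted(microcells, key=lambda c: loads[c], reverse=True):
        candidates = [s for s in range(num_servers)
                      if server_load[s] + loads[cell] <= threshold]
        if not candidates:              # overload unavoidable: fall back to all servers
            candidates = range(num_servers)
        best = max(candidates,
                   key=lambda s: (sum(1 for n in neighbors[cell]
                                      if assignment.get(n) == s),
                                  -server_load[s]))
        assignment[cell] = best
        server_load[best] += loads[cell]
    return assignment
```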

7.
王少峰, 周忠, 吴威. 《软件学报》, 2008, 19(9): 2471-2482
To support large numbers of users sharing a virtual environment, multi-server architectures are applied in distributed virtual environment systems, with each server responsible for one region of the virtual world. Because users move and interact unpredictably, some servers may become overloaded. Existing load balancing algorithms focus on redistributing load among servers, but they introduce excessive overhead and degrade the system's interactive performance. This paper proposes a hierarchical, iterative dynamic load balancing algorithm: centered on the overloaded region, it selects a limited number of surrounding regions layer by layer as adjustment targets and diffuses the excess load outward from the inner layers to the outer ones, reaching a balanced state after several iterations. The algorithm is validated on virtual environments with two typical user distributions, skewed and clustered, and compared against three existing load balancing algorithms. The results show that it adjusts load quickly and effectively while introducing relatively little overhead.
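A toy sketch of the layer-by-layer diffusion step described above, assuming the regions surrounding the overloaded one have already been grouped into rings (e.g. by a BFS over the region adjacency graph); the proportional-sharing rule is an assumption, not the paper's exact adjustment policy:

```python
def diffuse_outward(load, layers, target):
    """One iteration: for each pair of adjacent layers, move as much of the inner
    layer's excess (load above `target`) as the outer layer can absorb.
    `layers[0]` is the overloaded region itself, `layers[k]` the k-th surrounding ring."""
    for inner, outer in zip(layers, layers[1:]):
        excess = sum(max(0.0, load[r] - target) for r in inner)
        room = {r: max(0.0, target - load[r]) for r in outer}
        movable = min(excess, sum(room.values()))
        if movable <= 0:
            continue
        # Take `movable` proportionally from overloaded inner regions ...
        for r in inner:
            over = max(0.0, load[r] - target)
            load[r] -= movable * over / excess
        # ... and give it proportionally to outer regions with headroom.
        for r, cap in room.items():
            load[r] += movable * cap / sum(room.values())
    return load

# Usage: region 'A' is overloaded; repeated calls (with wider layers if needed)
# approximate the multi-iteration scheme described in the abstract.
load = {'A': 9.0, 'B': 2.0, 'C': 3.0, 'D': 1.0}
diffuse_outward(load, [['A'], ['B', 'C'], ['D']], target=4.0)
```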

8.
This paper first analyzes existing message-processing methods for online games and then presents an agent-based message-processing architecture for distributed game servers, in which the agents' message-processing algorithm achieves a fair processing effect. By creating the agent that communicates directly with a user on a server close to that user, the network delay jitter between the user and the agent is kept small, so a relatively fair gaming experience can be obtained without synchronizing clocks between users and servers. Finally, experimental results for the algorithm in a simulated environment are presented.

9.
We study the bicriteria load balancing problem on two independent parameters under the allowance of object reallocation. The scenario is a system of $M$ distributed file servers located in a cluster, and we propose three online approximate algorithms for balancing their loads and required storage spaces during document placement. The first algorithm is for heterogeneous servers: each server has its individual tradeoff between load and storage space under the same selection rule. The other two algorithms are for homogeneous servers. The second algorithm combines the idea of the first one with the best existing solution for homogeneous servers; using document reallocation, we obtain a smooth tradeoff curve for the upper bounds on load and storage space. The last algorithm bounds the load and storage space of each server to less than three times their trivial lower bounds, respectively; more importantly, for each server, the value of at least one parameter is far from its worst case. The time complexities of these three algorithms are $O(\log M)$ plus the cost of document reallocation.
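A simplified greedy sketch of bicriteria placement, assigning each incoming document to the server whose worse normalized parameter after placement is smallest; document reallocation and the per-server tradeoff rules of the actual algorithms are omitted, and the caps are illustrative:

```python
def place_document(doc_load, doc_size, servers, load_cap, size_cap):
    """Greedy bicriteria placement sketch. `servers` is a list of mutable
    [current_load, current_size] pairs; `load_cap` and `size_cap` normalize
    the two criteria so they can be compared."""
    def cost(s):
        load, size = servers[s]
        return max((load + doc_load) / load_cap, (size + doc_size) / size_cap)

    best = min(range(len(servers)), key=cost)
    servers[best][0] += doc_load
    servers[best][1] += doc_size
    return best
```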

10.
State-of-the-art cluster-based data centers consisting of three tiers (Web server, application server, and database server) are being used to host complex Web services such as e-commerce applications. The application server handles dynamic and sensitive Web content that needs protection from eavesdropping, tampering, and forgery. Although the secure sockets layer (SSL) is the most popular protocol for providing a secure channel between a client and a cluster-based network server, its high overhead degrades server performance considerably and thus affects server scalability. Therefore, improving the performance of SSL-enabled network servers is critical for designing scalable and high-performance data centers. In this paper, we examine the impact of SSL offering and SSL-session-aware distribution in cluster-based network servers. We propose a back-end forwarding scheme, called ssl_with_bf, that employs a low-overhead user-level communication mechanism such as the virtual interface architecture (VIA) to achieve a good load balance among server nodes. We compare three distribution models for network servers, round robin (RR), ssl_with_session, and ssl_with_bf, through simulation. The experimental results with 16-node and 32-node cluster configurations show that, although the session reuse of ssl_with_session is critical to improving the performance of application servers, the proposed back-end forwarding scheme can further enhance performance due to better load balancing. The ssl_with_bf scheme reduces the average latency by about 40 percent and improves throughput across a variety of workloads.
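A schematic sketch of SSL-session-aware dispatching with back-end forwarding, assuming a session table and per-node load counters; the threshold test and data structures are illustrative and are not the paper's ssl_with_bf implementation:

```python
def dispatch(session_id, session_table, node_loads, overload_threshold):
    """Sketch: route a request to the node that cached its SSL session (so the
    handshake is reused); if that node is overloaded, it still terminates SSL but
    forwards the decrypted request over the low-overhead back-end (e.g. VIA)
    channel to a lightly loaded node."""
    owner = session_table.get(session_id)
    if owner is None:
        owner = min(node_loads, key=node_loads.get)   # new session: least-loaded node
        session_table[session_id] = owner

    worker = owner
    if node_loads[owner] > overload_threshold:
        worker = min(node_loads, key=node_loads.get)  # back-end forwarding target
    node_loads[worker] += 1
    return owner, worker          # (node terminating SSL, node serving the request)
```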

11.
Nodes at the border of a self-configurable wireless network are commonly employed as landmarks for many applications, including infrastructureless localization, border detection, and routing. However, how to identify the best set of nodes to serve as such landmarks is still an open problem. In this paper, we propose three algorithms for border landmark selection: the Convex Hull-Based (CHB) algorithm, the Center Node Elimination (CNE) algorithm, and the Hierarchy-Structured (HS) algorithm. CHB works perfectly in theory and provides deep insight into the landmark selection problem; at the same time, it is centralized and sensitive to errors in distance estimation. The CNE algorithm is a distributed approach devised to gradually exclude the nodes in the “center” of the network until the desired number of nodes are left, which are employed as landmarks. While CNE works effectively in a small network, its high computational complexity and communication overhead may lead to scalability problems when it is applied to very large networks. To address this problem, we propose the HS algorithm, which strikes a balance between accuracy and complexity/overhead. In HS, we establish a hierarchical structure with multiple layers and apply the CNE algorithm at an appropriate layer to identify an initial set of candidate nodes. The outcomes are then rectified through a recursive process, yielding the final landmarks. Three applications, including coordinate establishment, border detection, and landmark-based routing in general networks without location information, are introduced based on the selected landmarks. We carry out extensive simulations to compare the performance of our landmark selection algorithms and demonstrate their effectiveness in all of these applications.
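A compact sketch of the convex-hull idea behind CHB, assuming 2-D (estimated) node coordinates are available; the real CHB algorithm works on distance estimates, so the explicit coordinate input here is an assumption:

```python
def convex_hull(points):
    """Andrew's monotone chain convex hull; `points` are (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def chb_landmarks(node_coords):
    """CHB-style sketch: nodes whose estimated coordinates lie on the convex hull of
    the deployment are selected as border landmarks."""
    hull = set(convex_hull(list(node_coords.values())))
    return [node for node, xy in node_coords.items() if xy in hull]
```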

12.
When run on clusters of GPUs, simple algorithms for executing a Breadth-First Search (BFS) on large graphs suffer from load imbalance among threads and uncoalesced memory accesses, resulting in poor performance. To obtain a significant improvement on a single GPU and to scale across multiple GPUs, we resort to a suitable combination of operations that rearrange data before processing them. We propose a novel technique for mapping threads to data that achieves a perfect load balance by leveraging prefix-sum and binary search operations. To reduce the communication overhead, we perform a pruning operation on the set of edges that needs to be exchanged at each BFS level. The result is an algorithm that fully exploits the parallelism available on a single GPU and minimizes communication among GPUs. We show that a cluster of GPUs can efficiently perform a distributed BFS on graphs with billions of nodes.
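A host-side sketch of the prefix-sum plus binary-search mapping of threads to data: work is split per edge of the current frontier, so every (virtual) thread handles exactly one edge regardless of vertex degree. This mirrors the mapping idea only; the actual GPU kernels, edge pruning, and inter-GPU exchange are not shown:

```python
import bisect
from itertools import accumulate

def balanced_frontier_expansion(frontier, adjacency):
    """Expand a BFS frontier with one (virtual) thread per edge: an exclusive prefix
    sum of the degrees tells each thread, via binary search, which frontier vertex
    owns its edge, giving a perfectly balanced mapping."""
    degrees = [len(adjacency[v]) for v in frontier]
    offsets = [0] + list(accumulate(degrees))          # exclusive prefix sum
    total_edges = offsets[-1]

    next_frontier = []
    for tid in range(total_edges):                     # one GPU thread per edge
        src_idx = bisect.bisect_right(offsets, tid) - 1    # owning frontier vertex
        src = frontier[src_idx]
        dst = adjacency[src][tid - offsets[src_idx]]       # this thread's single edge
        next_frontier.append(dst)
    return next_frontier

# Usage: vertex 0 has 3 neighbours, vertex 1 has 1; the 4 edges are spread evenly.
adj = {0: [2, 3, 4], 1: [5]}
assert balanced_frontier_expansion([0, 1], adj) == [2, 3, 4, 5]
```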

13.
Resource allocation for a distributed system employing the primary-site approach to fault tolerance is discussed. Two kinds of systems are considered. The first consists of fault-tolerant nodes, where each node has many duplicated servers: one server is the primary, which serves user requests, and the rest are backups. The second does not have fault-tolerant nodes; to tolerate node failures, each node uses other nodes as backups. When a node fails, all requests initially allocated to that node are served by one of its backups. To study resource allocation for such systems, an approximate model of each system is developed. Using these models, efficient allocation algorithms that take into account the failure/repair rates of the system and the fault-tolerance overheads are presented. Experimental results show that the algorithms give optimal or near-optimal allocations. The algorithms, which incur little overhead, can improve system performance significantly over an intuitive allocation algorithm.

14.
The average response time of tasks in a distributed system depends on the strategy by which workload is shared among the nodes of the system. A common approach to load sharing is to resort to a distributed algorithm that arranges task transfers between nodes based on information about the system's state. In this paper, we present a hybrid approach to adaptive load sharing which outperforms existing algorithms and is especially effective in response to workload peaks, under both heavy and light system load conditions. The strategy we propose is novel in that it relies on a fully distributed algorithm when the system is heavily loaded, but resorts to a centrally coordinated one when parts of the system become idle. The transition from one algorithm to the other is performed automatically, and the simplicity of the algorithms makes it possible to use a centralized component without incurring scalability problems or instabilities. Both algorithms are very lightweight and do not need any parameter tuning. Simulations show that the hybrid approach performs well under all load conditions and task generation patterns, is weakly sensitive to processing overhead and communication delays, and scales well (to hundreds of nodes) despite the use of a centralized component.
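A rough sketch of the hybrid idea: a lightweight central coordinator is consulted only while idle nodes exist, and a fully distributed random probe is used otherwise. The probe limit and the queue-length test are illustrative additions, not the paper's mechanism (which needs no parameter tuning):

```python
import random

class HybridLoadSharer:
    """Sketch of a hybrid load-sharing strategy with an automatic switch between a
    centralized idle-node registry and distributed random probing."""

    def __init__(self, node_ids, probe_limit=3):
        self.idle = set()                # maintained by the central coordinator
        self.nodes = list(node_ids)
        self.probe_limit = probe_limit

    def node_became_idle(self, node):
        self.idle.add(node)              # centralized path: idle nodes register

    def node_became_busy(self, node):
        self.idle.discard(node)

    def find_target(self, sender, queue_length):
        """Return a node to offload a task to, or None to keep it locally.
        `queue_length(node)` is assumed to return that node's current queue length."""
        if self.idle:                    # parts of the system are idle: ask coordinator
            return self.idle.pop()
        # heavily loaded system: distributed random probing, no central component
        for cand in random.sample(self.nodes, min(self.probe_limit, len(self.nodes))):
            if cand != sender and queue_length(cand) < queue_length(sender) - 1:
                return cand
        return None
```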

15.
Game-Theoretic Approach for Load Balancing in Computational Grids
Load balancing is a very important and complex problem in computational grids. A computational grid differs from traditional high-performance computing systems in the heterogeneity of the computing nodes, as well as of the communication links that connect the different nodes together. There is a need to develop algorithms that can capture this complexity yet can be easily implemented and used to solve a wide range of load-balancing scenarios. In this paper, we propose a game-theoretic solution to the grid load-balancing problem. The algorithm developed combines the inherent efficiency of the centralized approach and the fault-tolerant nature of the distributed, decentralized approach. We model the grid load-balancing problem as a noncooperative game, whereby the objective is to reach the Nash equilibrium. Experiments were conducted to show the applicability of the proposed approaches. One advantage of our scheme is its relatively low overhead and its robust performance in the face of inaccuracies in performance prediction information.

16.
Replication of information across a server cluster provides a promising way to support popular Web sites. However, a Web-server cluster requires some mechanism for scheduling requests to the most available server. One common approach is to use the cluster Domain Name System (DNS) as a centralized dispatcher. The main problem is that WWW address caching mechanisms (although they reduce network traffic) let this DNS dispatcher control only a very small fraction of the requests reaching the Web-server cluster. The non-uniformity of the load from different client domains and the high variability of real Web workloads introduce additional degrees of complexity to the load balancing issue. These characteristics make existing scheduling algorithms for traditional distributed systems unsuitable for controlling the load of Web-server clusters and motivate research on entirely new DNS policies that require some system state information. We analyze various DNS dispatching policies under realistic situations where state information needs to be estimated with low computation and communication overhead so as to be applicable to a Web cluster architecture. In a model of realistic scenarios for the Web cluster, a large set of simulation experiments shows that, by incorporating the proposed state estimators into the dispatching policies, the effectiveness of the DNS scheduling algorithms can improve substantially, in particular compared to DNS algorithms that do not use adequate state information.

17.
We study approximate algorithms for placing a set of documents onto $M$ distributed Web servers. We define the load of a server to be the sum of the loads induced by all documents stored on it; the size of a server is defined in a similar manner. We propose five algorithms. Algorithm 1 balances the loads and sizes of the servers by limiting the loads to $k_l$ and the sizes to $k_s$ times their optimal values, where $1/(k_l - 1) + 1/(k_s - 1) \le 1$. This result improves the bounds on the load and size of servers in (L.C. Chen et al., 2001). Algorithm 2 further reduces the load bound on each server by using partial document replication, and Algorithm 3 by sorting. Algorithm 4 employs both partial replication and sorting. Finally, without using sorting or replication, we give Algorithm 5 for dynamic placement at the cost of a factor $\Theta(\log M)$ in the time complexity.

18.
In the literature, there exist two types of cache consistency maintenance algorithms for mobile computing environments: stateless and stateful. In a stateless approach, the server is unaware of the cache contents at a mobile user (MU). Even though stateless approaches employ simple database management schemes, they lack scalability and the ability to support user disconnection and mobility. On the other hand, a stateful approach is scalable for large database systems at the cost of nontrivial overhead due to server database management. We propose a novel algorithm, called the Scalable Asynchronous Cache Consistency Scheme (SACCS), which inherits the positive features of both stateless and stateful approaches. SACCS provides weak cache consistency for unreliable communication (e.g., wireless mobile) environments with a small stale cache hit probability. It is also a highly scalable algorithm with minimal database management overhead. These properties are accomplished through the use of flag bits at the server cache (SC) and the MU cache (MUC), an identifier (ID) kept in the MUC for each entry after its invalidation, and an estimated time-to-live (TTL) for each cached entry, as well as the marking of all valid MUC entries as uncertain when an MU wakes up. The stale cache hit probability is analyzed and also simulated under a Rayleigh fading model of error-prone wireless channels. Comprehensive simulation results show that the performance of SACCS is superior to that of other existing stateful and stateless algorithms in both single-cell and multi-cell mobile environments.

19.
Supporting continuous media data, such as video and audio, imposes stringent demands on the retrieval performance of a multimedia server. In this paper, we propose and evaluate a set of data placement and retrieval algorithms to exploit the full capacity of the disks in a multimedia server. The data placement algorithm declusters every object over all of the disks in the server, using a time-based declustering unit, with the aim of balancing the disk load. As for runtime retrieval, the essence of the algorithm is to give each disk advance notification of the blocks that have to be fetched in the upcoming time periods, so that the disk can optimize its service schedule accordingly. Moreover, in processing a block request for a replicated object, the server dynamically channels the retrieval operation to the most lightly loaded disk that holds a copy of the required block. We have implemented a multimedia server based on these algorithms. Performance tests reveal that the server achieves very high disk efficiency; specifically, each disk is able to support up to 25 MPEG-1 streams. Moreover, experiments suggest that the aggregate retrieval capacity of the server scales almost linearly with the number of disks.
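Two tiny helpers sketching the ideas above: time-based declustering places consecutive blocks of an object on consecutive disks, and a replicated block is served from the most lightly loaded disk holding a copy. Names and signatures are illustrative, not the paper's interfaces:

```python
def block_disk(block_index, start_disk, num_disks):
    """Time-based declustering sketch: consecutive media blocks of an object go to
    consecutive disks round-robin, so every disk carries part of every stream."""
    return (start_disk + block_index) % num_disks

def pick_replica_disk(block, replica_disks, disk_queue_len):
    """For a replicated object, serve the block from the most lightly loaded disk
    that holds a copy of it."""
    return min(replica_disks[block], key=lambda d: disk_queue_len[d])
```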

20.
Parallel Computing, 1997, 23(12): 1727-1742
A server for an interactive distributed multimedia system may require thousands of gigabytes of storage space and high I/O bandwidth. In order to maximize system utilization, and thus minimize cost, the load must be balanced among the server's disks, interconnection network, and scheduler. Many algorithms for maximizing retrieval capacity from the storage system have been proposed. This paper presents techniques for improving server capacity by assigning media requests to the nodes of a server so as to balance the load on the interconnection network and the scheduling nodes. Five policies for dynamic request assignment are developed. An important factor that affects data retrieval in a high-performance continuous media server is the degree of parallelism of data retrieval. The performance of the dynamic policies, measured on an implementation of a previously developed server model, is presented for two values of the degree of parallelism.
