1.

Efficient support for content‐aware request distribution and persistent connection in Web clusters

Ho‐Han Liu Mei‐Ling Chiang Men‐Chao Wu 《Software》2007,37(11):1215-1241

To support Web clusters with efficient dispatching mechanisms and policies, we propose a light‐weight TCP connection transfer mechanism, TCP Rebuilding, and use it to develop a content‐aware request dispatching platform, LVS‐CAD, in which the request dispatcher can extract and analyze the content in requests and then dispatch each request by its content or type of service requested. To efficiently support HTTP/1.1 persistent connection in Web clusters, request scheduling should be performed per request rather than per connection. Consequently, multiple TCP Rebuilding, as an extension to normal TCP Rebuilding, is proposed and implemented. On this platform, we also devise fast TCP module handshaking to process the handshaking between clients and the request dispatcher in the IP layer instead of in the TCP layer for faster response times. Furthermore, we also propose content‐aware request distribution policies that consider cache locality and various types of costs for dispatching requests in this platform, which makes the resource utilization of Web servers more effective. Experimental results of a practical implementation on Linux show that the proposed system, mechanisms, and policies can effectively improve the performance of Web clusters. Copyright © 2007 John Wiley & Sons, Ltd.
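The idea of scheduling per request rather than per connection can be illustrated with a minimal sketch. It is not the LVS‐CAD implementation: the class name, the server pools, the suffix list, and the round-robin choice inside each pool are all assumptions made for illustration, and the cache-locality and cost-aware policies mentioned in the abstract are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ContentAwareDispatcher:
    """Hypothetical dispatcher: picks a back end for every request line."""
    static_servers: list
    dynamic_servers: list
    # Round-robin cursor per request class (an assumption, not the paper's policy).
    _next: dict = field(default_factory=lambda: {"static": 0, "dynamic": 0})

    @staticmethod
    def classify(request_line: str) -> str:
        # A request line looks like "GET /path?query HTTP/1.1".
        path = request_line.split()[1].split("?", 1)[0]
        dynamic_suffixes = (".php", ".cgi", ".jsp")  # illustrative list only
        if path.startswith("/cgi-bin/") or path.endswith(dynamic_suffixes):
            return "dynamic"
        return "static"

    def dispatch(self, request_line: str) -> str:
        kind = self.classify(request_line)
        pool = self.static_servers if kind == "static" else self.dynamic_servers
        index = self._next[kind]
        self._next[kind] = (index + 1) % len(pool)
        return pool[index]


# Three requests arriving on one persistent connection may go to different servers.
dispatcher = ContentAwareDispatcher(["s1", "s2"], ["d1"])
for line in ("GET /index.html HTTP/1.1",
             "GET /cgi-bin/search?q=x HTTP/1.1",
             "GET /logo.png HTTP/1.1"):
    print(line, "->", dispatcher.dispatch(line))
```

Because the decision is made for each request line, a single HTTP/1.1 persistent connection can be served by several back ends, which is the situation that the multiple TCP Rebuilding extension is described as supporting.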

2.

Dynamic information‐based scalable hashing on a cluster of web cache servers

Hukeun Kwak Andrew Sohn Kyusik Chung 《Concurrency and Computation》2012,24(3):322-340

Caching web pages is an important part of web infrastructures. Medium to large‐scale infrastructures deploy a cluster of servers to solve the scalability and storage problems inherent in caching. In this paper we present dynamic information‐based scalable hashing that evenly hashes client requests to a cluster of cache servers, resulting in performance scalability. Runtime information is used to determine when and how to cache pages. Cached pages are stored and retrieved mutually exclusively to/from all the servers to minimize the use of storage, resulting in storage scalability. We set up an experimental environment consisting of various machines, including client servers, a cluster of 16 cache servers, and a load balancer. We demonstrate through experimental results that dynamic information‐based scalable hashing maximizes both performance scalability and storage scalability, whereas the existing approaches achieve only one of the two. Copyright © 2011 John Wiley & Sons, Ltd.
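As a point of reference for the hashing idea, the sketch below maps each URL to exactly one cache server with a plain static hash, so that a page is stored on a single node. This is only the baseline technique; the paper's dynamic, runtime-information-based scheme is not reproduced here, and the function and server names are hypothetical.

```python
import hashlib


def pick_cache_server(url: str, servers: list) -> str:
    """Map a URL to one server with a stable hash (static baseline only)."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = [f"cache{i}" for i in range(16)]  # 16 nodes, as in the experiment setup
print(pick_cache_server("http://example.com/a.html", servers))
```

A static hash like this keeps storage mutually exclusive across nodes but cannot react to load, which is the gap that runtime information is meant to address.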

3.

Weblins: A scalable WWW cluster-based server

《Advances in Engineering Software》2006,37(1):11-19

With ever-growing web traffic, cluster-based web servers have become very important to the Internet infrastructure. Thus, making the best use of all available resources in the cluster to achieve high performance is a significant research issue. In this paper, we present Weblins, a cluster-based web server that can achieve good throughput. Weblins runs on the Gobelins operating system, an efficient single system image operating system that transparently makes use of the resources available in the cluster. The architecture of Weblins is fully distributed. Weblins implements a content-aware request distribution policy via a new interface on top of Gobelins. Popular web files are dynamically replicated on all nodes via a cooperative caching mechanism. For non-popular files, requests are handed off to the corresponding nodes via the TCP Handoff protocol. Simulation results show that the strategy used by Weblins is more suitable for cluster-based Web servers than a pure content-aware strategy or a pure cooperative caching strategy.

4.

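Application and research of Web caching technology (cited by: 6; self-citations: 0; citations by others: 6)

黄敏 张卫东 李众立 《计算机工程与设计》 (Computer Engineering and Design) 2003,24(5):30-31

Caching is a method of storing frequently accessed information locally. By redirecting ordinary Web requests to a local cache server, it reduces the traffic on WAN links and Web servers, giving ISPs, enterprise networks, and end users a better solution for their network egress.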

5.

Characterization and Evaluation of Cache Hierarchies for Web Servers

Iyer Ravi 《World Wide Web》2004,7(3):259-280

As Internet usage continues to expand rapidly, careful attention needs to be paid to the design of Internet servers for achieving high performance and end-user satisfaction. Currently, the memory system remains a significant performance bottleneck for Internet servers employing multi-GHz processors. In this paper, our aim is two-fold: (1) to characterize the cache/memory performance of web server workloads and (2) to propose and evaluate cache design alternatives for future web servers. We chose SPECweb99 as the representative web server workload and our entire characterization and evaluation methodology is based on our CASPER simulation framework. We begin by exploring the processor cache design space for single and dual-processor servers. Based on our observations, we then evaluate other cache hierarchy alternatives such as chipset caches, coherence filters and decompressed page stores. We show the sensitivity of these components to basic organization parameters such as cache size, line size and degree of associativity. We also present the performance implications of routing memory requests initiated by I/O devices through these caches. Based on detailed simulation data and its implications on system level performance, this paper shows that chipset caches have significant potential for improving future web server performance.

6.

Form-based proxy caching for database-backed web sites: keywords and functions (cited by: 1; self-citations: 0; citations by others: 1)

Qiong Luo Jeffrey F. Naughton Wenwei Xue 《The VLDB Journal The International Journal on Very Large Data Bases》2008,17(3):489-513

Web caching proxy servers are essential for improving web performance and scalability, and recent research has focused on making proxy caching work for database-backed web sites. In this paper, we explore a new proxy caching framework that exploits the query semantics of HTML forms. We identify two common classes of form-based queries from real-world database-backed web sites, namely, keyword-based queries and function-embedded queries. Using typical examples of these queries, we study two representative caching schemes within our framework: (i) traditional passive query caching, and (ii) active query caching, in which the proxy cache can service a request by evaluating a query over the contents of the cache. Results from our experimental implementation show that our form-based proxy is a general and flexible approach that efficiently enables active caching schemes for database-backed web sites. Furthermore, handling query containment at the proxy yields significant performance advantages over passive query caching, but extending the power of the active cache to do full semantic caching appears to be less generally effective.

7.

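A dynamic separated scheduling policy for Web cluster systems (cited by: 1; self-citations: 0; citations by others: 1)

汤迪斌 倪宏 陈晓 《计算机工程与应用》 (Computer Engineering and Applications) 2008,44(16):1-3

The static separated scheduling policy (SSSP) cannot allocate server resources effectively. The dynamic separated scheduling policy (DSSP) schedules static and dynamic requests separately, using the requested file and the user session, respectively, as the scheduling unit. The request dispatcher monitors the state of the back-end servers and classifies them by resource usage as lightly loaded, heavily loaded, or overloaded. A lightly loaded server can accept new request units; a heavily loaded server accepts no new request units but keeps serving those it has already accepted; an overloaded server migrates some of its request units to lightly loaded servers. Experimental results show that DSSP is markedly more efficient than SSSP.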
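The three server states described above can be sketched as a small classification step. The thresholds (0.7 and 0.9 of some normalized resource usage) and the function names are assumptions chosen for illustration; the abstract does not give the actual criteria.

```python
def classify_server(load: float, heavy: float = 0.7, overload: float = 0.9) -> str:
    """Classify a normalized load in [0, 1]; thresholds are hypothetical."""
    if load >= overload:
        return "overloaded"
    if load >= heavy:
        return "heavy"
    return "light"


def plan(loads: dict):
    """Return per-server states, servers that may take new request units,
    and servers that should migrate some units away."""
    states = {name: classify_server(load) for name, load in loads.items()}
    accepting = sorted(name for name, state in states.items() if state == "light")
    migrate_from = sorted(name for name, state in states.items() if state == "overloaded")
    return states, accepting, migrate_from


states, accepting, migrate_from = plan({"a": 0.35, "b": 0.80, "c": 0.95})
print(states)        # a is light, b is heavy, c is overloaded
print(accepting)     # only light servers accept new request units
print(migrate_from)  # overloaded servers hand units to the light ones
```

Heavily loaded servers appear in neither list: they keep their existing request units but receive no new ones.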

8.

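A content-based load-balancing algorithm model for network clusters (cited by: 1; self-citations: 0; citations by others: 1)

谢红薇 谢显宇 《计算机应用与软件》 (Computer Applications and Software) 2010,27(1):131-133,144

Building on a discussion of load-balancing algorithms for network clusters and using content classification, this paper presents a triple model of content-based load balancing for network clusters. Request classification helps raise the cache hit rate, the scheduling mechanism specifies how to forward requests appropriately, and dynamic feedback avoids assigning requests to heavily loaded servers. The paper then analyzes eight scheduling policies of the scheduling mechanism and six content-based dispatching and forwarding techniques. The model uses cached content to improve the cluster's throughput and response time, and can be deployed for multiple service types.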

9.

An integrated method for real time and offline web robot detection

Derek Doran Swapna S. Gokhale 《Expert Systems》2016,33(6):592-606

Recent academic and industry reports confirm that web robots dominate the traffic seen by web servers across the Internet. Because web robots crawl in an unregulated fashion, they may threaten the privacy, function, performance, and security of web servers. There is therefore a growing need to be able to identify robot visitors automatically, in offline and in real time, to assess their impact and to potentially protect web servers from abusive bots. Yet contemporary detection approaches, which rely on syntactic log analysis, finding statistical variations between robot and human traffic, analytical learning techniques, or complex software modifications, may not be realistic to implement or remain effective as the behavior of robots evolves over time. Instead, this paper presents a novel detection approach that relies on the differences in the resource request patterns of web robots and humans. It rationalizes why differences in resource request patterns are expected to remain intrinsic to robots and humans despite the continuous evolution of their traffic. The performance of the approach, adoptable for both offline and real time settings with a simple implementation, is demonstrated by playing back streams of actual web traffic with varying session lengths and proportions of robot requests.

10.

A neural network proxy cache replacement strategy and its implementation in the Squid proxy server

Sam Romano Hala ElAarag 《Neural computing & applications》2011,20(1):59-78

As the Internet has become a more central aspect of information technology, so have concerns with supplying enough bandwidth and serving web requests to end users in an appropriate time frame. Web caching was introduced in the 1990s to help decrease network traffic, lessen user perceived lag, and reduce loads on origin servers by storing copies of web objects on servers closer to end users as opposed to forwarding all requests to the origin servers. Since web caches have limited space, they must effectively decide which objects are worth caching or replacing with other objects. This problem is known as cache replacement. We used neural networks to solve this problem and proposed the Neural Network Proxy Cache Replacement (NNPCR) method. The goal of this research is to implement NNPCR in a real environment such as the Squid proxy server. In order to do so, we propose an improved strategy of NNPCR referred to as NNPCR-2. We show how the improved model can be trained with up to twelve times more data and gain a 5–10% increase in Correct Classification Ratio (CCR) over NNPCR. We implemented NNPCR-2 in the Squid proxy server and compared it with four other cache replacement strategies. In this paper, we use 84 times more data than NNPCR was tested against and present exhaustive test results for NNPCR-2 with different trace files and neural network structures. Our results demonstrate that NNPCR-2 made important, balanced decisions in relation to the hit rate and byte hit rate, the two performance metrics most commonly used to measure the performance of web proxy caches.

11.

Performance aspects of distributed caches using TTL-based consistency

《Theoretical computer science》2005,331(1):73-96

The web is the largest distributed database deploying time-to-live-based weak consistency. Each object has a lifetime duration assigned to it by its origin server. A copy of the object fetched from its origin server is received with a maximum time-to-live (TTL) that equals its lifetime duration. In contrast, a copy obtained through a cache has a shorter TTL, since the age (elapsed time since it was fetched from the origin) is deducted from its lifetime duration. A request served by a cache constitutes a hit if the cache has a fresh copy of the object. Otherwise, the request is considered a miss and is propagated to another server. It is evident that the number of cache misses depends on the age of the copies the cache receives. Thus, a cache that sends requests to another cache would suffer more misses than a cache that sends requests directly to an authoritative server. In this paper, we model and analyze the effect of age on the performance of various cache configurations. We consider a low-level cache that fetches objects either from their origin servers or from other caches and analyze its miss rate as a function of its fetching policy. We distinguish between three basic fetching policies, namely, fetching always from the origin, fetching always from the same high-level cache, and fetching from a “random” high-level cache. We explore the relationships between these policies in terms of the miss rate achieved by the low-level cache, both on worst-case sequences and on sequences generated using particular probability distributions. Guided by web caching practice, we consider two variations of the basic policies. In the first variation the high-level cache uses pre-term refreshes to keep a copy with lower age. In the second variation the low-level cache uses an extended lifetime duration. We analyze how these variations affect the miss rates. Our theoretical results help to understand how age may affect the miss rate, and imply guidelines for improving the performance of web caches.
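The effect of age on misses can be made concrete with a small simulation. It is a simplification, not the paper's model: every fetched copy is assumed to arrive with the same fixed age (`source_age`), whereas the age of a copy held by a real high-level cache varies over time. All names are hypothetical.

```python
def count_misses(request_times, lifetime, source_age):
    """Count misses at a low-level cache for a single object.

    Each miss fetches a copy whose remaining TTL is lifetime - source_age
    (never below zero). source_age = 0 models fetching from the origin.
    """
    ttl = max(0, lifetime - source_age)
    expiry = float("-inf")  # no copy cached yet
    misses = 0
    for t in sorted(request_times):
        if t >= expiry:          # no fresh copy: miss, fetch a new one
            misses += 1
            expiry = t + ttl
    return misses


requests = range(0, 120, 10)     # one request every 10 time units
print(count_misses(requests, lifetime=60, source_age=0))   # origin: 2 misses
print(count_misses(requests, lifetime=60, source_age=45))  # aged copies: 6 misses
```

With the same request sequence, copies that arrive already 45 time units old expire after only 15 units, so the low-level cache misses three times as often as when it fetches from the origin.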

12.

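A separated scheduling policy for Web cluster servers (cited by: 9; self-citations: 3; citations by others: 9)

雷迎春 张松 李国杰 《计算机研究与发展》 (Journal of Computer Research and Development) 2002,39(9):1093-1098

Using queueing theory, this paper discusses the relationship between the overall performance of a Web cluster and its request scheduling policy. The conclusion is that, when the Web cluster is not overloaded, a separated scheduling policy, in which some back-end servers handle only static requests and the others handle only dynamic requests, outperforms a mixed scheduling policy in which every back-end server handles both static and dynamic requests. Actual tests with the SPECweb99 benchmark further confirm this: with the load parameter set to 120 connections, the Web cluster server using the separated policy completes 63 connections, whereas the one using the mixed policy completes only 36, a performance improvement of 22.5%.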
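The abstract does not say how the 22.5% figure is computed. One reading that reproduces it, offered here as an assumption, is the gain in completed connections measured against the 120 offered connections:

```latex
\[
\frac{63 - 36}{120} = \frac{27}{120} = 0.225 = 22.5\%
\]
```

Measured against the mixed policy's 36 completed connections instead, the same 27-connection gain would be 27/36 = 75%, so the reported percentage appears to be relative to the offered load.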

13.

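A content-based scheduling algorithm for distributed Web servers (cited by: 3; self-citations: 0; citations by others: 3)

杜增凯 郑名扬 鞠九滨 《软件学报》 (Journal of Software) 2003,14(12):2068-2073

In research on distributed Web server systems, content-based scheduling policies are attracting growing attention. However, the extra overhead of content-based request scheduling makes the dispatching node the bottleneck of the system and limits its scale. To achieve fault tolerance and scalability, the paper focuses on the design of distributed scheduling policies and, for cache-oriented scheduling policies that are hard to distribute, designs a corresponding distributed scheduling algorithm, DWARD (distributed workload-aware request distribution). System tests based on the LINUX IP protocol stack show that DWARD can achieve good performance with appropriate tuning.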

14.

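Research and application of Web cache server technology (cited by: 7; self-citations: 3; citations by others: 4)

许艳美 肖宗水 梁昇 《计算机工程与设计》 (Computer Engineering and Design) 2005,26(1):126-128

Web cache server systems are being widely deployed on the Internet and on local area networks. This paper discusses in some depth the techniques they use and points out that Web Cache technology can reduce network traffic, save money, and improve bandwidth utilization. In addition, performing content analysis and filtering on the cache server can improve filtering quality and efficiency and effectively prevent the further spread of harmful information.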

15.

A content-based load balancing algorithm with admission control for cluster web servers (cited by: 2; self-citations: 0; citations by others: 2)

Saeed Seyed A. Mohammad K. 《Future Generation Computer Systems》2008,24(8):775-787

16.

An approximation-based load-balancing algorithm with admission control for cluster web servers with dynamic workloads

Saeed Sharifian Seyed A. Motamedi Mohammad K. Akbari 《The Journal of supercomputing》2010,53(3):440-463

The growth of web-based applications in business and e-commerce is building up demands for high performance web servers for better throughput and lower user-perceived latency. These demands are leading to a widespread substitution of powerful single servers by robust newcomers, cluster web servers, in many enterprise companies. In this respect the load-balancing algorithms play an important role in boosting the performance of cluster servers. The previous load-balancing algorithms, which were designed for the handling of static content in web services, suffer from significant performance degradation under dynamic and database-driven workloads. Regarding this, we propose an approximation-based load-balancing algorithm with admission control for cluster-based web servers in this study. Since it is difficult to accurately determine the loads of web servers through feedback from distributed agents in web servers, we propose an analytical model of a web server to estimate the web servers’ loads. To achieve this, the algorithm classifies requests based on their service times, tracks the number of outstanding requests for each class on each web server node, and also uses their resource demands to dynamically estimate the load of each node. For the error handling of the model, a proportional integral (PI) controller from control theory is used. Then the estimated available capacity of each web server is used for load balancing and admission control decisions. The implementation results with a standard benchmark confirm the effectiveness of the proposed scheme, which improves both the mean response time and the throughput of the cluster compared to rival load-balancing algorithms, and also avoids situations in which the cluster is overloaded, even when the request rates are beyond the cluster capacity.
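For readers unfamiliar with the controller named in the abstract, the following is a textbook discrete PI controller. How the paper wires it into its load model is not stated here; the gains, the error definition (measured load minus estimated load), and the class name are assumptions.

```python
class PIController:
    """Textbook discrete proportional-integral controller."""

    def __init__(self, kp: float, ki: float):
        self.kp = kp              # proportional gain (hypothetical value below)
        self.ki = ki              # integral gain (hypothetical value below)
        self.integral = 0.0       # running sum of error * dt

    def update(self, error: float, dt: float = 1.0) -> float:
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


# Assumed use: nudge a model-based load estimate toward a measured value.
controller = PIController(kp=0.5, ki=0.1)
estimated, measured = 0.60, 0.80
correction = controller.update(measured - estimated)
print(round(estimated + correction, 3))
```

The integral term accumulates persistent estimation error, so a model that is consistently biased is corrected over successive updates rather than only in proportion to the latest error.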

17.

Extending Proxy Caching Capability: Issues and Performance

Wei Hao Jicheng Fu Jiang He I-Ling Yen Farokh Bastani Ing-Ray Chen 《World Wide Web》2006,9(3):253-275

Proxy caching is an effective approach to reduce the response latency to client requests, web server load, and network traffic. Recently there has been a major shift in the usage of the Web. Emerging web applications require an increasing amount of server-side processing. Current proxy protocols do not support caching and execution of web processing units. In this paper, we present a weblet environment, in which processing units on web servers are implemented as weblets. These weblets can migrate from web servers to proxy servers to perform required computation and provide faster responses. A weblet engine is developed to provide the execution environment on proxy servers as well as web servers to facilitate uniform weblet execution. We have conducted thorough experimental studies to investigate the performance of the weblet approach. We modify the industry-standard e-commerce benchmark TPC-W to fit the weblet model and use its workload model for performance comparisons. The experimental results show that the weblet environment significantly improves system performance in terms of client response latency, web server throughput, and workload. Our prototype weblet system also demonstrates the feasibility of integrating the weblet environment with current web/proxy infrastructure.

18.

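HTTP cache server behavior optimization for live HTTP Streaming systems

李云飞 谢伟凯 鲁晨平 张智强 申瑞民 《计算机工程与应用》 (Computer Engineering and Applications) 2012,48(10):68-74

HTTP cache servers are key to increasing the number of concurrent clients an HTTP Streaming system can support. However, current mainstream HTTP cache servers, such as Nginx, Squid, and Varnish, all behave inadequately while a cached resource is being updated: when used in a live HTTP Streaming system, they periodically forward large numbers of client requests to the origin server, which limits the scalability of the HTTP Streaming system. This paper proposes an optimized behavior for HTTP cache servers during cache updates: the cache server forwards only one client request to the origin server and, while the cache is being updated, rejects other requests for that resource. The optimization is implemented on top of Nginx, the most widely used of these servers. Experiments show that the scalability of the optimized system improves significantly.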
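The proposed update behavior can be sketched as a small state machine. This is an illustrative model only, not Nginx code or configuration: the class, method names, and return labels are hypothetical, and expiry is triggered manually through `invalidate`.

```python
class SingleForwardCache:
    """Hypothetical cache that forwards one request per resource while updating."""

    def __init__(self):
        self.store = {}        # resource key -> cached body
        self.updating = set()  # keys with an origin fetch in flight

    def handle(self, key):
        if key in self.store:
            return ("HIT", self.store[key])
        if key in self.updating:
            return ("REJECT", None)   # an update is already in progress
        self.updating.add(key)
        return ("FORWARD", None)      # the only request sent to the origin

    def complete_update(self, key, body):
        self.store[key] = body
        self.updating.discard(key)

    def invalidate(self, key):
        self.store.pop(key, None)     # e.g. a live playlist has expired


cache = SingleForwardCache()
print(cache.handle("live.m3u8"))      # first miss is forwarded
print(cache.handle("live.m3u8"))      # concurrent request is rejected
cache.complete_update("live.m3u8", b"#EXTM3U")
print(cache.handle("live.m3u8"))      # later requests are served from cache
```

However many clients ask for the resource during an update, the origin sees a single request, which is what keeps origin load independent of audience size in a live streaming setting.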

19.

Exploiting Regularities in Web Traffic Patterns for Cache Replacement (cited by: 2; self-citations: 0; citations by others: 2)

Cohen Kaplan 《Algorithmica》2002,33(3):300-334

Abstract. Caching web pages at proxies and in web servers' memories can greatly enhance performance. Proxy caching is known to reduce network load and both proxy and server caching can significantly decrease latency. Web caching problems have different properties than traditional operating systems caching, and cache replacement can benefit by recognizing and exploiting these differences. We address two aspects of the predictability of traffic patterns: the overall load experienced by large proxy and web servers, and the distinct access patterns of individual pages. We formalize the notion of "cache load" under various replacement policies, including LRU and LFU, and demonstrate that the trace of a large proxy server exhibits regular load. Predictable load allows for improved design, analysis, and experimental evaluation of replacement policies. We provide a simple and (near) optimal replacement policy when each page request has an associated distribution function on the next request time of the page. Without the predictable load assumption, no such online policy is possible and it is known that even obtaining an offline optimum is hard. For experiments, predictable load enables comparing and evaluating cache replacement policies using partial traces, containing requests made to only a subset of the pages. Our results are based on considering a simpler caching model which we call the interval caching model. We relate traditional and interval caching policies under predictable load, and derive (near)-optimal replacement policies from their optimal interval caching counterparts.
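LRU, one of the baseline replacement policies named in the abstract, can be written in a few lines. The sketch below is the standard policy, not the (near) optimal policy the paper derives; the class name and the count-based capacity (rather than a byte budget) are simplifying assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Least-recently-used page cache with a fixed number of slots."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._pages = OrderedDict()   # oldest entry first

    def get(self, url):
        if url not in self._pages:
            return None               # miss
        self._pages.move_to_end(url)  # mark as most recently used
        return self._pages[url]

    def put(self, url, body):
        self._pages[url] = body
        self._pages.move_to_end(url)
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=2)
cache.put("/a", "A")
cache.put("/b", "B")
cache.get("/a")          # "/a" becomes the most recently used page
cache.put("/c", "C")     # evicts "/b"
print(cache.get("/b"))   # None
```

LRU looks only at past accesses; the abstract's point is that when the next-request-time distribution of each page is known and load is predictable, a policy can do better than such history-only rules.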

20.

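Scalable Web request routing with MPLS

周颖 赵岳松 《计算机工程》 (Computer Engineering) 2003,29(16):172-174

This paper analyzes related Web dispatcher technologies and proposes a novel architecture for improving Web request routing in server and cache clusters. The MPLS scheme maps application-layer information onto layer-2 labels to support sophisticated request-routing functions without incurring the TCP connection termination bottleneck. Client-side proxy servers must take part by assigning the appropriate label to each client request, and off-the-shelf MPLS switches can then perform the dispatching. The dispatcher is allowed to perform some key functions in order to achieve scalability.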