1.

Adaptive-level memory caches on World Wide Web servers

《Computer Networks》2000,32(3):261-275

Owing to the fast growth of the World Wide Web (WWW), web traffic has become a major component of Internet traffic. Consequently, reducing document retrieval latency on the WWW is becoming increasingly important. The latency can be reduced in two ways: reducing network delay and improving web servers’ throughput. Our research aims at improving a web server’s throughput by keeping a memory cache in the web server’s address space. In this paper, we focus on the design and implementation of a memory cache scheme. We propose a novel web cache management policy, named the adaptive-level policy, that caches either the whole file content or only a portion of it, according to the file size. The experimental results show three things. First, our memory cache is beneficial: under our experimental workloads, the throughput improvement can reach 32.7%. Second, our cache management policy is suitable for current web traffic. Third, with the increasing popularity of multimedia files, our policy will outperform others currently used in WWW.
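The size-based decision in the abstract above can be sketched as follows. This is only an illustration: the abstract gives no concrete thresholds, so `WHOLE_FILE_LIMIT`, `PREFIX_BYTES`, `CacheEntry`, and `make_entry` are hypothetical names and values, and keeping the leading bytes of a large file is an assumption about which portion is cached.

```python
from dataclasses import dataclass

# Hypothetical parameters; the abstract does not state concrete values.
WHOLE_FILE_LIMIT = 64 * 1024   # files up to this size are cached whole
PREFIX_BYTES = 16 * 1024       # larger files keep only this many leading bytes


@dataclass
class CacheEntry:
    data: bytes       # bytes actually held in memory
    full_size: int    # size of the complete file

    @property
    def is_partial(self) -> bool:
        # True when only a portion of the file is cached.
        return len(self.data) < self.full_size


def make_entry(content: bytes) -> CacheEntry:
    """Cache small files whole and large files only as a leading portion."""
    if len(content) <= WHOLE_FILE_LIMIT:
        return CacheEntry(data=content, full_size=len(content))
    return CacheEntry(data=content[:PREFIX_BYTES], full_size=len(content))
```

A server using such an entry would presumably send the cached portion from memory and read the remainder from disk; that step is not shown here.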

2.

A neural network proxy cache replacement strategy and its implementation in the Squid proxy server

Sam Romano Hala ElAarag 《Neural computing & applications》2011,20(1):59-78

As the Internet has become more central to information technology, concerns have grown about supplying enough bandwidth and serving web requests to end users in an appropriate time frame. Web caching was introduced in the 1990s to help decrease network traffic, lessen user-perceived lag, and reduce loads on origin servers by storing copies of web objects on servers closer to end users, as opposed to forwarding all requests to the origin servers. Since web caches have limited space, they must effectively decide which objects are worth caching and which should be replaced by other objects. This problem is known as cache replacement. We used neural networks to solve this problem and proposed the Neural Network Proxy Cache Replacement (NNPCR) method. The goal of this research is to implement NNPCR in a real environment such as the Squid proxy server. To do so, we propose an improved strategy of NNPCR, referred to as NNPCR-2. We show how the improved model can be trained with up to twelve times more data and gain a 5–10% increase in Correct Classification Ratio (CCR) over NNPCR. We implemented NNPCR-2 in the Squid proxy server and compared it with four other cache replacement strategies. In this paper, we use 84 times more data than NNPCR was tested against and present exhaustive test results for NNPCR-2 with different trace files and neural network structures. Our results demonstrate that NNPCR-2 made important, balanced decisions in relation to the hit rate and byte hit rate, the two performance metrics most commonly used to measure the performance of web proxy caches.
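The two metrics named at the end of this abstract can be made concrete with a small trace replay. The sketch below uses plain LRU as a stand-in replacement policy, not NNPCR or NNPCR-2; the function name `replay` and the `(url, size)` trace format are assumptions made for illustration.

```python
from collections import OrderedDict


def replay(trace, capacity_bytes):
    """Replay a list of (url, size) requests through an LRU cache.

    Returns (hit_rate, byte_hit_rate). LRU is a stand-in policy here,
    not the neural-network method described in the abstract.
    """
    cache = OrderedDict()            # url -> size, least recently used first
    used = 0
    hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)   # mark as most recently used
            continue
        if size > capacity_bytes:
            continue                 # object can never fit, so skip caching it
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict the LRU object
            used -= evicted_size
        cache[url] = size
        used += size
    n = len(trace)
    hit_rate = hits / n if n else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_rate, byte_hit_rate
```

Hit rate counts requests served from the cache, while byte hit rate weights each hit by object size, so a policy that favors many small objects can score well on the first metric and poorly on the second.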

3.

Design and implementation of an efficient web cluster with content-based request distribution and file caching

Mei-Ling Chiang Yu-Chen Lin 《Journal of Systems and Software》2008,81(11):2044-2058

We have implemented an efficient and scalable web cluster named LVS-CAD/FC (i.e. LVS with Content-Aware Dispatching and File Caching). In LVS-CAD/FC, a kernel-level one-way content-aware web switch based on TCP Rebuilding is implemented to examine and distribute the HTTP requests from clients to web servers, and the fast Multiple TCP Rebuilding is implemented to efficiently support persistent connections. In addition, a file-based web cache stores a small set of the most frequently accessed web files in server RAM to reduce disk I/Os, and a light-weight redirect method is developed to efficiently redirect requests to this cache. In this paper, we further propose new policies for content-based, workload-aware request distribution, in which the web switch considers the content of requests and workload characterization in request dispatching. In particular, web files with higher access frequencies are duplicated in more servers’ file-based caches, so that hot web files can be served by more servers. Our goals are to improve cluster performance by obtaining better memory utilization and increasing the cache hit rates while achieving load balancing among servers. Experimental results of a practical implementation on Linux show that LVS-CAD/FC is efficient and scales well. Moreover, LVS-CAD/FC with the proposed policies can achieve 66.89% better performance than the Linux Virtual Server with a content-blind web switch.

4.

Architecture of a Web server accelerator

《Computer Networks》2002,38(1):75-97

We describe the design, implementation and performance of a high-performance Web server accelerator which runs on an embedded operating system and improves Web server performance by caching data. It can serve Web data at rates an order of magnitude higher than that which would be achieved by a high-performance Web server running on similar hardware under a conventional operating system such as Unix or NT. The superior performance of our system results in part from its highly optimized communications stack. In order to maximize hit rates and maintain updated caches, our accelerator provides an API which allows application programs to explicitly add, delete, and update cached data. The API allows our accelerator to cache dynamic as well as static data. We describe how our accelerator can be scaled to multiple processors to increase performance and availability. The basic design alternatives include a content router or a TCP router (without content routing) in front of a set of Web cache accelerator nodes, with the cache memory distributed across the accelerator nodes. Content-based routing reduces cache node CPU cycles but can make the front-end router a bottleneck. With the TCP router, a request for a cached object may initially be sent to the wrong cache node; this costs more cache node CPU cycles, but can provide a higher aggregate throughput, because the TCP router becomes a bottleneck at a higher throughput than the content router. We quantify the throughput ranges in which different designs are preferable. We also examine a combination of content-based and TCP routing techniques. In addition, we present statistics from critical deployments of our accelerator for improving performance at highly accessed Sporting and Event Web sites hosted by IBM.

5.

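Research on a tightly coupled cooperation mechanism for streaming media caching proxies

杨传栋 余镇危 王行刚 《计算机工程》2006,32(17):167-169

Streaming media caching proxy servers located between the Internet backbone and the same access network can cooperate with one another to raise the cache hit rate and keep the load balanced. This paper proposes a tightly coupled multi-proxy cooperation mechanism with shared cache space, and gives a cache replacement policy and a load balancing algorithm for multi-proxy cooperation. Simulation with NS2 verifies that the mechanism lets the system maintain better performance.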

6.

Performance aspects of distributed caches using TTL-based consistency

《Theoretical computer science》2005,331(1):73-96

The web is the largest distributed database deploying time-to-live-based weak consistency. Each object has a lifetime duration assigned to it by its origin server. A copy of the object fetched from its origin server is received with a maximum time-to-live (TTL) that equals its lifetime duration. In contrast, a copy obtained through a cache has a shorter TTL, since the age (elapsed time since it was fetched from the origin) is deducted from its lifetime duration. A request served by a cache constitutes a hit if the cache has a fresh copy of the object. Otherwise, the request is considered a miss and is propagated to another server. It is evident that the number of cache misses depends on the age of the copies the cache receives. Thus, a cache that sends requests to another cache would suffer more misses than a cache that sends requests directly to an authoritative server. In this paper, we model and analyze the effect of age on the performance of various cache configurations. We consider a low-level cache that fetches objects either from their origin servers or from other caches and analyze its miss rate as a function of its fetching policy. We distinguish between three basic fetching policies, namely, fetching always from the origin, fetching always from the same high-level cache, and fetching from a “random” high-level cache. We explore the relationships between these policies in terms of the miss rate achieved by the low-level cache, both on worst-case sequences and on sequences generated using particular probability distributions. Guided by web caching practice, we consider two variations of the basic policies. In the first variation, the high-level cache uses pre-term refreshes to keep a copy with lower age. In the second variation, the low-level cache uses an extended lifetime duration. We analyze how these variations affect the miss rates. Our theoretical results help to understand how age may affect the miss rate, and imply guidelines for improving the performance of web caches.
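The age bookkeeping described in this abstract (a copy's remaining TTL is its lifetime duration minus its age) can be sketched as below. The names `Copy`, `fetch_from_origin`, and `fetch_from_cache` are hypothetical, and times are plain numbers in arbitrary units; this models only the TTL arithmetic, not the paper's analysis.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    lifetime: float   # lifetime duration assigned by the origin server
    age: float        # time elapsed since the copy was fetched from the origin

    @property
    def ttl(self) -> float:
        # Remaining time-to-live: lifetime minus age, never negative.
        return max(0.0, self.lifetime - self.age)

    @property
    def fresh(self) -> bool:
        return self.ttl > 0.0


def fetch_from_origin(lifetime: float) -> Copy:
    """A copy fetched from the origin starts with age 0, so TTL == lifetime."""
    return Copy(lifetime=lifetime, age=0.0)


def fetch_from_cache(upstream: Copy, held_for: float) -> Copy:
    """A copy obtained from a cache inherits the age accumulated upstream."""
    return Copy(lifetime=upstream.lifetime, age=upstream.age + held_for)
```

For example, with a lifetime of 60, a copy taken from a high-level cache that has already held the object for 45 has only 15 left, so the low-level cache will miss sooner than if it had fetched from the origin.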

7.

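A centrally managed Web cache system and its performance analysis (total citations: 5; self-citations: 0; citations by others: 5)

姜彩萍 李子木 杨凤杰 《小型微型计算机系统》2004,25(8):1428-1431

Sharing cached files is an important way to reduce network traffic and server load. After introducing Web caching technology and the popular Web cache communication protocol ICP, this paper proposes a centrally managed Web cache system. The system distributes users' HTTP requests, according to a certain algorithm, to a suitable cache server in the system, thereby eliminating the heavy communication overhead and cache processing burden among the servers inside the cache system and reducing the redundancy of cached content. Analysis shows that, compared with a simple ICP-based cache system, the centrally managed Web cache system offers higher caching efficiency, lower processing overhead, and lower latency, and that it scales well.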

8.

Performance of One's Complement Caches

Qing Yang Sridar Adina T. Sun 《Journal of Parallel and Distributed Computing》1998,48(2):143

On-chip caches to reduce average memory access latency are commonplace in today's commercial microprocessors. These on-chip caches generally have low associativity and small cache sizes. Cache line conflicts are the main source of cache misses, which are critical for overall system performance. This paper introduces an innovative design for on-chip data caches of microprocessors, called one's complement cache. While binary complement numbers have been successfully used in designing arithmetic units, to the best of our knowledge, no one has ever considered using such complement numbers in cache memory designs. This paper will show that such complement numbers help greatly in reducing cache misses in a data cache, thereby improving data cache performance. By parallel computation of cache addresses and memory addresses, the new design does not increase the critical hit time of cache accesses. Cache misses caused by line interference are reduced by evenly distributing data items referenced by program loops across all sets in a cache. Even distribution of data in the cache is achieved by making the number of sets in the cache a prime or an odd number, so that the chance of related data being mapped to the same set is small. Trace-driven simulations are used to evaluate the performance of the new design. Performance results on benchmarks show that the new design improves cache performance significantly with negligible additional hardware cost.
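The claim that a prime or odd number of sets spreads loop-referenced data more evenly can be illustrated with simple modular arithmetic. The sketch below does not model the one's complement address computation itself; `sets_touched` is a hypothetical helper, and the stride and set counts are illustrative values only.

```python
def sets_touched(num_sets: int, stride: int, count: int) -> int:
    """Count distinct cache sets hit by `count` accesses at a fixed stride.

    Addresses are expressed in cache lines, and the set index is taken as
    (line address) mod (number of sets).
    """
    return len({(i * stride) % num_sets for i in range(count)})


# A stride of 16 lines collides heavily when the set count is a power of two,
# but spreads across every set when the set count is prime.
power_of_two = sets_touched(64, 16, 64)   # 4 distinct sets
prime = sets_touched(61, 16, 64)          # 61 distinct sets
```

With 64 sets the 64 accesses pile into 4 sets, whereas with 61 sets they reach all 61, because 16 and 61 share no common factor.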

9.

Tulip: A New Hash Based Cooperative Web Caching Architecture

Zhiyong Xu Laxmi Bhuyan Yiming Hu 《The Journal of supercomputing》2006,35(3):301-320

With the exponential growth of WWW traffic, web proxy caching has become a critical technique for Internet web services. Well-organized proxy caching systems with multiple servers can greatly reduce user-perceived latency and decrease network bandwidth consumption. Thus, many research papers have focused on improving web caching performance with efficient coordination algorithms among multiple servers. Hash-based algorithms are the most widely used server coordination mechanism; however, many technical issues still need to be addressed. In this paper, we propose a new hash-based web caching architecture, Tulip. Tulip aggregates web objects that are likely to be accessed together into object clusters and uses object clusters as the primary access units. Tulip extends the locality-based algorithm in UCFS to hash-based web proxy systems and proposes a simple algorithm to reduce the data grouping overhead. It takes into consideration the access-speed disparity between memory and disk and replaces expensive small disk I/Os with fewer large ones. When a client request cannot be fulfilled from the server's memory, the system fetches the whole cluster that contains the required object into memory, so that future requests for other objects in the same cluster can be satisfied directly from memory and slow disk I/Os are avoided. It also introduces a simple and efficient data duplication algorithm, so that little maintenance work needs to be done when a server joins, leaves, or fails. Along with the local caching strategy, Tulip achieves better fault tolerance and load balancing capability at minimal cost. Our simulation results show that Tulip performs better than previous approaches.

10.

Second-level buffer cache management (total citations: 2; self-citations: 0; citations by others: 2)

Zhou Y. Chen Z. Li K. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(6):505-519

Buffer caches are commonly used in servers to reduce the number of slow disk accesses or network messages. These buffer caches form a multilevel buffer cache hierarchy. In such a hierarchy, second-level buffer caches have different access patterns from first-level buffer caches because accesses to a second-level cache are actually misses from a first-level cache. Therefore, commonly used cache management algorithms, such as the least recently used (LRU) replacement algorithm, that work well for single-level buffer caches may not work well for second-level caches. We investigate multiple approaches to effectively manage second-level buffer caches. In particular, we report our research results in 1) second-level buffer cache access pattern characterization, 2) a new local algorithm called multi-queue (MQ) that performs better than nine tested alternative algorithms for second-level buffer caches, 3) a set of global algorithms that manage a multilevel buffer cache hierarchy globally and significantly improve second-level buffer cache hit ratios over corresponding local algorithms, and 4) implementation and evaluation of these algorithms in a real storage system connected with commercial database servers (Microsoft SQL Server and Oracle) running industrial-strength online transaction processing benchmarks.

11.

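Design and implementation of a cache strategy for an embedded browser

胡贯荣 阳富民 《计算机工程与设计》2005,26(12):3362-3364

This paper presents an implementation strategy for an embedded browser cache. Network data are classified and buffered sensibly using in-memory caching. In addition, a simple and feasible cache eviction policy based on page structure and access information makes full use of cache resources and gives the system good performance.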

12.

Replication for Load Balancing and Hot-Spot Relief on Proxy Web Caches with Hash Routing (total citations: 1; self-citations: 0; citations by others: 1)

Kun-Lung Wu Philip S. Yu 《Distributed and Parallel Databases》2003,13(2):203-220

Hash routing is an emerging approach to coordinating a collection of collaborative proxy caches. Hash routing partitions the entire URL space among the proxy caches. Each partition is assigned to a cache server. Duplication of cache contents is eliminated. Client requests to a cache server for non-assigned-partition objects are forwarded to proper sibling caches. In the presence of access skew, the load level of the cache servers can be quite unbalanced, limiting the benefits of hash routing. We examine an adaptable controlled replication (ACR) of non-assigned-partition objects in each cache server to reduce the load imbalance and relieve the problem of hot-spot references. Trace-driven simulations are conducted to study the effectiveness of ACR. The results show that (1) access skew exists, and the load of the cache servers tends to be unbalanced in hash routing; (2) with a relatively small amount of ACR, say 10% of the cache size, significant improvements in load balance can be achieved; (3) ACR provides a very effective remedy for load imbalance due to hot-spot references; and (4) increasing the cache size does not improve load balance unless replication is allowed.
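The basic hash routing step in this abstract (partition the URL space, forward requests for non-assigned objects to the owning sibling) might look like the sketch below. The hash function, the modulo partitioning, and the names `owner` and `route` are assumptions for illustration; the abstract does not specify them, and the ACR replication itself is not modeled.

```python
import hashlib


def owner(url: str, caches: list[str]) -> str:
    """Map a URL to the single cache server that owns its partition."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(caches)
    return caches[index]


def route(url: str, receiving_cache: str, caches: list[str]) -> str:
    """Decide what a cache does with a request it has just received."""
    home = owner(url, caches)
    if home == receiving_cache:
        return "serve locally"
    # Non-assigned-partition object: forward to the owning sibling cache.
    return f"forward to {home}"
```

Because every URL has exactly one owner, a few very popular URLs can overload a single server, which is the hot-spot problem the controlled replication in the abstract is meant to relieve.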

13.

Design a progressive video caching policy for video proxy servers

Wei-hsiu Ma Du D.H.C. 《Multimedia, IEEE Transactions on》2004,6(4):599-610

Proxy servers have been used to cache web objects to alleviate the load on web servers and to reduce network congestion on the Internet. In this paper, a central video server is connected to a proxy server via wide area networks (WANs) and the proxy server can reach many clients via local area networks (LANs). We assume a video can be either entirely or partially cached in the proxy to reduce WAN bandwidth consumption. Since the storage space and the sustained disk I/O bandwidth are limited resources in the proxy, how to efficiently utilize these resources to maximize the WAN bandwidth reduction is an important issue. We design a progressive video caching policy in which each video can be cached at several levels corresponding to cached data sizes and required WAN bandwidths. For a video, the proxy server decides whether to cache a smaller amount of data at a lower level or to gradually accumulate more data to reach a higher level. The proposed progressive caching policy allows the proxy to adjust the amount cached for each video based on its resource condition and the user access pattern. We investigate scenarios in which the access pattern is known a priori or unknown, and evaluate the effectiveness of the caching policy.

14.

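A coded caching scheme for distributed network storage based on a balanced data placement strategy

陈雪 胡玉平 《计算机应用研究》2020,37(4):1194-1199

To keep the load on network storage balanced and to avoid unrecoverable loss when a node or disk fails, this paper proposes a coded caching scheme for distributed network storage based on a balanced data placement strategy, with separate solutions for large caches and small caches. First, the Maddah scheme is extended to a multi-server system and combined with the balanced data placement strategy, storing each file as a single unit on a data server, which addresses the large-cache case. Next, the interference elimination scheme is extended to a multi-server system; it is used to lower the peak rate of the cache and, combined with the balanced data placement strategy, linear combinations of cache segments are proposed, which addresses the small-cache case. Finally, simulation experiments are run with the Linux-based NS2 simulator on systems with one and with two parity servers. The simulation results show that the proposed scheme effectively lowers the peak transmission rate and performs better than two other recent caching schemes. In addition, although distributed storage limits the ability to combine content from different servers into a single message, which costs the coded caching scheme some performance, it can take full advantage of the redundancy inherent in a distributed storage system and thereby improve the performance of the storage system.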

15.

Shared memory multiprocessor architectures for software IP routers

Luo Y. Laxmi Narayan Bhuyan Chen X. 《Parallel and Distributed Systems, IEEE Transactions on》2003,14(12):1240-1249

We propose new shared memory multiprocessor architectures and evaluate their performance for future Internet protocol (IP) routers based on symmetric multiprocessor (SMP) and cache coherent nonuniform memory access (CC-NUMA) paradigms. We also propose a benchmark application suite, RouterBench, which consists of four categories of applications representing key functions on the time-critical path of packet processing in routers. An execution-driven simulation environment is created to evaluate SMP and CC-NUMA router architectures using RouterBench. The execution-driven simulation can produce accurate cycle-level execution time predictions and reveal the impact of various architectural parameters on the performance of routers. We port the FUNET trace and its routing table for use in our experiments. We find that the CC-NUMA architecture provides excellent scalability for the design of high-performance IP routers. Results also show that the CC-NUMA architecture can sustain good lookup performance, even at a high frequency of route updates.

16.

Cost-based cache replacement and server selection for multimedia proxy across wireless Internet

Qian Zhang Zhe Xiang Wenwu Zhu Lixin Gao 《Multimedia, IEEE Transactions on》2004,6(4):587-598

Multimedia proxies play an important role in multimedia streaming over the wireless Internet. Since wireless networks exhibit different characteristics from the Internet, multimedia proxy caching over the wireless Internet faces additional challenges. In this paper, we present a study of cache replacement for a single server and server selection for multiple servers across the wireless Internet. By considering multiple objectives of a multimedia proxy, we design a unified cost metric to measure proxy performance in the wireless Internet. Based on this unified cost metric, we propose a novel replacement algorithm for a single server and a new server-selection policy for multiple servers to improve end-to-end performance such as throughput, media quality, and start-up latency. To effectively handle errors that occur on the wireless link, channel-adaptive unequal error protection is deployed according to the distinct quality of service requirements of layered or scalable media. Simulation results demonstrate that our approaches achieve significantly better performance than the known cache-replacement algorithms and server-selection schemes, respectively.

17.

Cacheminer: A runtime approach to exploit cache locality on SMP

Yong Yan Xiaodong Zhang 《Parallel and Distributed Systems, IEEE Transactions on》2000,11(4):357-374

Exploiting cache locality of parallel programs at runtime is a complementary approach to compiler optimization. This is particularly important for applications with dynamic memory access patterns. We propose a memory-layout oriented technique to exploit cache locality of parallel loops at runtime on Symmetric Multiprocessor (SMP) systems. Guided by application-dependent and targeted architecture-dependent hints, our system, called Cacheminer, reorganizes and partitions a parallel loop using the memory-access space of its execution. Through effective runtime transformations, our system maximizes the data reuse in each partitioned data region assigned to a cache, and minimizes the data sharing among the partitioned data regions assigned to all caches. The executions of tasks in the partitions are scheduled in an adaptive and locality-preserved way to minimize the execution time of programs by trading off load balance and locality. We have implemented the Cacheminer runtime library on two commercial SMP servers and a SimCS-simulated SMP. Our simulation and measurement results show that our runtime approach can achieve performance comparable to compiler optimizations for programs with regular computation and memory-access patterns, whose load balance and cache locality can be well optimized by tiling and other program transformations. Moreover, our experimental results show that our approach is able to significantly improve memory performance for applications with irregular computation and dynamic memory access patterns. These types of programs are usually hard to optimize with static compiler optimizations.

18.

Performance benefits of non-volatile caches in distributed file systems

P. Biswas D. Towsley K. K. Ramakrishnan C. M. Krishna 《Concurrency and Computation》1994,6(4):289-323

We study the use of non-volatile memory for caching in distributed file systems. This provides an advantage over traditional distributed file systems in that the load is reduced at the server without making the data vulnerable to failures. We propose the use of a small non-volatile cache for writes, at the client and the file server, together with a larger volatile read cache to keep the cost of the caches reasonable. We use a synthetic workload developed from analysis of file I/O traces from commercial production systems and use a detailed simulation of the distributed environment. The service times for the resources of the system were derived from measurements performed on a typical workstation. We show that non-volatile write caches at the clients and the file server reduce the write response time and the load on the file server dramatically, thus improving the scalability of the system. We examine the comparative benefits of two alternative writeback policies for the non-volatile write cache. We show that a proposed threshold-based writeback policy is more effective than a periodic writeback policy under heavy load. We also investigate the effect of varying the write cache size and show that introducing a small non-volatile cache at the client in conjunction with a moderate-sized non-volatile server write cache improves the write response time by a factor of four at all load levels.
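One plausible reading of a threshold-based writeback policy is that dirty blocks accumulate in the write cache and are flushed once their number reaches a threshold, instead of on a fixed timer. The abstract does not give the actual rule, so the sketch below is an assumption throughout: `ThresholdWriteCache`, its `threshold` parameter, and the `flush` callback are hypothetical names.

```python
from typing import Callable, Dict


class ThresholdWriteCache:
    """Write cache that flushes when the dirty-block count hits a threshold.

    Hypothetical sketch: the abstract does not define the policy in detail.
    """

    def __init__(self, threshold: int, flush: Callable[[Dict[int, bytes]], None]):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.flush = flush                   # writes a batch of blocks to the server
        self.dirty: Dict[int, bytes] = {}    # block id -> latest data

    def write(self, block_id: int, data: bytes) -> None:
        # Overwrites of the same block are absorbed in the cache.
        self.dirty[block_id] = data
        if len(self.dirty) >= self.threshold:
            self.flush(dict(self.dirty))     # hand over a copy of the batch
            self.dirty.clear()
```

A periodic policy would instead call `flush` on a timer regardless of how many blocks are dirty; comparing the two under load is what the abstract's simulation study addresses.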

19.

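Research on the cache consistency problem in Web site cache design (total citations: 4; self-citations: 0; citations by others: 4)

章强 《计算机工程与设计》2003,24(4):25-29

Starting from the mechanisms by which Web site caching is implemented, this paper analyzes how cache consistency affects the efficiency of Web site caching. It focuses on a cache design, implemented with a server cluster, that treats the Web site as the unit of caching: lookup and replacement operate on whole Web sites rather than on individual files, so that the principle of cache consistency is better reflected in the cache design.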

20.

Network-aware partial caching for Internet streaming media

Shudong Jin Azer Bestavros Arun Iyengar 《Multimedia Systems》2003,9(4):386-396

The delivery of multimedia over the Internet is affected by adverse network conditions such as high packet loss rate and long delay. This paper aims at mitigating such effects by leveraging client-side caching proxies. We present a novel cache architecture and associated cache management algorithms that turn edge caches into accelerators of streaming media delivery. This architecture allows partial caching of media objects and joint delivery from caches and origin servers. Most importantly, the caching algorithms are both network-aware and stream-aware; they take into account the popularity of streaming media objects, their bit rate requirements, and the available bandwidth between clients and servers. Using Internet bandwidth models derived from proxy cache logs and measured over real Internet paths, we have conducted extensive simulations to evaluate the performance of various cache management algorithms. Our experiments demonstrate that network-aware caching algorithms can significantly reduce startup delay and improve stream quality. Our experiments also show that partial caching is particularly effective when bandwidth variability is not very high. Correspondence to: Shudong Jin. This research was supported in part by NSF (awards ANI-9986397, ANI-0095988, ANI-0205294 and EJA-0202067) and by IBM. Part of this work was done while the first author was at IBM Research in 2001.