Caching for Web Searching期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

Caching for Web Searching

Authors:	Kalyanasundaram Noga Pruhs Woeginger

Affiliation:	(1) Computer Science Department, Georgetown University, Washington, DC 20057-1232, USA. kalyan@cs.georgetown.edu., US;(2) Department of Mathematics, Technical University of Graz, Steyrergasse 30, A-8010 Graz, Austria. {noga,gwoegi}@opt.math.tu-graz.ac.at., AT;(3) Computer Science Department, University of Pittsburgh, Pittsburgh, PA 15260, USA. kirk@cs.pitt.edu., US

Abstract:	Abstract. We study Web Caching when the input sequence is a depth first search traversal of some tree. There are at least two good motivations for investigating tree traversal as a search technique on the WWW: First, empirical studies of people browsing and searching the WWW have shown that user access patterns commonly are nearly depth first traversals of some tree. Secondly (as we will show in this paper), the problem of visiting all the pages on some WWW site using anchor clicks (clicks on links) and back button clicks—by far the two most common user actions—reduces to the problem of how best to cache a tree traversal sequence (up to constant factors). We show that for tree traversal sequences the optimal offline strategy can be computed efficiently. In the bit model, where the access time of a page is proportional to its size, we show that the online algorithm LRU is (1 + 1/ɛ) -competitive against an adversary with unbounded cache as long as LRU has a cache of size at least (1+ ɛ) times the size of the largest item in the input sequence. In the general model, where pages have arbitrary access times and sizes, we show that in order to be constant competitive, any online algorithm needs a cache large enough to store Ω(log n) pages; here n is the number of distinct pages in the input sequence. We provide a matching upper bound by showing that the online algorithm Landlord is constant competitive against an adversary with an unbounded cache if Landlord has a cache large enough to store the Ω(log n) largest pages. This is further theoretical evidence that Landlord is the ``right' algorithm for Web Caching.

Keywords:	, Web Caching, Greedy-Dual-Size, Web searching, Landlord, Least recently used, LRU,
本文献已被 SpringerLink 等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏