Similar literature
20 similar documents found
1.
Prefetching is an important technique for reducing average Web access latency. Existing prefetching methods are based mostly on URL graphs. They use the graphical nature of HTTP links to determine the possible paths through a hypertext system. Although the URL graph-based approaches are effective in prefetching frequently accessed documents, few of them can prefetch URLs that are rarely visited. The paper presents a keyword-based semantic prefetching approach to overcome this limitation. It predicts future requests based on semantic preferences of past retrieved Web documents. We apply this technique to Internet news services and implement a client-side personalized prefetching system: NewsAgent. The system exploits semantic preferences by analyzing keywords in URL anchor text of previously accessed documents in different news categories. It employs a neural network model over the keyword set to predict future requests. The system features a self-learning capability and good adaptability to changes in client surfing interest. NewsAgent does not exploit keyword synonymy, in order to keep prefetching conservative. However, it alleviates the impact of keyword polysemy by taking into account server-provided categorical information in decision-making and, hence, captures more semantic knowledge than term-document literal matching methods. Experimental results from daily browsing of the ABC News, CNN, and MSNBC news sites over a period of three months show that prefetching achieves a hit ratio of up to 60 percent.

2.
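Sequential pattern mining can discover the user access regularities hidden in Web logs and can be used in a Web prefetching model to predict the Web objects that will be accessed next. Most current sequential pattern mining algorithms are Apriori-based breadth-first algorithms. This paper proposes a bitmap-based depth-first mining algorithm that adopts a depth-first strategy built on a trie (dictionary tree) data structure and uses bitmaps to store and compute the support of each sequence, so that frequent sequences can be mined relatively quickly. The algorithm is applied to a Web prefetching model, and experiments under integrated prefetching and caching show that it performs well.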

3.
Web prefetching is an attractive solution to reduce the network resources consumed by Web services as well as the access latencies perceived by Web users. Unlike Web caching, which exploits temporal locality, Web prefetching utilizes the spatial locality of Web objects. Specifically, Web prefetching fetches objects that are likely to be accessed in the near future and stores them in advance. In this context, a sophisticated combination of these two techniques may significantly improve the performance of the Web infrastructure. Considering that several caching policies have been proposed in the past, the challenge is to extend them by using data mining techniques. In this paper, we present a clustering-based prefetching scheme where a graph-based clustering algorithm identifies clusters of “correlated” Web pages based on the users’ access patterns. This scheme can be integrated easily into a Web proxy server, improving its performance. Through a simulation environment, using a real data set, we show that the proposed integrated framework is robust and effective in improving the performance of the Web caching environment.

4.
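A Survey of Web Prefetching Techniques   Total citations: 11, self-citations: 0, citations by others: 11
Web prefetching is one of the key techniques for reducing user access latency and improving network quality of service, and in recent years it has become a research focus in China and abroad. By exploiting the spatial locality of WWW accesses, Web prefetching extends caching from temporal locality to spatial locality. This survey classifies Web prefetching techniques, summarizes and compares the strengths and limitations of each class, presents the basic framework of a prefetching model and the main function of each component, and describes the various evaluation criteria in detail. It also analyzes several typical existing prefetching algorithms in depth and systematically compares their advantages and disadvantages. Finally, it points out research directions for Web prefetching in terms of online operation, cooperative prefetching, dynamic popularity, user session partitioning, and combining semantics-based with path-based approaches.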

5.
Integrating Web Prefetching and Caching Using Prediction Models   Total citations: 2, self-citations: 0, citations by others: 2
Yang, Qiang; Zhang, Henry Hanning. World Wide Web, 2001, 4(4): 299-321
Web caching and prefetching have been studied separately in the past. In this paper, we present an integrated architecture for Web object caching and prefetching. Our goal is to design a prefetching system that can work with an existing Web caching system in a seamless manner. In this integrated architecture, a certain amount of caching space is reserved for prefetching. To empower the prefetching engine, a Web-object prediction model is built by mining the frequent paths from past Web log data. We show that the integrated architecture improves performance over Web caching alone, and present our analysis of the tradeoff between the reduced latency and the potential increase in network load.

6.
High-speed networks and rapidly improving microprocessor performance make the network of workstations an extremely important tool for parallel computing, used to speed up the execution of scientific applications. Shared memory is an attractive programming model for designing parallel and distributed applications, where the programmer can focus on algorithmic development rather than data partition and communication. Based on this important characteristic, systems that provide the shared memory abstraction on physically distributed memory machines have been developed, known as Distributed Shared Memory (DSM). DSM is built using specific software to combine a number of computer hardware resources into one computing environment. Such an environment not only provides an easy way to execute parallel applications, but also combines available computational resources with the purpose of speeding up execution of these applications. DSM systems need to maintain data consistency in memory, which usually leads to communication overhead. A number of strategies can therefore be used to overcome this overhead and improve overall performance. Strategies such as prefetching have been shown to perform well in DSM systems, since they can reduce the communication latency of data accesses to remote nodes. On the other hand, these strategies also transfer unnecessary prefetched pages to remote nodes. In this research paper, we focus on the access pattern during execution of a parallel application, and then analyze the data type and behavior of parallel applications. We propose an adaptive data classification scheme to improve the prefetching strategy, with the goal of improving overall performance. The adaptive data classification scheme classifies data according to the access sequence of pages, so that the home node uses the past access patterns of remote nodes to decide whether it needs to transfer related pages to remote nodes. 
The experimental results show that our proposed method makes the prefetching strategy more accurate, reducing the number of page faults and mis-prefetches. Experimental results using our proposed classification scheme show a performance improvement of about 9–25% over the same benchmark applications running on top of an original JIAJIA DSM system.
Kuan-Ching Li (corresponding author)

7.
An SPN-Based Integrated Model for Web Prefetching and Caching   Total citations: 17, self-citations: 0, citations by others: 17
The World Wide Web has become the primary means of information dissemination. Because network bandwidth is limited, users often suffer long waiting times. Web prefetching and Web caching are the primary approaches to reducing the user-perceived access latency and improving the quality of service. In this paper, a Stochastic Petri Nets (SPN) based integrated Web prefetching and caching model (IWPCM) is presented and its performance is evaluated. The performance metrics, namely access latency, throughput, HR (hit ratio) and BHR (byte hit ratio), are analyzed and discussed. Simulations show that, compared with the caching-only model (CM), IWPCM further improves the throughput, HR and BHR and reduces the access latency. The performance evaluation based on the SPN model can provide a basis for implementing Web prefetching and caching, and the combination of the two holds the promise of improving the QoS of Web systems.
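As a rough illustration of the two cache metrics named in the abstract above, the sketch below computes the hit ratio (HR) and byte hit ratio (BHR) from a request trace. The trace format, the function name `hit_ratios`, and the sample values are assumptions made for illustration; they do not come from the paper.

```python
def hit_ratios(trace):
    """Return (HR, BHR) for a trace of (was_hit, size_in_bytes) pairs.

    HR  = requests served from cache / total requests
    BHR = bytes served from cache    / total bytes requested
    """
    total_requests = len(trace)
    total_bytes = sum(size for _, size in trace)
    if total_requests == 0 or total_bytes == 0:
        return 0.0, 0.0
    hits = sum(1 for was_hit, _ in trace if was_hit)
    hit_bytes = sum(size for was_hit, size in trace if was_hit)
    return hits / total_requests, hit_bytes / total_bytes


# Hypothetical trace: three small hits and one large miss.
sample_trace = [(True, 100), (True, 100), (True, 200), (False, 1600)]
hr, bhr = hit_ratios(sample_trace)
```

With this trace the two metrics diverge (three of four requests hit, but most bytes miss), which is why such evaluations usually report both.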

8.
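Data Prefetching Supporting Real-Time Transaction Processing in Mobile Environments   Total citations: 5, self-citations: 0, citations by others: 5
With the rapid development of mobile communication technology, a new application requirement has emerged: processing real-time transactions in mobile environments. Limited mobile bandwidth causes long data access delays, and network disconnections can even leave a transaction without the data it needs; data prefetching addresses this problem well. Existing prefetching schemes for mobile environments do not consider data popularity or the timing characteristics of transactions. This paper analyzes the factors that affect data prefetching for real-time transactions. It first considers factors such as data volatility and activity to obtain a set of high-value prefetch candidates. It then considers factors such as the priority of the transactions that access the prefetched data and data popularity to construct a selection function, which picks from that set the data objects most valuable for meeting real-time transaction deadlines. Experiments show that this prefetching strategy lowers the ratio of mobile real-time transactions that miss their deadlines and better supports mobile real-time transaction processing.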

9.
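Analysis of Web Prefetching Models   Total citations: 1, self-citations: 0, citations by others: 1
The rapid growth of the WWW has led to network congestion and server overload. Caching is regarded as one effective way to reduce server load, relieve network congestion, and lower client access latency, but its benefit is limited. Prefetching was introduced to improve WWW performance further. This paper first introduces the basic idea of Web prefetching and the feasibility of research on it, then analyzes existing Web prefetching models, and finally lists the key properties that a Web prefetching model should have.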

10.
In this paper we propose and evaluate a new data-prefetching technique for cache coherent multiprocessors. Prefetches are issued by a functional unit called a prefetch engine which is controlled by the compiler. We let second-level cache misses generate cache miss traps and start the prefetch engine in a trap handler. The trap handler is fast (40–50 cycles) and does not normally delay the program beyond the memory latency of the miss. Once started, the prefetch engine executes on its own and causes no instruction overhead. The only instruction overhead in our approach is when a trap handler completes after data arrives. The advantages of this technique are (1) it exploits static compiler analysis to determine what to prefetch, which is hard to do in hardware, (2) it uses prefetching with very little instruction overhead, which is a limitation for traditional software-controlled prefetching, and (3) it is accurate in the sense that it generates very little useless traffic while maintaining a high prefetching coverage. We also study whether one could emulate the prefetch engine in software, which would not require any additional hardware beyond support for generating cache miss traps and ordinary prefetch instructions. In this paper we present the functionality of the prefetch engine and a compiler algorithm to control it. We evaluate our technique on six parallel scientific and engineering applications using an optimizing compiler with our algorithm and a simulated multiprocessor. We find that the prefetch engine removes up to 67% of the memory access stall time at an instruction overhead less than 0.42%. The emulated prefetch engine removes in general less stall time at a higher instruction overhead.

11.
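Caching and prefetching play an important role in improving Web access performance in wireless environments. This paper studies Web caching and prefetching mechanisms for wireless LANs. Drawing on data mining and information theory respectively, it proposes prediction algorithms that use sequence mining and delayed updating, and it designs a context-aware prefetching algorithm and a benefit-driven cache replacement mechanism. These algorithms have been implemented in the Web caching system OnceEasyCache. Performance evaluation results show that integrating them effectively improves the cache hit ratio and the latency saving ratio.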

12.
Predictive Prefetching on the Web and Its Potential Impact in the Wide Area   Total citations: 2, self-citations: 0, citations by others: 2
The rapid increase in World Wide Web users and the development of services with high bandwidth requirements have caused a substantial increase in response times for users on the Internet. Web latency would be significantly reduced if browser, proxy or Web server software could make predictions about the pages that a user is most likely to request next, while the user is viewing the current page, and prefetch their content. In this paper we study predictive prefetching on a totally new Web system architecture. This is a system that provides two levels of caching before information reaches the clients. This work analyses prefetching on a Wide Area Network with the above-mentioned characteristics. We first provide a structured overview of predictive prefetching and show its wide applicability to various computer systems. The WAN that we refer to is the GRNET academic network in Greece. We rely on log files collected at the network's Transparent cache (primary caching point), located at GRNET's edge connection to the Internet. We present the parameters that are most important for prefetching on GRNET's architecture and provide preliminary results of an experimental study, quantifying the benefits of prefetching on the WAN. Our experimental study includes the evaluation of two prediction algorithms: an n most popular document algorithm and a variation of the PPM (Prediction by Partial Matching) prediction algorithm. Our analysis clearly shows that predictive prefetching can improve Web response times inside the GRNET WAN without a substantial increase in network traffic due to prefetching.
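The "n most popular document" idea mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming the access log is simply a list of requested URLs; the function names and sample log are hypothetical and are not taken from the GRNET study.

```python
from collections import Counter


def top_n_popular(log, n):
    """Return the n most frequently requested URLs in an access log."""
    return [url for url, _ in Counter(log).most_common(n)]


def prefetch_candidates(log, cached, n):
    """Popular URLs that are not already in the cache, most popular first."""
    return [url for url in top_n_popular(log, n) if url not in cached]


# Hypothetical log: "/a" requested three times, "/b" twice, "/c" once.
access_log = ["/a", "/b", "/a", "/c", "/a", "/b"]
candidates = prefetch_candidates(access_log, cached={"/a"}, n=2)
```

Here only `/b` would be prefetched, because `/a` is already cached and `/c` falls outside the top two.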

13.
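A Data Prefetching Technique Based on the Subjective Bayesian Method   Total citations: 1, self-citations: 0, citations by others: 1
With the rapid development of information technology, networks grow exponentially with the number of users and become ever larger. Caching and prefetching are the main techniques used to reduce network latency and thus speed up user access. This paper proposes an intelligent prefetching scheme. The scheme uses fuzzy matching to compute the probability that a user will access a page and the subjective Bayesian method to compute the importance of a page, while controlling the amount and timing of prefetching to avoid degrading network performance.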

14.
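An Intelligent Prefetching Algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Network latency is one of the main issues for user QoS; it depends on many factors, such as network bandwidth, transmission delay, queuing delay, and the processing speed of clients and servers. Caching and prefetching are currently the main techniques for reducing network latency, but the hit ratio gain that caching brings to a caching proxy server is limited. This paper systematically describes the basic ideas of current prefetching algorithms and divides them into four classes: popularity-based, interaction-based, access-probability-based, and data-mining-based. After analyzing and comparing them, it proposes an intelligent prefetching scheme. The scheme uses fuzzy matching to compute the probability that a user will access a page, while controlling the amount and timing of prefetching to avoid degrading network performance.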

15.
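Research on Web Prefetching Techniques   Total citations: 1, self-citations: 0, citations by others: 1
Prefetching is a main approach to raising the cache hit ratio and reducing Web access latency. This paper studies Web page prefetching, applies data mining to Web prefetching, and designs a Web prefetching model that provides personalized service to users. It describes in detail a method for preprocessing Web logs and proposes a new prefetch replacement algorithm.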

16.
Users of a Web site usually perform their interest-oriented actions by clicking or visiting Web pages, which are traced in access log files. Clustering Web user access patterns may capture common user interests to a Web site, and in turn, build user profiles for advanced Web applications, such as Web caching and prefetching. The conventional Web usage mining techniques for clustering Web user sessions can discover usage patterns directly, but cannot identify the latent factors or hidden relationships among users' navigational behaviour. In this paper, we propose an approach based on a vector space model, called Random Indexing, to discover such intrinsic characteristics of Web users' activities. The underlying factors are then utilised for clustering individual user navigational patterns and creating common user profiles. The clustering results will be used to predict and prefetch Web requests for grouped users. We demonstrate the usability and superiority of the proposed Web user clustering approach through experiments on a real Web log file. The clustering and prefetching tasks are evaluated by comparison with previous studies demonstrating better clustering performance and higher prefetching accuracy.

17.
Web resources on the World Wide Web are growing, to a large extent because of the services and applications it provides. Because web traffic is large, gaining access to these resources incurs user-perceived latency. Although the latency can never be avoided, it can be reduced to a large extent. Web prefetching is a technique that anticipates the user's future requests and fetches them into the cache before an explicit request is made. Because web objects are of various types, a new algorithm is proposed that concentrates on prefetching embedded objects, including audio and video files. Further, clustering is employed using adaptive resonance theory 2 (ART2) in order to prefetch embedded objects as clusters. For comparative study, the web objects are clustered using ART2, ART1, and other statistical techniques. The clustering results confirm the superiority of ART2, and prefetching web objects in clusters is observed to produce a high hit rate.

18.
Both hardware and software prefetching have been shown to be effective in tolerating the large memory latencies inherent in shared-memory multiprocessors; however, both types of prefetching have their shortcomings. While software schemes require less hardware support than hardware schemes, they must generate address calculation instructions and a prefetch instruction for each datum that needs to be prefetched. Hardware schemes, however, must become progressively more complex to be able to compute data access strides and to increase the prefetching lookahead. In this paper, we propose an integrated hardware/software prefetching method that uses simple hardware that can handle most data accesses and software prefetching for the few remaining accesses. A compile time algorithm analyzes the access streams formed by array references and determines sequences of consecutive memory accesses to an access stream that can be prefetched by the hardware mechanism. This analysis is based on the relative memory locations of consecutive accesses to an access stream and the number of intervening data references between consecutive accesses to an access stream. In addition, the prefetching lookahead can be set separately for each access stream. Our approach yields an effective scheme that minimizes both CPU overhead and hardware costs. Execution-driven simulations show our method to be very effective.

19.
Adaptive leases: a strong consistency mechanism for the World Wide Web   Total citations: 2, self-citations: 0, citations by others: 2
We argue that weak cache consistency mechanisms supported by existing Web proxy caches must be augmented by strong consistency mechanisms to support the growing diversity in application requirements. Existing strong consistency mechanisms are not appealing for Web environments due to their large state space or control message overhead. We focus on the lease approach that balances these trade-offs and present analytical models and policies for determining the optimal lease duration. We present extensions to the HTTP protocol to incorporate leases and then implement our techniques in the Squid proxy cache and the Apache Web server. Our experimental evaluation of the leases approach shows that: 1) our techniques impose modest overheads even for long leases (a lease duration of 1 hour requires state to be maintained for 1030 leases and imposes a per-object overhead of a control message every 33 minutes), 2) leases yield a 138-425 percent improvement over existing strong consistency mechanisms, and 3) the implementation overhead of leases is comparable to existing weak consistency mechanisms.
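To make the lease idea in the abstract above concrete, here is a minimal sketch, assuming a cache that records for each object when its lease was granted and how long it lasts. The class name `LeasedObject` and its fields are hypothetical; the sketch does not reproduce the paper's HTTP extensions or its policies for choosing the lease duration.

```python
from dataclasses import dataclass


@dataclass
class LeasedObject:
    url: str
    granted_at: float  # time the server granted the lease, in seconds
    duration: float    # lease length, in seconds

    def lease_valid(self, now: float) -> bool:
        # While the lease is valid, the server has promised to notify the
        # cache of updates, so the cached copy can be served as-is.
        return now < self.granted_at + self.duration

    def must_renew(self, now: float) -> bool:
        # After expiry the cache has to contact the server again before
        # it can serve the object with strong consistency.
        return not self.lease_valid(now)


# Hypothetical one-hour lease granted at time 0.
obj = LeasedObject("/index.html", granted_at=0.0, duration=3600.0)
```

A longer `duration` means fewer renewals by the cache but more lease state held by the server, which is the trade-off the abstract describes.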

20.
A data mining algorithm for generalized Web prefetching   Total citations: 8, self-citations: 0, citations by others: 8
Predictive Web prefetching refers to the mechanism of deducing the forthcoming page accesses of a client based on its past accesses. In this paper, we present a new context for the interpretation of Web prefetching algorithms as Markov predictors. We identify the factors that affect the performance of Web prefetching algorithms. We propose a new algorithm called WM_o, which is based on data mining and is proven to be a generalization of existing ones. It was designed to address their specific limitations and its characteristics include all the above factors. It compares favorably with previously proposed algorithms. Further, the algorithm efficiently addresses the increased number of candidates. We present a detailed performance evaluation of WM_o with synthetic and real data. The experimental results show that WM_o can provide significant improvements over previously proposed Web prefetching algorithms.
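The view of prefetching algorithms as Markov predictors can be illustrated with a first-order predictor. The sketch below is not the algorithm proposed in the paper; it is a plain first-order Markov model, and the function names, the session format (lists of page identifiers) and the confidence threshold are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_first_order(sessions):
    """Count, for every page, how often each other page directly follows it."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, page, threshold=0.5):
    """Return pages whose estimated transition probability from `page`
    is at least `threshold`, most frequent first."""
    followers = counts.get(page)
    if not followers:
        return []
    total = sum(followers.values())
    return [p for p, c in followers.most_common() if c / total >= threshold]


# Hypothetical user sessions.
sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "C"]]
model = train_first_order(sessions)
```

In these sessions "B" follows "A" in two of three cases, so `predict_next(model, "A")` selects "B" as a prefetch candidate. Raising the threshold trades coverage for fewer useless prefetches.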
