首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
序列模式挖掘能够发现隐含在Web日志中的用户的访问规律,可以被用来在Web预取模型中预测即将访问的Web对象。目前大多数序列模式挖掘是基于Apriori的宽度优先算法。提出了基于位图深度优先挖掘算法,采用基于字典树数据结构的深度优先策略,同时采用位图保存和计算各序列的支持度,能够较迅速地挖掘出频繁序列。将该序列模式挖掘算法应用于Web预取模型中,在预取缓存一体化的条件下实验表明具有较好的性能。  相似文献   

2.
缓存和预取在提高无线环境下的Web访问性能方面发挥着重要作用。文章研究针对无线局域网的Web缓存和预取机制,分别基于数据挖掘和信息论提出了采用序列挖掘和延迟更新的预测算法,设计了上下文感知的预取算法和获益驱动的缓存替换机制,上述算法已在Web缓存系统OnceEasyCache中实现。性能评估实验结果表明,上述算法的集成能有效地提高缓存命中率和延迟节省率。  相似文献   

3.
Integrating Web caching and Web prefetching in client-side proxies   总被引:2,自引:0,他引:2  
Web caching and Web prefetching are two important techniques used to reduce the noticeable response time perceived by users. Note that by integrating Web caching and Web prefetching, these two techniques can complement each other since the Web caching technique exploits the temporal locality, whereas Web prefetching technique utilizes the spatial locality of Web objects. However, without circumspect design, the integration of these two techniques might cause significant performance degradation to each other. In view of this, we propose in this paper an innovative cache replacement algorithm, which not only considers the caching effect in the Web environment, but also evaluates the prefetching rules provided by various prefetching schemes. Specifically, we formulate a normalized profit function to evaluate the profit from caching an object (i.e., either a nonimplied object or an implied object according to some prefetching rule). Based on the normalized profit function devised, we devise an innovative Web cache replacement algorithm, referred to as Algorithm IWCP (standing for the Integration of Web Caching and Prefetching). Using an event-driven simulation, we evaluate the performance of Algorithm IWCP under several circumstances. The experimental results show that Algorithm IWCP consistently outperforms the companion schemes in various performance metrics.  相似文献   

4.
Integrating Web Prefetching and Caching Using Prediction Models   总被引:2,自引:0,他引:2  
Yang  Qiang  Zhang  Henry Hanning 《World Wide Web》2001,4(4):299-321
Web caching and prefetching have been studied in the past separately. In this paper, we present an integrated architecture for Web object caching and prefetching. Our goal is to design a prefetching system that can work with an existing Web caching system in a seamless manner. In this integrated architecture, a certain amount of caching space is reserved for prefetching. To empower the prefetching engine, a Web-object prediction model is built by mining the frequent paths from past Web log data. We show that the integrated architecture improves the performance over Web caching alone, and present our analysis on the tradeoff between the reduced latency and the potential increase in network load.  相似文献   

5.
An SPN-Based Integrated Model for Web Prefetching and Caching   总被引:17,自引:0,他引:17       下载免费PDF全文
The World Wide Web has become the primary means for information dissemination. Due to the limited resources of the network bandwidth, users always suffer from long time waiting. Web prefetching and web caching are the primary approaches to reducing the user perceived access latency and improving the quality of services. In this paper, a Stochastic Petri Nets (SPN) based integrated web prefetching and caching model (IWPCM) is presented and the performance evaluation of IWPCM is made. The performance metrics, access latency, throughput, HR (hit ratio) and BHR (byte hit ratio) are analyzed and discussed. Simulations show that compared with caching only model (CM), IWPCM can further improve the throughput, HR and BHR efficiently and reduce the access latency. The performance evaluation based on the SPN model can provide a basis for implementation of web prefetching and caching and the combination of web prefetching and caching holds the promise of improving the QoS of web systems.  相似文献   

6.
Web预取技术综述   总被引:11,自引:0,他引:11  
Web预取是减少用户访问延时、提高网络服务质量的关键技术之一,近年来已成为国内外的研究热点.通过利用WWW访问的空间局部性,Web预取使缓存机制从时间局部性向空间局部性扩展.归纳了Web预取技术的分类,概括和比较了不同类别的优势和局限性,给出了预取模型的基本框架及每部分的主要功能,并对各种评价标准进行了详细介绍.同时,深入分析和探讨了现有的几种典型预取算法,系统地比较了这些算法的优缺点.最后从在线性、协作预取、动态流行度、划分用户会话和基于语义与基于路径相结合等方面指出了Web预取技术的研究方向.  相似文献   

7.
A Data Cube Model for Prediction-Based Web Prefetching   总被引:7,自引:0,他引:7  
Reducing the web latency is one of the primary concerns of Internet research. Web caching and web prefetching are two effective techniques to latency reduction. A primary method for intelligent prefetching is to rank potential web documents based on prediction models that are trained on the past web server and proxy server log data, and to prefetch the highly ranked objects. For this method to work well, the prediction model must be updated constantly, and different queries must be answered efficiently. In this paper we present a data-cube model to represent Web access sessions for data mining for supporting the prediction model construction. The cube model organizes session data into three dimensions. With the data cube in place, we apply efficient data mining algorithms for clustering and correlation analysis. As a result of the analysis, the web page clusters can then be used to guide the prefetching system. In this paper, we propose an integrated web-caching and web-prefetching model, where the issues of prefetching aggressiveness, replacement policy and increased network traffic are addressed together in an integrated framework. The core of our integrated solution is a prediction model based on statistical correlation between web objects. This model can be frequently updated by querying the data cube of web server logs. This integrated data cube and prediction based prefetching framework represents a first such effort in our knowledge.  相似文献   

8.
一种智能的预取算法   总被引:1,自引:0,他引:1  
网络延迟问题是用户QoS的主要问题之一,它依赖诸多因素如网络带宽、传输延迟、排队延迟和客户机及服务器的处理速度。目前主要采用缓存和预取技术来减少网络延迟,但缓存技术所提高的缓存代理服务器的命中率是有限的。该文系统地阐述了目前预取算法的基本思想并把它们分成四类:基于流行度、基于交互、基于访问概率和基于数据挖掘的预取算法。在对它们进行分析比较的基础上,提出了一种智能的预取方案。该方案使用模糊匹配来计算用户对页面的访问概率,同时要控制预取的量和预取的时刻,以避免对网络的性能产生负面影响。  相似文献   

9.
根据Web缓存流量访问特征建立数学模型,设计实现了Web缓存流量特征模拟生成器(WebSimGen)。利用两层代理缓存结构、基于ADF(Aggregation、Disaggregation和Filtering)模型对Web缓存流量的访问特征和性能进行测试,实验表明模拟日志具有和真实日志类似的访问特性。Web生成器具有较大的灵活性,能够克服真实日志的一些缺点,为进一步提高Web缓存性能和预取技术提供了重要依据。  相似文献   

10.
Web访问特征模型建模是进行有效Web缓存管理的基础。该文根据Web访问的四个典型特征建立综合的数学模型,实现了Web访问特征建模生成器(WebGenM),实验表明模拟器能较好地模拟网络访问流的特征,而且易于使用,具有较大的灵活性,其为进一步的Web缓存和预取技术的研究提供了重要依据。  相似文献   

11.
Web缓存命中率与字节命中率关系   总被引:2,自引:0,他引:2       下载免费PDF全文
在研究Web缓存性能时,一般考虑2个评价指标:命中率HR和字节命中率BHR。目前大多侧重于2个指标之一,或仅通过测试2个指标的数值来评价缓存替换算法优劣,没有从2个指标关系的角度来评价缓存替换算法的性能。该文讨论了Web缓存系统中命中率与字节命中率之间的关系,提出了一种Web缓存性能评价指标——命中比(FBR),讨论了该指标在Web缓存替换算法及Web预取性能评价中的应用,为度量缓存系统的性能提供了参考依据。  相似文献   

12.
Web对象访问特征模拟器的设计与实现   总被引:2,自引:0,他引:2  
石磊  陶永才 《计算机仿真》2006,23(1):133-136
Web缓存是一个提高Web性能非常有效的方法,它可以位于网络的不同位置:客户端,代理服务器端,服务器端。研究表明Web缓存命中率可以达到30%-50%。Web缓存在应用中最大的问题就是Web缓存管理,研究Web访问特征是有效进行Web缓存管理的基础。Web日志生成模拟器对于研究Web缓存系统有很大地帮助,目前有两种方法模拟生成Web访问日志:日志驱动方法,数学模拟方法。日志驱动方法利用对历史日志进行变换来模拟生成新的日志,数学模拟方法在充分研究Ⅵ协对象访问特征的基础上,通过建立数学模型来模拟生成Web日志。该文通过分析Web对象访问特征,采用数学模拟方法分别模拟了Web对象高频区及低频区流行度特征,Web对象大小重尾分布特征,Web访问的时间局部性特征;设计并实现了一个Web日志模拟生成器WEBSIM。该模拟器不仅可以模拟生成Web对象访问日志,而且具有较大的灵活性,为进一步研究Web缓存技术和预取技术提供依据。  相似文献   

13.
Web caching has been proposed as an effective solution to the problems of network traffic and congestion, Web objects access and Web load balancing. This paper presents a model for optimizing Web cache content by applying either a genetic algorithm or an evolutionary programming scheme for Web cache content replacement. Three policies are proposed for each of the genetic algorithm and the evolutionary programming techniques, in relation to objects staleness factors and retrieval rates. A simulation model is developed and long term trace-driven simulation is used to experiment on the proposed techniques. The results indicate that all evolutionary techniques are beneficial to the cache replacement, compared to the conventional replacement applied in most Web cache server. Under an appropriate objective function the genetic algorithm has been proven to be the best of all approaches with respect to cache hit and byte hit ratios.  相似文献   

14.
Users of a Web site usually perform their interest-oriented actions by clicking or visiting Web pages, which are traced in access log files. Clustering Web user access patterns may capture common user interests to a Web site, and in turn, build user profiles for advanced Web applications, such as Web caching and prefetching. The conventional Web usage mining techniques for clustering Web user sessions can discover usage patterns directly, but cannot identify the latent factors or hidden relationships among users?? navigational behaviour. In this paper, we propose an approach based on a vector space model, called Random Indexing, to discover such intrinsic characteristics of Web users?? activities. The underlying factors are then utilised for clustering individual user navigational patterns and creating common user profiles. The clustering results will be used to predict and prefetch Web requests for grouped users. We demonstrate the usability and superiority of the proposed Web user clustering approach through experiments on a real Web log file. The clustering and prefetching tasks are evaluated by comparison with previous studies demonstrating better clustering performance and higher prefetching accuracy.  相似文献   

15.
Web预取模型分析   总被引:1,自引:0,他引:1  
WWW的快速增长导致网络拥塞和服务器超载。缓存技术被认为是减轻服务器负载、减少网络拥塞、降低客户访问延迟的有效途径之一,但作用有限。为进一步提高WWW性能,引入了预取技术。文中首先介绍了Web预取技术的基本思想及其研究可行性,然后分析了现有Web预取模型,最后给出了一个Web预取模型应具有的关键属性。  相似文献   

16.
针对搜索引擎查询结果缓存与预取问题,该文提出了一种基于查询特性的搜索引擎查询结果缓存与预取方法,该方法包括用来指导预取的查询结果页码预测模型和缓存与预取算法框架,用于提高搜索引擎系统性能。通过对国内某著名中文商业搜索引擎的某段时间的用户查询日志分析得出,用户对不同查询返回的查询结果所浏览的页数具有显著的非均衡性,结合该特性设计查询结果页码预测模型来进行预取和分区缓存。在该搜索引擎两个月的大规模真实用户查询日志上的实验结果表明,与传统的方法相比,该方法可以获得3.5%~8.45%的缓存命中率提升。  相似文献   

17.
查询结果缓存可以对查询结果的文档标识符集合或者实际的返回页面进行缓存,以提高用户查询的响应速度,相应的缓存形式可以分别称之为标识符缓存或页面缓存。对于固定大小的内存,标识符缓存可以获得更高的命中率,而页面缓存可以达到更高的响应速度。该文根据用户查询访问的时间局部性和空间局部性,提出了一种新颖的基于时空局部性的层次化结果缓存机制。首先,该机制将固定大小的结果缓存划分为两层:页面缓存和标识符缓存。对于用户提交的查询,该机制会首先使用第一层的页面缓存进行应答,如果未能命中,则继续尝试使用第二层的标识符缓存。实验显示这种层次化的缓存机制较传统的仅依赖于单一缓存形式的机制,在平均查询响应时间上,取得了可观的性能提升:例如,相对单纯的页面缓存,平均达到9%,最好情况下达到11%。其次,该机制在标识符缓存的基础上,设计了一种启发式的预取策略,对用户查询检索的空间局部性进行挖掘。实验显示,这种预取策略的融合,能进一步促进检索系统性能的有效提升,从而最终建立起一套时空完备的、有效的结果缓存机制。  相似文献   

18.
现有的Web缓存器的实现主要是基于传统的内存缓存算法,由于Web业务请求的异质性,传统的替换算法不能在Web环境中有效工作。研究了Web缓存替换操作的依据,分析了以往替换算法的不足,考虑到Web文档的大小、访问代价、访问频率、访问兴趣度以及最近一次被访问的时间对缓存替换的影响,提出了Web缓存对象角色的概念,建立了一种新的基于对象角色的高精度Web缓存替换算法(ORB算法);并以NASA和DEC的代理服务器数据为例,将该算法与LRU、LFU、SIZE、Hybrid算法进行了仿真实验对比,结果证明,ORB算  相似文献   

19.
Vakali  Athena 《World Wide Web》2001,4(4):277-297
Accesing and circulation of Web objects has been facilitated by the design and implementation of effective caching schemes. Web caching has been integrated in prototype and commercial Web-based information systems in order to reduce the overall bandwidth and increase system's fault tolerance. This paper presents an overview of a series of Web cache replacement algorithms based on the idea of preserving a history record for cached Web objects. The number of references to Web objects over a certain time period is a critical parameter for the cache content replacement. The proposed algorithms are simulated and experimented under a real workload of Web cache traces provided by a major (Squid) proxy cache server installation. Cache and bytes hit rates are given with respect to different cache sizes and a varying number of request workload sets and it is shown that the proposed cache replacement algorithms improve both cache and byte hit rates.  相似文献   

20.
This paper presents a Page rank-based prefetching technique for accesses to Web page clusters. The approach uses the link structure of a requested page to determine the “most important” linked pages and to identify the page(s) to be prefetched. The underlying premise of our approach is that in the case of cluster accesses, the next pages requested by users of the Web server are typically based on the current and previous pages requested. Furthermore, if the requested pages have a lot of links to some “important” page, that page has a higher probability of being the next one requested. An experimental evaluation of the prefetching mechanism is presented using real server logs. The results show that the Page rank-based scheme does better than random prefetching for clustered accesses, with hit rates of 90% in some cases.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号