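Caching and prefetching play an important role in improving Web access performance in wireless environments. This paper studies Web caching and prefetching mechanisms for wireless LANs. Drawing on data mining and information theory respectively, it proposes prediction algorithms that use sequence mining and delayed updating, and it designs a context-aware prefetching algorithm and a benefit-driven cache replacement mechanism. These algorithms have been implemented in the Web caching system OnceEasyCache. Performance evaluation experiments show that integrating them effectively improves the cache hit ratio and the latency saving ratio.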

刘金, 胡创, 胡明, 龚奕利. 《计算机应用》 (Journal of Computer Applications), 2012, 32(6): 1713-1716
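To address prefetch misjudgment by the current Linux kernel prefetching algorithm under multithreading, an algorithm based on multiple prefetch points is proposed, informed by how processes access disk files in a multithreaded environment. Building on the kernel's original prefetching algorithm and taking into account applications' data access patterns under multithreading, it is implemented in the page cache layer of the Linux kernel. Experiments and analysis show that in the IOzone single-thread tests the algorithm performs on par with the original Linux kernel prefetching algorithm, while in the multithread tests reading files of the same size takes at least one third less time than with the original algorithm. The new algorithm helps raise I/O parallelism and thereby the parallelism of the whole computer system.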
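
To make the misjudgment concrete, here is a minimal sketch of why a single prefetch point fails on interleaved sequential readers while several points do not. It is not the Linux readahead code and not the paper's algorithm: the function `sequential_hits`, its eviction of the oldest point, and the page-number trace are all assumptions made for illustration.

```python
def sequential_hits(accesses, max_points):
    """Count accesses recognized as sequential when tracking up to
    max_points prefetch points (hypothetical model, not kernel code)."""
    points = []  # page number expected next for each tracked stream, oldest first
    hits = 0
    for page in accesses:
        if page in points:
            # The access continues a tracked stream: a detector would prefetch here.
            hits += 1
            points.remove(page)
        elif len(points) >= max_points:
            # No room for a new stream: forget the oldest prefetch point.
            points.pop(0)
        points.append(page + 1)  # this stream now expects the following page
    return hits


# Two threads reading sequentially from pages 0 and 100, interleaved.
trace = [0, 100, 1, 101, 2, 102]
single_point = sequential_hits(trace, max_points=1)
multi_point = sequential_hits(trace, max_points=2)
```

With one point, each access overwrites the other thread's state, so no access in this trace is recognized as sequential; with two points, every access after each thread's first one is.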

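Building on existing proxy caching strategies and transmission schemes, and targeting current network conditions, the paper proposes two things. The first is an adaptive segmentation method, which remedies the inability of existing methods to adjust to changes in the popularity of streaming media objects and to uncertainty in user access patterns. The second is an optimized transmission scheme that combines unicast with multicast and active prefetching with patch transmission. Together they achieve clear gains in shortening startup delay, raising the byte hit ratio, and saving backbone bandwidth.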

We present an improved online algorithm for coloring interval graphs with bandwidth. This problem has recently been studied by Adamy and Erlebach and a 195-competitive online strategy has been presented. We improve this by presenting a 10-competitive strategy. To achieve this result, we use variants of an optimal online coloring algorithm due to Kierstead and Trotter.

The disk dimension of a planar graph G is the least number k for which G embeds in the plane minus k open disks, with every vertex on the boundary of some disk. Useful properties of graphs with a given disk dimension are derived, leading to an algorithm to obtain an outerplanar subgraph of a graph with disk dimension k by removing at most 2k−2 vertices. This reduction is used to obtain linear-time exact and approximation algorithms on graphs with fixed disk dimension. In particular, a linear-time approximation algorithm is presented for the pathwidth problem.

Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration between the tertiary, secondary, and primary storage in that it reconciles speculative prefetching, to mask the high latency of the tertiary storage, with the replacement policy of the document caches at the secondary and primary storage level, and also considers the interaction of these policies with the tertiary and secondary storage request scheduling. The integrated migration policy is based on a continuous-time Markov chain model for predicting the expected number of accesses to a document within a specified time horizon. Prefetching is initiated only if that expectation is higher than those of the documents that need to be dropped from secondary storage to free up the necessary space. In addition, the possible resource contention at the tertiary and secondary storage is taken into account by dynamically assessing the response-time benefit of prefetching a document versus the penalty that it would incur on the response time of the pending document requests. The parameters of the continuous-time Markov chain model, the probabilities of co-accessing certain documents and the interaction times between successive accesses, are dynamically estimated and adjusted to evolving workload patterns by keeping online statistics. The integrated policy for vertical data migration has been implemented in a prototype system. The system makes profitable use of the Markov chain model also for the scheduling of volume exchanges in the tertiary storage library. Detailed simulation experiments with Web-server-like synthetic workloads indicate significant gains in terms of client response time. The experiments also show that the overhead of the statistical bookkeeping and the computations for the access predictions is affordable. Received January 1, 1998 / Accepted May 27, 1998
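
The admission rule in this abstract (prefetch only if the candidate's expected accesses exceed those of every document that would have to be dropped) can be sketched as below. This is an illustration under assumptions, not the paper's implementation: the expected access counts are taken as given numbers rather than derived from the continuous-time Markov chain, and the function name `plan_prefetch`, the tuple layout, and the lowest-expectation-first victim order are hypothetical.

```python
def plan_prefetch(candidate, cache, free_space):
    """Decide whether to prefetch `candidate` into secondary storage.

    candidate:  (size, expected_accesses) of the document to prefetch
    cache:      dict doc_id -> (size, expected_accesses) of resident documents
    free_space: currently unused capacity, in the same unit as size
    Returns (prefetch?, list of doc_ids to drop). Hypothetical sketch.
    """
    size, expectation = candidate
    victims = []
    # Consider resident documents from lowest to highest expected accesses.
    for doc_id, (v_size, v_expectation) in sorted(
        cache.items(), key=lambda item: item[1][1]
    ):
        if free_space >= size:
            break
        if v_expectation >= expectation:
            # Making room would drop a document at least as valuable: give up.
            return False, []
        victims.append(doc_id)
        free_space += v_size
    if free_space >= size:
        return True, victims
    return False, []


resident = {"a": (2, 0.1), "b": (2, 0.9)}
decision = plan_prefetch((2, 0.5), resident, free_space=0)
```

The sketch ignores the resource-contention assessment that the abstract also describes; it covers only the comparison of expected access counts.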

Sequential prefetching schemes are widely employed in storage servers to mask disk latency and improve system throughput. However, existing schemes cannot benefit parallel disk systems as expected due to the fact that they ignore the distinct internal characteristics of the parallel disk system, in particular, data striping. Moreover, their aggressive prefetching pattern suffers from premature evictions and prolonged request latencies. In this paper, we propose a strip-oriented asynchronous prefetching (SoAP) technique, which is dedicated to the parallel disk system. It settles the above-mentioned problems by providing multiple novel features, e.g., enhanced prediction accuracy, adaptive prefetching strength, physical data layout awareness, and timely prefetching. To validate SoAP, we implement a prototype by modifying the software redundant arrays of inexpensive disks (RAID) under Linux. Experimental results demonstrate that SoAP can consistently offer improved average response time and throughput to the parallel disk system under non-random workloads compared with STEP, SP, ASP, and Linux-like SEQPs.

We prove upper and lower bounds on the competitiveness of randomized algorithms for the list update problem of Sleator and Tarjan. We give a simple and elegant randomized algorithm that is more competitive than the best previous randomized algorithm due to Irani. Our algorithm uses randomness only during an initialization phase, and from then on runs completely deterministically. It is the first randomized competitive algorithm with this property to beat the deterministic lower bound. We generalize our approach to a model in which access costs are fixed but update costs are scaled by an arbitrary constant d. We prove lower bounds for deterministic list update algorithms and for randomized algorithms against oblivious and adaptive on-line adversaries. In particular, we show that for this problem adaptive on-line and adaptive off-line adversaries are equally powerful. A preliminary version of these results appeared in a joint paper with S. Irani in the Proceedings of the 2nd Symposium on Discrete Algorithms, 1991 [17]. This research was partially supported by NSF Grants CCR-8808949 and CCR-8958528. This research was partially supported by NSF Grant CCR-9009753. This research was supported in part by the National Science Foundation under Grant CCR-8658139, by DIMACS, a National Science Foundation Science and Technology center, Grant No. NSF-STC88-09648.
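
As an illustration of a list update rule that is random only at initialization, the sketch below follows the scheme commonly known as BIT: each item gets one random bit up front, each access flips the item's bit, and the item moves to the front only when its bit becomes 1. Treating this as the algorithm of the abstract is an assumption, since the abstract does not name it. The function name, the cost model (accessing the i-th item costs i), and the optional `rng` parameter are choices made for this example.

```python
import random


def bit_list_update(items, requests, rng=None):
    """Serve `requests` on a list using the BIT rule; return (total cost, final list)."""
    rng = rng or random.Random()
    lst = list(items)
    # The only random step: one independent bit per item, drawn once.
    bits = {x: rng.randint(0, 1) for x in lst}
    cost = 0
    for x in requests:
        pos = lst.index(x)
        cost += pos + 1            # cost of reaching position pos (1-based)
        bits[x] ^= 1               # flip the item's bit on every access
        if bits[x] == 1:           # bit turned to 1: move the item to the front
            lst.insert(0, lst.pop(pos))
    return cost, lst


total_cost, final_list = bit_list_update("abc", ["c", "c"], random.Random(0))
```

Whatever the initial bits are, two consecutive accesses to the same item always leave it at the front, because exactly one of the two flips sets its bit to 1.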

Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. Parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk speeds. In a previous paper we showed that prefetching and caching have the potential to deliver the performance benefits of parallel file systems to parallel applications. In this paper we describe experiments with practical prefetching policies that base decisions only on on-line reference history, and that can be implemented efficiently. We also test the ability of those policies across a range of architectural parameters.

Batching has been studied extensively in the offline case, but applications such as manufacturing or TCP acknowledgment often require online solutions. We consider online batching problems, where the order of jobs to be batched is fixed and where we seek to minimize the sum of the completion times of the jobs. We present optimally competitive online algorithms for both s-batch and p-batch problems, and we also derive results for certain naturally occurring special cases, such as the case of unit processing times.

In this paper, we consider the following question: what is the worst possible page-replacement strategy? Our goal is to devise an online strategy that has the highest possible fraction of page faults as compared to the worst offline strategy. We show that there is no deterministic, online page-replacement strategy that is competitive with the worst offline strategy. We give a randomized strategy based on the “most-recently-used” heuristic and show that this strategy is the worst possible online page-replacement strategy.
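
To show how evicting the most recently used page can drive up the fault count, here is a small simulation that counts page faults under least-recently-used and most-recently-used eviction. It uses a plain deterministic MRU rule, not the randomized strategy of the paper; the function name `count_faults` and the request trace are made up for the example.

```python
from collections import OrderedDict


def count_faults(requests, capacity, evict_most_recent):
    """Count page faults for a cache of `capacity` pages.

    evict_most_recent=False gives LRU eviction, True gives MRU eviction.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)  # hit: mark as most recently used
            continue
        faults += 1
        if len(cache) >= capacity:
            # last=True pops the most recently used page, last=False the least.
            cache.popitem(last=evict_most_recent)
        cache[page] = True
    return faults


trace = [1, 2, 3, 4, 3, 4, 3, 4]
lru_faults = count_faults(trace, capacity=3, evict_most_recent=False)
mru_faults = count_faults(trace, capacity=3, evict_most_recent=True)
```

On this trace with three page frames, MRU eviction faults on all eight requests, while LRU faults only on the first four.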

We present a new parallel algorithm for computing a maximum cardinality matching in a bipartite graph suitable for distributed memory computers. The presented algorithm is based on the Push-Relabel algorithm which is known to be one of the fastest algorithms for the bipartite matching problem. Previous attempts at developing parallel implementations of it have focused on shared memory computers using only a limited number of processors. We first present a straightforward adaptation of these shared memory algorithms to distributed memory computers. However, this is not a viable approach as it requires too much communication. We then develop our new algorithm by modifying the previous approach through a sequence of steps with the main goal being to reduce the amount of communication and to increase load balance. The first goal is achieved by changing the algorithm so that many push and relabel operations can be performed locally between communication rounds and also by selecting augmenting paths that cross processor boundaries infrequently. To achieve good load balance, we limit the speed at which global relabelings traverse the graph. In several experiments on a large number of instances, we study weak and strong scalability of our algorithm using up to 128 processors. The algorithm can also be used to find ε-approximate matchings quickly.

We consider the following problem of scheduling with conflicts (swc): Find a minimum makespan schedule on identical machines where conflicting jobs cannot be scheduled concurrently. We study the problem when conflicts between jobs are modeled by general graphs. Our first main positive result is an exact algorithm for two machines and job sizes in {1,2}. For job sizes in {1,2,3}, we can obtain a -approximation, which improves on the -approximation that was previously known for this case. Our main negative result is that for job sizes in {1,2,3,4}, the problem is APX-hard. Our second contribution is the initiation of the study of an online model for swc, where we present the first results in this model. Specifically, we prove a lower bound of on the competitive ratio of any deterministic online algorithm for m machines and unit jobs, and an upper bound of 2 when the algorithm is not restricted computationally. For three machines we can show that an efficient greedy algorithm achieves this bound. For two machines we present a more complex algorithm that achieves a competitive ratio of when the number of jobs is known in advance to the algorithm.

Genetic algorithms for approximate similarity queries
Algorithms to query large sets of simple data (composed of numbers and small character strings) are constructed to retrieve the exact answer, retrieving every relevant element, so the answer is said to be exact. Similarity searching over complex data is much more expensive than searching over simple data. Moreover, comparison operations over complex data usually consider features extracted from each element, instead of the elements themselves. Thus, even if an algorithm retrieves an exact answer, it is ‘exact’ regarding the extracted features, not regarding the original elements themselves. Therefore, trading exactness of the answer for query response time can be worthwhile. In this work we developed two search strategies based on genetic algorithms to allow retrieving approximate data indexed by Metric Access Methods (MAM) within a limited, user-defined amount of time. These strategies allow implementing algorithms to answer both range and k-nearest neighbor queries, and also allow estimating the precision obtained for the approximate answer. Experimental evaluation shows that very good results (corresponding to what the user would expect) can be obtained in a fraction of the time required to obtain the exact answer.

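Because linked data structures lack spatial locality in storage, accesses to linked data during program execution incur severe cache misses. Through a performance analysis of thread-based prefetching for linked structures, this work studies how the ratio of computation workload to memory access workload in the hot loops of linked-data-structure programs affects thread prefetching performance. Taking the characteristics of multicore processor platforms into account, a helper-thread interval prefetching method suited to linked data structures is implemented. Experimental results further confirm the effect of the computation-to-memory-access ratio on interval prefetching performance and show that interval prefetching has a clear performance advantage over traditional thread prefetching.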

We investigate the push-relabel algorithm for solving the problem of finding a maximum cardinality matching in a bipartite graph in the context of the maximum transversal problem. We describe in detail an optimized yet easy-to-implement version of the algorithm and fine-tune its parameters. We also introduce new performance-enhancing techniques. On a wide range of real-world instances, we compare the push-relabel algorithm with state-of-the-art algorithms based on augmenting paths and pseudoflows. We conclude that a carefully tuned push-relabel algorithm is competitive with all known augmenting path-based algorithms, and superior to the pseudoflow-based ones.

Multiple memory models have been proposed to capture the effects of memory hierarchy, culminating in the I-O model of Aggarwal and Vitter (Commun. ACM 31(9):1116–1127, 1988). More than a decade of architectural advancements have led to new features that are not captured in the I-O model, most notably the prefetching capability. We propose a relatively simple Prefetch model that incorporates data prefetching in the traditional I-O models and show how to design optimal algorithms that can attain close to peak memory bandwidth. Unlike (the inverse of) memory latency, the memory bandwidth is much closer to the processing speed, so intelligent use of prefetching can considerably mitigate the I-O bottleneck. For some fundamental problems, our algorithms attain running times approaching that of the idealized random access machines under reasonable assumptions. Our work also explains more precisely the significantly superior performance of the I-O efficient algorithms in systems that support prefetching compared to ones that do not.
Sandeep Sen

We give an $O(\log^4 n)$-time $O(n^2)$-processor CRCW PRAM algorithm to find a hamiltonian cycle in a strong semicomplete bipartite digraph $B$, provided that a factor of $B$ (i.e., a collection of vertex disjoint cycles covering the vertex set of $B$) is computed in a preprocessing step. The factor is found (if it exists) using a bipartite matching algorithm, hence placing the whole algorithm in the class Random-NC. We show that any parallel algorithm which can check the existence of a hamiltonian cycle in a strong semicomplete bipartite digraph in time $O(r(n))$ using $p(n)$ processors can be used to check the existence of a perfect matching in a bipartite graph in time $O(r(n) + n^2/p(n))$ using $p(n)$ processors. Hence, our problem belongs to the class NC if and only if perfect matching in bipartite graphs belongs to NC. We also consider the problem of finding a hamiltonian path in a semicomplete bipartite digraph.

This paper presents a novel pipelined architecture for competitive learning (CL). The architecture is implemented on a field programmable gate array (FPGA). It is used as a hardware accelerator in a system on programmable chip (SOPC) for reducing the computation time. In the architecture, a novel codeword swapping scheme is adopted so that neuron competitions for different training vectors can be operated concurrently. The neuron updating process is based on a hardware divider with simple table lookup operations. The divider performs finite precision calculations for area cost reduction at the expense of slight degradation in training performance. The CPU time of the NIOS processor executing the CL training with the proposed architecture as an accelerator is measured. Experimental results show that the NIOS processor with the proposed architecture as an accelerator can achieve a speedup of up to 254 over its software counterpart running on a general purpose Pentium IV processor without hardware support.

