Similar literature
 20 similar documents found (search time: 31 ms)
1.
In this paper, the problem of caching continuous media data in a (main) memory and disk caching system is addressed. Caching schemes can significantly reduce the load on the network as well as on the servers, and retrieving documents from the cache requires only a short response time. In interval-level caching algorithms, an interval of data between two adjacent streams is the basic caching entity. In this paper, we design a novel algorithm, referred to as the variable bit rate caching (VBRC) algorithm, which belongs to the class of interval-level caching algorithms. The proposed VBRC algorithm can be used in the system for memory caching or disk caching. VBRC can handle variable retrieval bandwidth as well as constant retrieval bandwidth. In designing the VBRC algorithm, we propose strategies for reducing the number of switching operations, which will probably cause discontinuity in retrieving data. Also, we propose a just-in-time scheme for resource allocation in our VBRC algorithm and show that the caching performance is significantly improved in comparison with the reservation scheme adopted in the resource-based caching (RBC) algorithm. Our simulation study compares the recent and most popular generalized interval caching, RBC, and VBRC algorithms on several influencing factors, such as cache space size, cache I/O bandwidth, request arrival rate, and percentage of requests for large documents, with respect to the byte hit ratio and the number of switching operations. The simulation results confirm our analysis.
Bharadwaj Veeravalli. URL: http://cnds.ece.nus.edu.sg
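The byte hit ratio used as a metric in the abstract above is commonly defined as the bytes served from the cache divided by the total bytes requested. The following is a minimal sketch of that definition only; the function name `byte_hit_ratio` and the `(size, cached_bytes)` request format are assumptions made for illustration and are not taken from the paper.

```python
def byte_hit_ratio(requests):
    """Fraction of requested bytes that were served from the cache.

    `requests` is an iterable of (size_in_bytes, bytes_served_from_cache)
    pairs. This input format is a hypothetical choice for illustration.
    """
    total_bytes = 0
    cached_bytes = 0
    for size, cached in requests:
        total_bytes += size
        cached_bytes += min(cached, size)  # cannot serve more than was requested
    return cached_bytes / total_bytes if total_bytes else 0.0


if __name__ == "__main__":
    # Three requests: fully cached, not cached, half cached.
    print(byte_hit_ratio([(100, 100), (300, 0), (100, 50)]))
```

A byte-based ratio weights large documents more heavily than a plain request hit ratio, which matters when the workload mixes small and large media objects.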

2.
Andrews, Bender, Zhang. Algorithmica, 2008, 32(2): 277-301
Abstract. Processor speed and memory capacity are increasing several times faster than disk speed. This disparity suggests that disk I/O performance could become an important bottleneck. Methods are needed for using disks more efficiently. Past analysis of disk scheduling algorithms has largely been experimental, and little attempt has been made to develop algorithms with provable performance guarantees. We consider the following disk scheduling problem. Given a set of requests on a computer disk and a convex reachability function that determines how fast the disk head travels between tracks, our goal is to schedule the disk head so that it services all the requests in the shortest time possible. We present a 3/2-approximation algorithm (with a constant additive term). For the special case in which the reachability function is linear, we present an optimal polynomial-time solution. The disk scheduling problem is related to the special case of the Asymmetric Traveling Salesman Problem with the triangle inequality (ATSP-Δ) in which all distances are either 0 or some constant α. We show how to find the optimal tour in polynomial time and describe how this gives another approximation algorithm for the disk scheduling problem. Finally, we consider the on-line version of the problem in which uniformly distributed requests arrive over time. We present an algorithm related to the above ATSP-Δ.

3.
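Objective: Visual retrieval must accurately and efficiently retrieve the most relevant visual content from large image or video datasets. However, because such datasets are large and their features are high-dimensional, existing methods struggle to deliver both fast retrieval and good retrieval quality. Method: For high-dimensional visual retrieval over image and video data, a weighted semantic locality-sensitive hashing (WSLSH) algorithm is proposed. The algorithm uses a two-layer visual dictionary to partition the reference feature space twice, and within each subspace uses weighted semantic locality-sensitive hashing to index features precisely. Next, dynamic variable-length hash codes are designed to reduce the number of hash tables while preserving retrieval performance. In addition, to address the random instability of locality-sensitive hashing (LSH), statistical data reflecting the semantics of the reference feature space are added to the LSH functions, and a simple projection semantic hash function is designed to keep retrieval performance stable. Results: Experiments on the Holidays, Oxford5k, and DataSetB datasets show that WSLSH achieves the shortest average retrieval time on DataSetB, 0.03425 s. With a code length of 64 bits, WSLSH improves the mean average precision (mAP) on the three datasets by 1.2% to 32.6%, 1.7% to 19.1%, and 2.6% to 28.6%, respectively, which gives it some advantage over several recent unsupervised hashing methods. Conclusion: The LSH algorithm is improved through two-stage space partitioning, weighting the number of hash indexing operations on reference features, dynamic use of variable-length hash codes, and a simple projection semantic hash function. The resulting WSLSH algorithm retrieves faster than existing work and, with long codes, achieves better performance.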
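As background for the abstract above, the sketch below shows plain sign-random-projection LSH, the unweighted baseline family that the abstract describes as randomly unstable. It is not the WSLSH algorithm: the two-layer dictionary, the weighting, and the variable-length codes are not reproduced. All function names and the seed parameter are assumptions made for illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw `num_bits` random Gaussian hyperplane normals of dimension `dim`."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        int(sum(w * x for w, x in zip(plane, vector)) >= 0.0)
        for plane in hyperplanes
    )


def build_table(vectors, hyperplanes):
    """Group vector indices into buckets keyed by their hash code."""
    table = {}
    for index, vector in enumerate(vectors):
        table.setdefault(lsh_code(vector, hyperplanes), []).append(index)
    return table


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=8, dim=3, seed=42)
    data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [-3.0, 0.5, -1.0]]
    table = build_table(data, planes)
    # Candidates for a query are the vectors that share its bucket.
    print(table.get(lsh_code([1.0, 2.0, 3.0], planes), []))
```

Because the code depends only on the signs of the projections, vectors pointing in the same direction always share a bucket, while the random choice of hyperplanes is what makes results vary between runs with different seeds.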

4.
We present a high-level enterprise system architecture that closely models the domain ontology of resource and information flows in enterprises. It is:
Process-oriented: formal, user-definable specifications for the expected exchange of resources (money, goods, and services), notably contracts, are represented explicitly in the system state to reflect expectations on future events;
Event-driven: events denote relevant information about real-world transactions, specifically the transfer of resources and information between economic agents, to which the system reacts by matching against its portfolio of running processes/contracts in real time;
Declarative: user-defined reporting functions can be formulated as declarative functions on the system state, including the representations of residual contractual obligations.
We introduce the architecture and demonstrate how analyses of the standard reporting requirements for companies (the income statement and the balance sheet) can be used to drive the design of events that need registering for such reporting purposes. We then illustrate how the multi-party obligations in trade contracts (sale, purchase), including pricing and VAT payments, can be represented as formal contract expressions that can be subjected to analysis. To the best of our knowledge, this is the first architecture for enterprise resource accounting that demonstrably maps high-level process and information requirements directly to executable specifications.

5.
Edmonds, Pruhs. Algorithmica, 2008, 36(3): 315-330
Abstract. We investigate server scheduling policies to minimize average user-perceived latency in pull-based client-server systems (systems where multiple clients request data from a server) where the server answers requests on a multicast/broadcast channel. We first show that there is no O(1)-competitive algorithm for this problem. We then give a method to convert any nonclairvoyant unicast scheduling algorithm A to a nonclairvoyant multicast scheduling algorithm B. We show that if A works well when jobs can have parallel and sequential phases, then B works well if it is given twice the resources. More formally, if A is an s-speed c-approximation unicast algorithm, then its counterpart algorithm B is a 2s-speed c-approximation multicast algorithm. It is already known [5] that Equi-partition, which devotes an equal amount of processing power to each job, is a (2 + ε)-speed O(1 + 1/ε)-approximation algorithm for unicast scheduling of such jobs. Hence, it follows that the algorithm BEQUI, which broadcasts all requested files at a rate proportional to the number of outstanding requests for that file, is a (4 + ε)-speed O(1 + 1/ε)-approximation algorithm. We give another algorithm, BEQUI-EDF, and show that BEQUI-EDF is also a (4 + ε)-speed O(1 + 1/ε)-approximation algorithm. However, BEQUI-EDF has the advantage that the maximum number of preemptions is linear in the number of requests, and the advantage that no preemptions occur if the data items have unit size.
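The rate rule that the abstract attributes to BEQUI (each file is broadcast at a rate proportional to its number of outstanding requests) can be sketched as follows. This shows the allocation step at a single instant only, not the full scheduler or its analysis; the function name `bequi_rates` and the `total_bandwidth` parameter are assumptions made for illustration.

```python
from collections import Counter


def bequi_rates(outstanding_requests, total_bandwidth=1.0):
    """Split bandwidth across files in proportion to outstanding requests.

    `outstanding_requests` lists one file identifier per pending request,
    so a file requested by three clients appears three times.
    """
    counts = Counter(outstanding_requests)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        file_id: total_bandwidth * count / total
        for file_id, count in counts.items()
    }


if __name__ == "__main__":
    # File "a" has two waiting clients, so it receives twice the rate of "b".
    print(bequi_rates(["a", "a", "b", "c"], total_bandwidth=4.0))
```

This is equivalent to giving every outstanding request an equal share, as in Equi-partition, and then summing the shares of requests for the same file, since one broadcast serves all of them.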

6.
In this paper, we present efficient, scalable, and portable parallel algorithms for the off-line clustering, the on-line retrieval, and the update phases of the Text Retrieval (TR) problem based on the vector space model and using clustering to organize and handle a dynamic document collection. The algorithms run on the Coarse-Grained Multicomputer (CGM) and/or the Bulk Synchronous Parallel (BSP) model, two models that capture the characteristics of the parallel machine within a few parameters. To the best of our knowledge, our parallel retrieval algorithms are the first ones analyzed under these specific parallel models. For all the phases of the proposed algorithms, we analytically determine the relevant communication and computation cost, thereby formally proving the efficiency of the proposed solutions. In addition, we prove that our technique for the on-line retrieval phase performs very well in comparison to other possible alternatives in the typical case of a multiuser information retrieval (IR) system where a number of user queries are concurrently submitted to an IR system. Finally, we discuss external memory issues and show how our techniques can be adapted to the case when processors have limited main memory but sufficient disk capacity for holding their local data.
Damianos Gavalas

7.
Declustering is a common technique used to reduce query response times. Data is declustered over multiple disks and query retrieval can be parallelized. Most of the research on declustering is targeted at spatial range queries and investigates schemes with low additive error. Recently, declustering using replication has been proposed to reduce the additive overhead. Replication significantly reduces the retrieval cost of arbitrary queries. In this paper, we propose a disk allocation and retrieval mechanism for arbitrary queries based on design theory. Using the proposed c-copy replicated declustering scheme, buckets can be retrieved using at most k disk accesses. The retrieval algorithm is very efficient and is asymptotically optimal with complexity for a query Q. In addition to the deterministic worst-case bound and efficient retrieval, the proposed algorithm handles nonuniform data and high dimensions, supports incremental declustering, and has good fault-tolerance properties. Experimental results show the feasibility of the algorithm. Recommended by: Sunil Prabhakar

8.
Gross primary production (GPP) is an important variable in studies of the carbon cycle and climate change. The Moderate Resolution Imaging Spectroradiometer (MODIS)-GPP product (MOD17) provides global GPP data for terrestrial ecosystems; however, it is not well validated in China. In this study, an eddy covariance (EC) system observed GPP at 10 sites in northern China and was used to validate MOD17. The results indicated that MOD17 presents a strong bias in the study region due to the meteorological data, MODIS FPAR (fraction of absorbed photosynthetically active radiation) (MOD15), and the model parameters in the MODIS-GPP algorithm, the Biome Parameters Look Up Table (BPLUT). Maximum light-use efficiency (ε0) had the strongest impact on the predicted GPP of the MODIS-GPP algorithm. After using the inputs observed in situ and improving parameters in the MODIS-GPP algorithm, the model could explain 85% of the EC-observed GPP of the sites, whereas the MODIS-GPP algorithm without in situ inputs and parameters only explained 26% of EC-observed GPP.

9.
We study the problem of minimizing the expected cost of binary searching for data where the access cost is not fixed and depends on the last accessed element, such as data stored on magnetic or optical disk. We present an optimal algorithm for this problem that finds the optimal search strategy in O(n^3) time, which is the same time complexity as that of the simpler classical problem of fixed costs. Next, we present two practical linear expected time algorithms, under the assumption that the access cost of an element is independent of its physical position. Both practical algorithms are online, that is, they find the next element to access as the search proceeds. The first one is an approximate algorithm which minimizes the access cost disregarding the goodness of the problem partitioning. The second one is a heuristic algorithm, whose quality depends on its ability to estimate the final search cost, and therefore it can be tuned by recording statistics of previous runs. We present an application for our algorithms related to text retrieval. When a text collection is large, it demands specialized indexing techniques for efficient access. One important type of index is the suffix array, where data access is provided through an indirect binary search on the text stored on magnetic or optical disk. Under this cost model we prove that the optimal algorithm cannot perform better than Ω(1/log n) times the standard binary search. We also prove that the approximate strategy cannot, on average, perform worse than 39% over the optimal one. We confirm the analytical results with simulations, showing improvements between 34% (optimal) and 60% (online) over standard binary search for both magnetic and optical disks. Received February 13, 1997; revised May 27, 1998.

10.
I/O scheduling for digital continuous media   Cited by: 4 (self-citations: 0, citations by others: 4)
A growing set of applications require access to digital video and audio. In order to provide playback of such continuous media (CM), scheduling strategies for CM data servers (CMS) are necessary. In some domains, particularly defense and industrial process control, the timing requirements of these applications are strict and essential to their correct operation. In this paper we develop a scheduling strategy for multiple access to a CMS such that the timing guarantees are maintained at all times. First, we develop a scheduling strategy for the steady state, i.e., when there are no changes in playback rate or operation. We derive an optimal Batched SCAN (BSCAN) algorithm that requires minimum buffer space to schedule concurrent accesses. The scheduling strategy incorporates two key constraints: (1) data fetches from the storage system are assumed to be in integral multiples of the block size, and (2) playback guarantees are ensured for frame-oriented streams when each frame can span multiple blocks. We discuss modifications to the scheduling strategy to handle compressed data like motion-JPEG and MPEG. Second, we develop techniques to handle dynamic changes brought about by VCR-like operations executed by applications. We define a suite of primitive VCR-like operations that can be executed. We show that an unregulated change in the BSCAN schedule, in response to VCR-like operations, will affect playback guarantees. We develop two general techniques to ensure playback guarantees while responding to VCR-like operations: passive and active accumulation. Using user response time as a metric, we show that active accumulation algorithms outperform passive accumulation algorithms. An optimal response-time algorithm in a class of active accumulation strategies is derived. The results presented here are validated by extensive simulation studies.
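For readers unfamiliar with the SCAN family that BSCAN builds on, the sketch below orders one batch of track requests in elevator fashion: the head sweeps in one direction serving requests, then reverses. It is a generic SCAN-style ordering (strictly the LOOK variant, since the head turns at the last request rather than at the disk edge) and not the paper's BSCAN algorithm, whose batching and buffer analysis are not given in the abstract. Function names and the track numbers are assumptions made for illustration.

```python
def scan_order(head, requests, direction="up"):
    """Return the service order for one batch under elevator-style scanning."""
    lower = sorted(r for r in requests if r < head)
    upper = sorted(r for r in requests if r >= head)
    if direction == "up":
        # Serve tracks above the head in ascending order, then sweep back down.
        return upper + lower[::-1]
    # Serve tracks below the head in descending order, then sweep back up.
    return lower[::-1] + upper


def seek_distance(head, order):
    """Total number of tracks the head travels when serving `order`."""
    total = 0
    position = head
    for track in order:
        total += abs(track - position)
        position = track
    return total


if __name__ == "__main__":
    batch = [10, 95, 60, 30, 70]
    order = scan_order(50, batch)
    print(order, seek_distance(50, order))
    # Compare with first-come-first-served order for the same batch.
    print(seek_distance(50, batch))
```

Sorting a batch this way bounds head travel to at most two sweeps across the requested tracks, which is the property that makes per-round service time predictable for continuous media.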

11.
Summary. In modern imperative languages there are two commonly occurring ways to activate concurrently running tasks, splitting (cobegin...coend) and spawning. The programming language Ada makes use of both forms of task activation. We present a formal system for verifying partial correctness specifications of Ada tasks activated by spawning. The system is based upon a view of tasks as histories of events. We show how the mindset of splitting may be applicable when developing a formal system for reasoning about spawning. The resultant proof system is compositional, and a robust extension of partial correctness proof systems for sequential constructs. A transition model is given for spawning, and the proof system is proven complete in the sense of Cook [10] relative to this model, under certain reasonable assumptions. The specific proof rules given apply to a subset of Ada without real-time and distributed termination. Our approach to task verification applies to other imperative languages besides Ada, and the essential parts of our methodology are applicable to other formal systems besides those based on partial correctness reasoning. Sigurd Meldal is professor of informatics at the University of Bergen. He is interested in techniques and tools based on formal methods for development of concurrent software. His current foci are the investigation of algebraic approaches to nondeterminism, and the participation in the design of a concurrent specification, prototyping and implementation language. The latter supplements formal proof with support for run time control of consistency between concurrent systems as specified and as implemented. Meldal received his cand. real. (1982) and dr. scient. (1986) degrees in informatics from the University of Oslo. This research was supported by a grant from the Norwegian Research Council for Science and the Humanities, by the Defense Advanced Research Projects Agency/Information Systems Technology Office under the Office of Naval Research contract N00014-90-J1232, by the Air Force Office of Scientific Research under Grant AFOSR83-0255, and by a Fulbright Scholarship from the US Educational Foundation in Norway.

12.
In this paper, to model check real-time value-passing systems, a formal language, Timed Symbolic Transition Graph, and a logic system, Timed Predicate μ-Calculus, are proposed. An algorithm is presented which is local in that it generates and investigates the reachable state space in a top-down fashion and keeps the partition for time evaluations as coarse as possible while instantiating data variables on the fly. It can deal not only with data variables with a finite value domain, but also with the so-called data-independent variables with an infinite value domain. To the authors' knowledge, this is the first algorithm for model checking timed systems containing value-passing features.

13.
Video-on-demand (VOD) service requires balanced use of system resources, such as disk bandwidth and buffer, to accommodate more clients. The data retrieval size and data rates of video streams directly affect the utilization of these resources. Given the data rates which vary widely in multi-resolution video servers, we need to determine the appropriate data retrieval size to balance the buffer with the disk bandwidth. Otherwise, the server may be unable to admit new clients even though one of the resources is available for use. To address this problem, we propose the following new schemes that work together: (1) A replication scheme called Splitting Striping units by Replication (SSR). To increase the number of admitted clients, SSR defines two sizes of striping unit, which allow data to be stored on the primary and backup copies in different ways. (2) A retrieval scheduling method which combines the merits of existing SCAN and grouped sweeping scheme (GSS) algorithms to balance the buffer and disk bandwidth usage. (3) Admission control algorithms which decide whether to read data from the primary or the backup copy. The effectiveness of the proposed schemes is demonstrated through simulations. Results show that our schemes are able to cope with various workloads efficiently and thus enable the server to admit a much larger number of clients.

14.
We present an efficient randomized algorithm for leader election in large-scale distributed systems. The proposed algorithm is optimal in message complexity (O(n) for a set of n total processes), has round complexity logarithmic in the number of processes in the system, and provides high probabilistic guarantees on the election of a unique leader. The algorithm relies on a balls and bins abstraction and works in two phases. The main novelty of the work is in the first phase where the number of contending processes is reduced in a controlled manner. Probabilistic quorums are used to determine a winner in the second phase. We discuss, in detail, the synchronous version of the algorithm, provide extensions to an asynchronous version and examine the impact of failures.

15.
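Objective: Signal reconstruction in compressed sensing amounts to finding a sparse solution of an underdetermined linear system. The three usual approaches to finding such a sparse solution are not robust enough: minimizing the l0-norm is an NP-hard problem, minimizing the l1-norm may have no solution, and minimizing the lp-norm is non-convex. To address this, a method based on smooth regularized convex optimization is proposed. Method: To obtain the global optimum and ensure robustness, first, a convex fitting function of the l0-norm for signals over the whole space is designed as the optimization objective; second, the optimization problem for a function of n variables is converted into n single-variable optimization problems; finally, a fast shrinkage algorithm is used to solve them, so that convergence reaches second order. Results: On both simulated and real datasets, the algorithm outperforms the other three types of algorithms. In the simulation experiments, when the signal dimension exceeds 150, the reconstruction time of the method is about 50% of that of the other algorithms, showing its speed; in the real-data experiments, the F-norm of the difference between the reconstructed signal and the original signal is 70% of that of the other algorithms, showing good robustness. Conclusion: The proposed algorithm is a convex optimization algorithm with second-order convergence, which ensures fast convergence to the global optimum. It is suited to large-scale data and has considerable potential in information retrieval, dictionary learning, image compression, and related fields.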
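The three formulations that the abstract contrasts can be written out in their standard textbook form. The symbols below are assumptions, since the abstract does not fix notation: A is an m x n measurement matrix with m < n, b is the measurement vector, and x is the sparse signal to recover. The paper's own convex fitting function for the l0-norm is not given in the abstract and is not reproduced here.

```latex
% Standard sparse-recovery problems for an underdetermined system Ax = b,
% with A \in \mathbb{R}^{m \times n} and m < n (notation assumed).
\begin{align*}
(P_0)\quad & \min_{x \in \mathbb{R}^n} \|x\|_0
  \quad \text{s.t. } Ax = b,
  \qquad \|x\|_0 = \#\{\, i : x_i \neq 0 \,\}, \\
(P_1)\quad & \min_{x \in \mathbb{R}^n} \|x\|_1
  \quad \text{s.t. } Ax = b,
  \qquad \|x\|_1 = \sum_{i=1}^{n} |x_i|, \\
(P_p)\quad & \min_{x \in \mathbb{R}^n} \|x\|_p^p
  \quad \text{s.t. } Ax = b,
  \qquad \|x\|_p^p = \sum_{i=1}^{n} |x_i|^p,\ 0 < p < 1 .
\end{align*}
```

Problem (P_0) counts nonzero entries directly and is combinatorial, (P_1) is its usual convex relaxation, and (P_p) sits between the two but loses convexity for 0 < p < 1.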

16.
The prediction of future events has great importance in many applications. The prediction is based on episode rules, which are composed of events and two time constraints which require all the events in the episode rule and in the predicate of the rule to occur in a time interval, respectively. In an event stream, a sequence of events which matches the predicate of the rule and satisfies the specified time constraint is called an occurrence of the predicate. After finding the occurrence, the consequent event which will occur in a time interval can be predicted. However, the time intervals computed from some occurrences for predicting the event can be contained in the time intervals computed from other occurrences and become redundant. As a result, designing an efficient and effective event predictor in a stream environment is challenging. In this paper, an effective scheme is proposed to avoid matching the predicate events corresponding to redundant time intervals for prediction. Based on the scheme, we consider two methodologies, forward retrieval and backward retrieval, for the efficient matching of predicate events over event streams. The approach based on forward retrieval constructs a queue structure to incrementally maintain parts of the matched results as events arrive, and thus it avoids backward scans of the event stream. On the other hand, the approach based on backward retrieval maintains the recently arrived events in a tree structure. The matching of predicate events is triggered by identifiable events and achieved by an efficient retrieval on the tree structure, which avoids exhaustive scans of the arrived events. By running a series of experiments, we show that each of the proposed approaches has its advantages on particular data distributions and parameter settings.

17.
In this paper, we address the problem of retrieving a movie from a set of multimedia (MM) servers by the clients on a network. We consider a strategy in which multiple MM servers are deployed by the service provider (SP) to retrieve a requested MM movie for the clients, so as to minimize the access time (the waiting time of the client before initiating the playback) and maximize the system reliability. We design a movie retrieval strategy that explicitly considers issues such as reliability and/or availability factors of the multimedia servers and the communication channels in the problem formulation. We develop a mathematical model for this retrieval strategy and derive an optimal size of each movie portion that is expected to be rendered by each server. We then derive a closed-form expression for the access time of the MM document and the system reliability, which gives a trade-off relationship between access time and reliability (availability) of the service under our strategy. We extend our study to investigate the effect of sequencing of the servers, that is, the order in which movie portions are to be retrieved, to minimize the access time and to maximize the system reliability. With system reliability factors, we identify an optimal sequence, which maximizes system reliability out of all possible retrieval sequences. We then propose two methods to retrieve any missing movie portions upon a server failure during the retrieval process. In order to measure the quality of service provided by the service provider to its customers, we introduce a QoS parameter that can tune the playback rate to avoid any data underflow or overflow situations. Then, from a probabilistic perspective, we obtain an estimate of the failure time of a single server and the resulting missing movie portion caused by this server failure. We conduct rigorous simulation experiments to verify all the theoretical findings reported. Illustrative examples are provided for ease of understanding.

18.
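A new parallel algorithm for constructing concept lattices   Cited by: 1 (self-citations: 0, citations by others: 1)
As the core data structure of formal concept analysis, the concept lattice has been widely applied in data mining and knowledge discovery, artificial intelligence, information retrieval, rough sets [1], and other fields. Constructing the concept lattice is a major problem in its application. A parallel concept lattice construction algorithm based on partitioning the closure system, the Para_Prun algorithm, is proposed. It treats the set of concepts as an initial closure system, introduces a validity check for sub-closure systems, iteratively generates multiple mutually independent sub-closure systems, and then generates concepts independently within each sub-closure system, which effectively speeds up concept computation. Finally, experiments demonstrate the correctness and effectiveness of the algorithm.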
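To make the notion of a formal concept concrete, the sketch below computes all concepts of a tiny context by brute force using the two derivation operators of formal concept analysis. It is a sequential illustration of what a lattice construction algorithm has to produce, not the Para_Prun algorithm, whose closure-system partitioning is not detailed in the abstract. The function names and the toy context are assumptions made for illustration.

```python
from itertools import combinations


def common_attributes(objects, context):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def concepts(context):
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Exponential in the number of objects, so only suitable for toy contexts.
    """
    found = set()
    objects = sorted(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            found.add((frozenset(extent), frozenset(intent)))
    return found


if __name__ == "__main__":
    toy_context = {
        "duck": {"flies", "swims"},
        "eagle": {"flies"},
        "fish": {"swims"},
    }
    for extent, intent in sorted(concepts(toy_context), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each concept is a fixed point of the closure operator formed by composing the two derivation operators, which is why the set of concepts can be viewed as a closure system and split into independent parts for parallel processing.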

19.
In this paper, we present a programmable method of revising a finite clause set. We first present a procedure whose formal parameters are a consistent clause set Γ and a clause A and whose output is a set of minimal subsets of Γ which are inconsistent with A. The maximal consistent subsets can be generated from all minimal inconsistent subsets. We develop a prototype system based on the above procedure, and discuss the implementation of knowledge base maintenance. Finally, we compare the approach presented in this paper with other related approaches. The main characteristic of the approach is that it can be implemented by a computer program.

20.
Abstract

An abstract study of self-reproduction from the viewpoint of systems theory is made, investigating how simple the combinatorial laws of formal systems can be chosen while still ensuring nontrivial self-reproduction.

We take as a base the heuristic of the theory of cellular automata in the sense of von Neumann. We operate in a formal, microscopic, submolecular world, as our patterns of cells shall represent some kind of artificial molecules. Computation- and construction-universal, self-reproducing systems are regarded as artificial living beings according to the common heuristic. A simple combinatorial system M of only four very simple dynamic laws is introduced, and it can be shown that even in a world governed by this system M nontrivial self-reproduction can be established, thus illuminating what simple combinatorial structures allow for the handling of such logically somewhat difficult phenomena as self-organization, self-reproduction, etc.

To obtain a model slightly more adapted to nature than the concepts of cellular automata, our system M obeys the law of microscopic reversibility, allows concurrent activities, and needs no regulation by a synchronizing device.
