Similar articles
 Found 10 similar articles (search took 140 ms)
1.
The goals of perturbation analysis (PA), Markov decision processes (MDPs), and reinforcement learning (RL) are common: to make decisions to improve the system performance based on the information obtained by analyzing the current system behavior. In this paper, we study the relations among these closely related fields. We show that MDP solutions can be derived naturally from performance sensitivity analysis provided by PA. Performance potential plays an important role in both PA and MDPs; it also offers a clear intuitive interpretation for many results. Reinforcement learning, TD(λ), neuro-dynamic programming, etc., are efficient ways of estimating the performance potentials and related quantities based on sample paths. The sensitivity point of view of PA, MDP, and RL brings in some new insight to the area of learning and optimization. In particular, gradient-based optimization can be applied to parameterized systems with large state spaces, and gradient-based policy iteration can be applied to some nonstandard MDPs such as systems with correlated actions, etc. Potential-based on-line approaches and their advantages are also discussed.
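
As an illustration of the performance potential mentioned in this abstract, the sketch below computes the long-run average reward η and potentials g for a small ergodic Markov chain. It assumes the standard Poisson-equation characterization (I − P)g + ηe = f and solves it by fixed-point iteration; the function names, the two-state example, and the iteration counts are illustrative assumptions, not taken from the paper.

```python
def stationary_distribution(P, iters=500):
    """Approximate the stationary distribution of an ergodic, aperiodic
    chain with transition matrix P (list of rows) by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def performance_potentials(P, f, iters=500):
    """Return (eta, g), where eta is the average reward and g solves
    (I - P) g + eta * e = f up to an additive constant.

    Assumption: the chain is ergodic and aperiodic, so the iteration
    g <- f - eta + P g converges when started from g = 0.
    """
    n = len(P)
    pi = stationary_distribution(P, iters)
    eta = sum(pi[i] * f[i] for i in range(n))
    g = [0.0] * n
    for _ in range(iters):
        g = [f[i] - eta + sum(P[i][j] * g[j] for j in range(n))
             for i in range(n)]
    return eta, g


# Hypothetical two-state example: reward 1 in state 0, reward 0 in state 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]
f = [1.0, 0.0]
eta, g = performance_potentials(P, f)
```

Only differences such as g[0] − g[1] are meaningful here, since potentials are defined up to a constant; those differences are what a sensitivity or policy-improvement step would compare.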

2.
We present new algorithms for computing the H∞ optimal performance for a class of single-input/single-output (SISO) infinite-dimensional systems. The algorithms here only require use of one or two fast Fourier transforms (FFT) and Cholesky decompositions; hence the algorithms are particularly simple and easy to implement. Numerical examples show that the algorithms are stable and efficient and converge rapidly. The method has wide applications, including to the H∞ optimal control of distributed parameter systems. We illustrate the technique with applications to some delay problems and a partial differential equation (PDE) model. The algorithms we present are also an attractive approach to the solution of high-order finite-dimensional models for which use of state space methods would present computational difficulties.

3.
We present a compositional method for deciding whether a process satisfies an assertion. Assertions are formulas in a modal μ-calculus, and processes are drawn from a very general process algebra inspired by CCS and CSP. Well-known operators from CCS, CSP, and other process algebras appear as derived operators. The method is compositional in the structure of processes and works purely on the syntax of processes. It consists of applying a sequence of reductions, each of which only takes into account the top-level operator of the process. A reduction transforms a satisfaction problem for a composite process into equivalent satisfaction problems for the immediate subcomponents. Using process variables, systems with undefined subcomponents can be defined, and given an overall requirement to the system, necessary and sufficient conditions on these subcomponents can be found. Hence the process variables make it possible to specify and reason about what are often referred to as contexts, environments, and partial implementations. Since reductions are algorithms that work on syntax, they can be considered as forming a bridge between traditional noncompositional model checking and compositional proof systems.

4.
This article investigates various local operators in a discrete (1, λ)-setting applied to tracking problems, a specific class of non-stationary problems. In the first instance, the influence of operator properties on the tracking performance is examined. Both the enforcement of bigger steps and, especially, directed mutations are found to increase the tracking accuracy considerably. For the examination of highly time-restricted problems, a correlation between the population size and the severity of the problem dynamics is assumed. Relatively large population sizes are found to be advantageous if the number of evaluations has a big influence on the severity. All results are obtained using a fixpoint analysis of a worst-case model as well as simulations within a two-dimensional Markov model.
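
To make the (1, λ) selection scheme in this abstract concrete, here is a minimal sketch of one generation on a two-dimensional integer grid with a moving target. The grid, the Manhattan error measure, the unit-step mutation, and all names are assumptions chosen for illustration; the article's actual operators and model are not reproduced here.

```python
import random

# Unit moves on the integer grid (an assumed local operator).
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def manhattan(p, q):
    """Tracking error between a point p and the target q."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def one_comma_lambda_step(parent, target, lam, mutate):
    """One generation of a (1, lambda) strategy: create lam offspring by
    mutation and keep the best one. The parent itself is discarded,
    even if it is better than every offspring (comma selection)."""
    offspring = [mutate(parent) for _ in range(lam)]
    return min(offspring, key=lambda p: manhattan(p, target))


def undirected_mutation(rng):
    """Return a mutation operator that takes one random unit step."""
    def mutate(p):
        dx, dy = rng.choice(MOVES)
        return (p[0] + dx, p[1] + dy)
    return mutate


def track(start, targets, lam, mutate):
    """Follow a sequence of target positions, one generation per target
    position, and return the tracking error after each generation."""
    parent, errors = start, []
    for target in targets:
        parent = one_comma_lambda_step(parent, target, lam, mutate)
        errors.append(manhattan(parent, target))
    return errors


# Hypothetical run: the target drifts one cell to the right per generation.
errors = track((0, 0), [(t, 0) for t in range(1, 21)], lam=5,
               mutate=undirected_mutation(random.Random(0)))
```

A directed mutation could be sketched by biasing the choice of move toward the direction of the last successful step; that variant is left out to keep the example short.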

5.
This paper provides complete results on the stability behavior of a class of uncertain dynamical systems with jumping parameters and functional time-delays. The jumping parameters are modeled as a continuous-time, discrete-state Markov process. The parametric uncertainties are norm-bounded and appear in all system matrices, and the delay factor depends on the mode of operation. Notions of weak and strong stochastic stability for the jumping system are developed depending on the available information using a prescribed -performance. Memoryless and delayed-state feedback are considered to guarantee the closed-loop stability. All the results are cast into linear matrix inequality (LMI) format. A numerical example is given to illustrate the developed results.

6.
Semi-Markov decision problems and performance sensitivity analysis   Cited by: 1 (self-citations: 0, citations by others: 1)
Recent research indicates that Markov decision processes (MDPs) can be viewed from a sensitivity point of view, and that perturbation analysis (PA), MDPs, and reinforcement learning (RL) are three closely related areas in optimization of discrete-event dynamic systems that can be modeled as Markov processes. The goal of this paper is two-fold. First, we develop the PA theory for semi-Markov processes (SMPs); and then we extend the aforementioned results about the relation among PA, MDP, and RL to SMPs. In particular, we show that performance sensitivity formulas and policy iteration algorithms of semi-Markov decision processes can be derived based on the performance potential and realization matrix. Both the long-run average and discounted-cost problems are considered. This approach provides a unified framework for both problems, and the long-run average problem corresponds to the discount factor being zero. The results indicate that performance sensitivities and optimization depend only on first-order statistics. Single sample path-based implementations are discussed.

7.
In this paper, different algorithms are presented and evaluated for designing Virtual Private/Overlay Networks (VPNs/VONs) over any network that supports resource partitioning, e.g. ATM (Asynchronous Transfer Mode), MPLS (Multi Protocol Label Switching), or SDH/SONET (Synchronous Digital Hierarchy/Synchronous Optical Networking). All algorithms incorporate protection as well. The VPNs/VONs are formed by full mesh demand sets between VPN/VON endpoints. The service demands of VPNs/VONs are characterized by the bandwidth requirements of node-pairs (pipe-model). We investigated four design modes with three pro-active path-based shared protection path algorithms and four heuristics to calculate the pairs of paths. The design mode determines the means of traffic concentration. The protection path algorithms use Dijkstra's shortest path calculation with different edge weights. The demands are routed one by one; therefore the order in which they are processed matters. To eliminate this factor we used three heuristics (simulated allocation, simulated annealing, threshold accepting). We present numerical results obtained by simulation regarding the required total amount of capacity, the number of reserved edges, and the average length of paths.
Péter Hegyi received MSc (2004) degree from the Budapest University of Technology and Economics, Hungary, where he is currently a PhD student at the Department of Telecommunications and Media Informatics. His research interests focus on design of intra- and inter-domain multilayer grooming networks and routing with protection. He has been involved in a few related projects (IKTA, ETIK, NOBEL).
Markosz Maliosz is a researcher in the High Speed Networks Laboratory, Department of Telecommunication and Media Informatics at the Budapest University of Technology and Economics, where he received his MSc degree in Computer Science (1998). He has participated in projects concerning telecommunication services, network device control, Voice and Video over IP. His current research areas are Virtual Private Networking and traffic engineering in optical networks.
Ákos Ladányi is a student at the Department of Telecommunications and Media Informatics at the Budapest University of Technology and Economics. His research interests focus on routing, network resilience, and combinatorial optimization.
Tibor Cinkler has received MSc(94) and PhD(99) degrees from the Budapest University of Technology and Economics, Hungary, where he is currently Associate Professor at the Department of Telecommunications and Media Informatics. His research interests focus on routing, design, configuration, dimensioning and resilience of IP, MPLS, ATM, ngSDH and particularly of WR-DWDM-based multilayer networks. He is the author of over 60 refereed scientific publications and of 3 patents.
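
The idea of computing a working path and a protection path with Dijkstra's algorithm under different edge weights can be sketched as follows. This is a simple two-step heuristic written for illustration: the second run forbids the edges of the working path. It is not claimed to be any of the paper's algorithms, it ignores capacity sharing, and the graph encoding and function names are assumptions. A two-step search can also miss a disjoint pair that a joint method would find.

```python
import heapq


def dijkstra(graph, src, dst, weight):
    """Shortest path from src to dst in graph = {u: {v: cost}}.

    weight(u, v, cost) returns the effective edge weight, or None to
    forbid the edge. Returns the node list, or None if dst is unreachable.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, cost in graph[u].items():
            w = weight(u, v, cost)
            if w is None:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def working_and_protection(graph, src, dst):
    """Route the working path on plain costs, then route the protection
    path with the working path's edges forbidden (edge-disjoint pair)."""
    working = dijkstra(graph, src, dst, lambda u, v, c: c)
    if working is None:
        return None, None
    used = {frozenset(edge) for edge in zip(working, working[1:])}
    protection = dijkstra(
        graph, src, dst,
        lambda u, v, c: None if frozenset((u, v)) in used else c)
    return working, protection


# Hypothetical four-node network; each undirected edge is listed both ways.
net = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2},
    "D": {"B": 1, "C": 2},
}
pair = working_and_protection(net, "A", "D")
```

Because demands are routed one at a time, the result for a whole demand set depends on the processing order, which is the factor the abstract's heuristics are meant to reduce.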

8.
9.
We answer questions about the distribution of the maximum size of queues and data structures as a function of time. The concept of maximum occurs in many issues of resource allocation. We consider several models of growth, including general birth-and-death processes, the M/G/∞ model, and a non-Markovian process (data structure) for processing plane-sweep information in computational geometry, called hashing with lazy deletion (HwLD). It has been shown that HwLD is optimal in terms of expected time and dynamic space; our results show that it is also optimal in terms of expected preallocated space, up to a constant factor. We take two independent and complementary approaches: first, in Section 2, we use a variety of algebraic and analytical techniques to derive exact formulas for the distribution of the maximum queue size in stationary birth-and-death processes and in a nonstationary model related to file histories. The formulas allow numerical evaluation and some asymptotics. In our second approach, in Section 3, we consider the M/G/∞ model (which includes M/M/∞ as a special case) and use techniques from the analysis of algorithms to get optimal big-oh bounds on the expected maximum queue size and on the expected maximum amount of storage used by HwLD in excess of the optimal amount. The techniques appear extendible to other models, such as M/M/1.
Research was also done while the author was at Princeton University, supported in part by a Procter Fellowship.
Research was also done while the author was on sabbatical at INRIA in Rocquencourt, France, and at Ecole Normale Supérieure in Paris, France. Support was provided in part by National Science Foundation Research Grant DCR-84-03613, by an NSF Presidential Young Investigator Award with matching funds from an IBM Faculty Development Award and an AT&T research grant, by a Guggenheim Fellowship, and by the Office of Naval Research and the Defense Advanced Research Projects Agency under Contract N00014-83-K-0146 and ARPA Order 6320, Amendment 1.
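
The quantity studied in this abstract, the maximum number of items simultaneously present, can be measured on a single sample path with a sweep over arrival and departure events. The sketch below does this for an infinite-server queue, where every arrival starts service immediately; the function names, the exponential service times, and the parameter values are illustrative assumptions, and the code only simulates the quantity rather than reproducing the paper's formulas or bounds.

```python
import random


def max_in_system(arrivals, service_times):
    """Maximum number of customers simultaneously present when each
    customer i occupies the interval [arrivals[i], arrivals[i] + service)."""
    events = []
    for a, s in zip(arrivals, service_times):
        events.append((a, 1))        # arrival: one more in the system
        events.append((a + s, -1))   # departure: one fewer
    # Tuples sort by time, then by delta, so a departure (-1) is
    # processed before an arrival (+1) that happens at the same instant.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def sample_path(rate, mean_service, n, rng):
    """Generate n Poisson arrivals (rate per unit time) with exponential
    service times; an M/M/infinity special case, chosen for illustration."""
    t, arrivals, services = 0.0, [], []
    for _ in range(n):
        t += rng.expovariate(rate)
        arrivals.append(t)
        services.append(rng.expovariate(1.0 / mean_service))
    return arrivals, services


# Hypothetical run: 1000 arrivals at rate 2 with mean service time 3.
arr, srv = sample_path(rate=2.0, mean_service=3.0, n=1000,
                       rng=random.Random(1))
peak = max_in_system(arr, srv)
```

In a storage setting, this peak is the amount of space that would have to be preallocated for that particular run.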

10.
The processes of constructing meaning in digital database environments entail a paradigm shift from previous models of audio-visual communication. Media emerging from the Electro-mechanical era (film/TV/video) present fixed spatio-temporal linearity and material conditions which objectify and render passive viewer and process. The problematic aspects of cinematic communication were addressed by Latin American filmmakers of the Third Cinema movement. Their concerns and approach presaged and assisted an understanding of the radical redefinition of audio-visual communication possible with digital databases. The conceptual and aesthetic aspirations of Third Cinema artists such as Julio Garcia Espinosa and Fernando Solanas were ultimately contradictory to linear media and find their fitting medium in digital modular construction. The materiality of database expression lacks an intrinsic temporal or spatial state and permits a more dynamic and multidirectional set of power relationships between author/s, piece, viewer/s. Other important referents for contextualising database art are modern art practitioners that rejected linear representational space and fractured the centrality of authorship. The author's own work, ...two, three, many Guevaras, an exploratory database environment, embraces the redefinition of process as artistic expression, the empowerment of interacting generative forces, and serves to illustrate the revolutionary potential of the new media.
