Monte Carlo TD(λ)-methods for the optimal control of discrete-time Markovian jump linear systems |
| |
Authors: | Oswaldo L. V. Costa, Julio C. C. Aya |
| |
Affiliation: | Departamento de Engenharia de Telecomunicações e Controle, Escola Politécnica da Universidade de São Paulo, CEP: 05508 900 São Paulo SP Brazil |
| |
Abstract: | In this paper, we present an iterative technique based on Monte Carlo simulations for deriving the optimal control of the infinite-horizon linear regulator problem of discrete-time Markovian jump linear systems for the case in which the transition probability matrix of the Markov chain is not known. We draw a parallel with the theory of TD(λ) algorithms for Markovian decision processes to develop a TD(λ)-like algorithm for the optimal control associated with the maximal solution of a set of coupled algebraic Riccati equations (CARE). It is assumed that either a sample of past observations of the Markov chain is available for the iterative algorithm, or such a sample can be generated through a computer program. Our proofs rely on the spectral radius of the closed-loop operators associated with the mean square stability of the system being less than 1. |
| |
Keywords: | TD(λ) methods; Jump systems; Markov parameters; Optimal control; Monte Carlo simulations |
This article is indexed in ScienceDirect and other databases. |