A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot |
| |
Authors: | Ruben Martinez-Cantin, Nando de Freitas, Eric Brochu, José Castellanos, Arnaud Doucet |
| |
Affiliation: | 1. Institute for Systems and Robotics, Instituto Superior Técnico, Lisboa, Portugal; 2. Department of Computer Science, University of British Columbia, Vancouver, Canada; 3. Department of Computer Science and System Engineering, University of Zaragoza, Zaragoza, Spain |
| |
Abstract: | We address the problem of online path planning for optimal sensing with a mobile robot. The objective of the robot is to learn
the most about its pose and the environment given time constraints. We use a POMDP with a utility function that depends on
the belief state to model the finite horizon planning problem. We replan as the robot progresses throughout the environment.
The POMDP is high-dimensional, continuous, non-differentiable, nonlinear, non-Gaussian and must be solved in real-time. Most
existing techniques for stochastic planning and reinforcement learning are therefore inapplicable. To solve this extremely
complex problem, we propose a Bayesian optimization method that dynamically trades off exploration (minimizing uncertainty
in unknown parts of the policy space) and exploitation (capitalizing on the current best solution). We demonstrate our approach
with a visually guided mobile robot. The solution proposed here is also applicable to other closely related domains, including
active vision, sequential experimental design, dynamic sensing and calibration with mobile sensors. |
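
The exploration-exploitation trade-off described in the abstract can be illustrated with a minimal Bayesian optimization sketch. This is not the authors' implementation: the one-dimensional `toy_cost` function is a hypothetical stand-in for the expected cost of a policy parameter, and the squared-exponential kernel, the fixed length scale, the candidate grid and the expected-improvement acquisition rule are all assumptions chosen for illustration. A Gaussian process models the unknown cost, and each iteration evaluates the candidate whose expected improvement is largest, which favors points that are either predicted to be good (exploitation) or still highly uncertain (exploration).

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D point sets a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance of this kernel is 1
    return mean, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mean, std, best):
    """Expected improvement over `best` when minimizing."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def toy_cost(theta):
    """Hypothetical cost of a scalar policy parameter (not from the paper)."""
    return np.sin(3.0 * theta) + 0.5 * (theta - 0.5) ** 2

def bayes_opt(cost, n_init=3, n_iter=15, seed=0):
    """Minimize `cost` on [0, 2] using a GP surrogate and expected improvement."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 2.0, 201)          # candidate policy parameters
    x_obs = rng.uniform(0.0, 2.0, size=n_init)
    y_obs = cost(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        ei = expected_improvement(mean, std, y_obs.min())
        x_next = grid[np.argmax(ei)]           # most promising candidate
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, cost(x_next))
    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best]

if __name__ == "__main__":
    theta_best, cost_best = bayes_opt(toy_cost)
    print(f"best parameter: {theta_best:.3f}, cost: {cost_best:.3f}")
```

In the setting of the abstract, each cost evaluation would correspond to simulating a candidate path and scoring the resulting belief-state uncertainty, which is expensive; keeping the number of such evaluations small is the usual motivation for a surrogate-based search of this kind.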
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|