Follow the perturbed approximate leader for solving semi-bandit combinatorial optimization

Authors: Feidiao YANG, Wei CHEN, Jialin ZHANG, Xiaoming SUN

Affiliation: 1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100190, China; 3. Microsoft Research Asia, Beijing 100080, China

Abstract: Combinatorial optimization in the face of uncertainty is a challenge in both operational research and machine learning. In this paper, we consider a special and important class, adversarial online combinatorial optimization with semi-bandit feedback, in which a player repeatedly makes combinatorial decisions and receives the corresponding feedback. While existing algorithms focus on the regret guarantee or assume access to an efficient offline oracle, it remains a challenge to solve this problem efficiently when the offline counterpart is NP-hard. We propose a variant of the Follow-the-Perturbed-Leader (FPL) algorithm to solve this problem. Unlike the existing FPL approach, our method employs an approximation algorithm as the offline oracle and perturbs the collected data by adding nonnegative random variables. Our approach is simple and computationally efficient. Moreover, it guarantees a sublinear (1 + ε)-scaled regret of order O(·) for any small ε > 0 for an important class of combinatorial optimization problems that admit an FPTAS (fully polynomial time approximation scheme), where T is the number of rounds of the learning process. In addition to the theoretical analysis, we also conduct a series of experiments to demonstrate the performance of our algorithm.

Keywords: online learning; online combinatorial optimization; semi-bandit; follow-the-perturbed-leader
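
To make the procedure described in the abstract concrete, the following is a minimal Python sketch of a follow-the-perturbed-leader loop that adds nonnegative random perturbations to the collected losses and then calls an approximation algorithm as the offline oracle, under semi-bandit feedback. The exponential perturbation distribution, the plain loss accumulation, and the `approx_oracle` / `observe_losses` interfaces are illustrative assumptions, not the authors' exact construction or analysis.

```python
import numpy as np

def perturbed_approx_leader(T, d, approx_oracle, observe_losses, eta=1.0, rng=None):
    """Minimal sketch of a follow-the-perturbed-leader loop that uses an
    approximation algorithm as its offline oracle under semi-bandit feedback.

    Illustrative assumptions (not taken from the paper): exponentially
    distributed nonnegative perturbations, plain accumulation of observed
    losses, and the `approx_oracle` / `observe_losses` signatures below.

    approx_oracle(c)     -> 0/1 numpy array of length d that approximately
                            minimizes the linear cost <c, x> over the
                            combinatorial decision set.
    observe_losses(x, t) -> numpy array of length d holding the round-t losses
                            on the chosen coordinates and zeros elsewhere
                            (semi-bandit feedback).
    """
    rng = rng or np.random.default_rng()
    cum_loss = np.zeros(d)      # collected (estimated) per-coordinate losses
    total_loss = 0.0
    for t in range(T):
        # Perturb the collected data by adding nonnegative random variables,
        # then follow the leader returned by the approximation oracle.
        perturbation = rng.exponential(scale=eta, size=d)
        x = approx_oracle(cum_loss + perturbation)
        # Semi-bandit feedback: only losses of the chosen components are seen.
        ell = observe_losses(x, t)
        total_loss += float(ell @ x)
        cum_loss += ell          # a full implementation would use an
                                 # importance-weighted loss estimate here
    return total_loss
```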