Lower Bounds and Selectivity of Weak-Consistent Policies in Stochastic Multi-Armed Bandit Problem.
El Alaoui, Issam; Audibert, Jean-Yves; Salomon, Antoine (2013), Lower Bounds and Selectivity of Weak-Consistent Policies in Stochastic Multi-Armed Bandit Problem, Journal of Machine Learning Research, 14, 1, p. 187-207
Type: Article accepted for publication or published
Journal name: Journal of Machine Learning Research
Abstract (EN): This paper is devoted to regret lower bounds in the classical model of the stochastic multi-armed bandit. A well-known result of Lai and Robbins, later extended by Burnetas and Katehakis, established a logarithmic lower bound for all consistent policies. We relax the notion of consistency and exhibit a generalisation of the bound. We also study the existence of logarithmic bounds in general and in the case of Hannan consistency. Moreover, we prove that it is impossible to design an adaptive policy that would select the best of two algorithms by taking advantage of the properties of the environment. To obtain these results, we study variants of popular Upper Confidence Bounds (UCB) policies.
Subjects / Keywords: Consistency; regret lower bounds; selectivity; stochastic bandits; UCB policies