Lower Bounds and Selectivity of Weak-Consistent Policies in Stochastic Multi-Armed Bandit Problem.

El Alaoui, Issam; Audibert, Jean-Yves; Salomon, Antoine (2013), Lower Bounds and Selectivity of Weak-Consistent Policies in Stochastic Multi-Armed Bandit Problem., Journal of Machine Learning Research, 14, 1, p. 187-207

Type: Article accepted for publication or published
Date: 2013-01
Journal name: Journal of Machine Learning Research
Volume: 14
Number: 1
Publisher: MIT Press
Pages: 187-207
Author(s): El Alaoui, Issam; Audibert, Jean-Yves; Salomon, Antoine
Abstract (EN)
This paper is devoted to regret lower bounds in the classical stochastic multi-armed bandit model. A well-known result of Lai and Robbins, later extended by Burnetas and Katehakis, established a logarithmic lower bound for all consistent policies. We relax the notion of consistency and exhibit a generalisation of the bound. We also study the existence of logarithmic bounds in general and in the case of Hannan consistency. Moreover, we prove that it is impossible to design an adaptive policy that would select the best of two algorithms by taking advantage of the properties of the environment. To get these results, we study variants of popular Upper Confidence Bounds (UCB) policies.
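
For readers unfamiliar with the UCB policies mentioned in the abstract, a minimal Python sketch of the standard UCB1 index rule (Auer, Cesa-Bianchi and Fischer, 2002) on Bernoulli arms may help. It is an illustration only and is not one of the variants studied in the paper; the function name `ucb1`, the arm means, the horizon and the seed are all assumptions chosen for the example.

```python
import math
import random


def ucb1(means, horizon, seed=0):
    """Run standard UCB1 on Bernoulli arms (illustrative, not the paper's variants).

    Returns (pull counts per arm, pseudo-regret of this run).
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    k = len(means)
    counts = [0] * k           # number of pulls of each arm
    sums = [0.0] * k           # cumulative reward of each arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1        # pull every arm once to initialise
        else:
            # index = empirical mean + exploration bonus sqrt(2 ln t / n_i)
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    # pseudo-regret: sum over arms of (gap to best mean) * (number of pulls)
    best = max(means)
    regret = sum((best - m) * n for m, n in zip(means, counts))
    return counts, regret


# Hypothetical two-armed instance; arm 1 is the better arm.
counts, regret = ucb1([0.2, 0.8], horizon=10_000)
print(counts, round(regret, 1))
```

The pseudo-regret computed at the end is the quantity whose growth the lower bounds constrain: for consistent policies, the Lai and Robbins result cited in the abstract states that expected regret grows at least logarithmically in the horizon.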
Subjects / Keywords
Consistency; regret lower bounds; selectivity; stochastic bandits; UCB policies
JEL
D81 - Criteria for Decision-Making under Risk and Uncertainty
C73 - Stochastic and Dynamic Games; Evolutionary Games; Repeated Games

Related items

Showing items related by title and author.

  • Robustness of stochastic bandit policies
    Salomon, Antoine; Audibert, Jean-Yves (2014) Article accepted for publication or published
  • Local Time Policies in Europe
    Boulin, Jean-Yves (2017-06) Communication / Conference
  • Local Time Policies in Europe
    Boulin, Jean-Yves (2006) Book chapter
  • Sustainable Management of the Upper Rhine River and Its Alluvial Plain: Lessons from Interdisciplinary Research in France and Germany
    Schmitt, Laurent; Beisel, Jean-Nicolas; Preusser, Franck; De Jong, Carmen; Wantzen, Karl Matthias; Chardon, Valentin; Staentzel, Cybill; Eschbach, David; Damm, Christian; Rixhon, Gilles; Salomon, Ferréol; Glaser, Rüdiger; Himmelsbach, Iso; Meinard, Yves; Dumont, Serge; Hardion, Laurent; Houssier, Jérôme; Rambeau, Claire; Chapkanski, Stoil; Brackhane, Sébastien (2020) Book chapter
  • Explicit lower bounds for the cost of fast controls for some 1-D parabolic or dispersive equations, and a new lower bound concerning the uniform controllability of the 1-D transport–diffusion equation
    Lissy, Pierre (2015) Article accepted for publication or published