dc.contributor.author Wintenberger, Olivier
dc.contributor.author Alquier, Pierre
dc.date.accessioned 2012-02-28T13:35:49Z
dc.date.available 2012-02-28T13:35:49Z
dc.date.issued 2012
dc.identifier.uri https://basepub.dauphine.fr/handle/123456789/8309
dc.language.iso en en
dc.subject PAC-Bayesian bounds en
dc.subject Mixing processes en
dc.subject Fast rates Sparsity en
dc.subject Oracle inequalities en
dc.subject Time series prediction en
dc.subject Statistical learning theory en
dc.subject.ddc 519 en
dc.title Fast rates in learning with dependent observations en
dc.type Document de travail / Working paper
dc.contributor.editoruniversityother Centre de Recherche en Économie et Statistique (CREST) http://www.crest.fr/ INSEE – École Nationale de la Statistique et de l'Administration Économique;France
dc.contributor.editoruniversityother Laboratoire de Probabilités et Modèles Aléatoires (LPMA) http://www.proba.jussieu.fr/ CNRS : UMR7599 – Université Paris VI - Pierre et Marie Curie – Université Paris VII - Paris Diderot;France
dc.description.abstracten In this paper we tackle the problem of fast rates in time series forecasting from a statistical learning perspective. A series of papers (e.g. Meir 2000, Modha and Masry 1998, Alquier and Wintenberger 2012) shows that the main tools used in learning theory with iid observations can be extended to the prediction of time series. The main message of these papers is that, given a family of predictors, we are able to build a new predictor that predicts the series as well as the best predictor in the family, up to a remainder of order $1/\sqrt{n}$. It is known that this rate cannot be improved in general. In this paper, we show that in the particular case of the least squares loss, and under a strong assumption on the time series (phi-mixing), the remainder is actually of order $1/n$. Thus, the optimal rate for iid variables, see e.g. Tsybakov 2003, and for individual sequences, see \cite{lugosi}, is achieved for the first time for uniformly mixing processes. We also show that our method is optimal for aggregating sparse linear combinations of predictors. en
dc.publisher.name Université Paris-Dauphine en
dc.publisher.city Paris en
dc.identifier.citationpages 15 en
dc.identifier.urlsite http://hal.archives-ouvertes.fr/hal-00671979 en
dc.description.sponsorshipprivate yes en
dc.subject.ddclabel Probabilities and applied mathematics en
