Depth-Adaptive Neural Networks from the Optimal Control viewpoint

File
AM2020.pdf (1.44 MB)
Date
2020
Collection title
Cahier de recherche CEREMADE
Link to item file
https://hal.archives-ouvertes.fr/hal-02897466
Dewey
Analysis
Subject
Neural Networks; Deep Learning; Continuous-Depth Neural Networks; Optimal Control
URI
https://basepub.dauphine.fr/handle/123456789/21136
Collections
  • CEREMADE : Publications
Author
Aghili, Joubine
Centre de Recherches en Mathématiques de la Décision [CEREMADE]
Mula, Olga
Laboratoire Jacques-Louis Lions [LJLL]
Centre de Recherches en Mathématiques de la Décision [CEREMADE]
Type
Working paper
Number of pages
40
Abstract (EN)
In recent years, deep learning has been connected with optimal control as a way to define a notion of a continuous underlying learning problem. In this view, neural networks can be interpreted as a discretization of a parametric ordinary differential equation (ODE) which, in the limit, defines a continuous-depth neural network. The learning task then consists in finding the best ODE parameters for the problem under consideration, and their number increases with the accuracy of the time discretization. Although important steps have been taken to realize the advantages of such continuous formulations, most current learning techniques fix a discretization (i.e., the number of layers is fixed). In this work, we propose an iterative adaptive algorithm in which the time discretization is progressively refined (i.e., the number of layers is increased). Provided that certain tolerances are met across the iterations, we prove that the strategy converges to the underlying continuous problem. One salient advantage of such a shallow-to-deep approach is that it helps to realize, in practice, the superior approximation properties of deep networks while mitigating over-parametrization issues. The performance of the approach is illustrated in several numerical examples.
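
The abstract describes two ingredients: a neural network read as a forward-Euler discretization of a parametric ODE, and a shallow-to-deep loop that progressively refines that discretization. The sketch below is illustrative only, not the authors' code: it assumes a tanh residual block x_{k+1} = x_k + (T/L) tanh(W_k x_k + b_k) and a piecewise-constant prolongation of the parameters in time when the depth is doubled; the names ResNetEuler and refine are hypothetical.

```python
import numpy as np

class ResNetEuler:
    """Residual network read as a forward-Euler discretization of the
    parametric ODE  x'(t) = tanh(W(t) x(t) + b(t))  on [0, T]."""

    def __init__(self, dim, n_layers, T=1.0, rng=None):
        rng = rng or np.random.default_rng(0)
        self.T, self.L = T, n_layers
        # One (W_k, b_k) per time step t_k = k*T/L, i.e. per layer.
        self.W = 0.1 * rng.standard_normal((n_layers, dim, dim))
        self.b = np.zeros((n_layers, dim))

    def forward(self, x):
        h = self.T / self.L               # time step shrinks as depth grows
        for k in range(self.L):           # x_{k+1} = x_k + h * f(x_k, theta_k)
            x = x + h * np.tanh(x @ self.W[k].T + self.b[k])
        return x

    def refine(self):
        """Shallow-to-deep step: double the depth and initialize the new
        layers by piecewise-constant prolongation of theta(t) in time, so
        the refined network reproduces the coarse trajectory up to the
        discretization error and training can resume from it."""
        self.W = np.repeat(self.W, 2, axis=0)
        self.b = np.repeat(self.b, 2, axis=0)
        self.L *= 2

# Train at a coarse depth until a tolerance is met, then refine and resume.
net = ResNetEuler(dim=2, n_layers=2)
x = np.random.default_rng(1).standard_normal((5, 2))
y_coarse = net.forward(x)
net.refine()                              # 2 -> 4 layers, same ODE parameters
y_fine = net.forward(x)
print(np.max(np.abs(y_fine - y_coarse)))  # small gap: consistent trajectories
```

In the adaptive strategy the abstract outlines, such a refinement would be triggered only once the optimization at the current depth is solved to a prescribed tolerance, which is what allows the iterates to track the underlying continuous-depth problem.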
