BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Isaac Newton Institute Seminar Series
SUMMARY:Statistical theory for deep neural networks with R
 eLU activation function - Johannes Schmidt-Hieber
  (Universiteit Leiden)
DTSTART;TZID=Europe/London:20180321T113000
DTEND;TZID=Europe/London:20180321T123000
UID:TALK102739AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/102739
DESCRIPTION:The universal approximation theorem states that ne
 ural networks are capable of approximating any con
 tinuous function up to a small error that depends 
 on the size of the network. The expressive power o
 f a network does\, however\, not guarantee that de
 ep networks perform well on data. For that\, contr
 ol of the statistical estimation risk is needed. I
 n the talk\, we derive statistical theory for fitt
 ing deep neural networks to data generated from th
 e multivariate nonparametric regression model. It 
 is shown that estimators based on sparsely connect
 ed deep neural networks with ReLU activation funct
 ion and properly chosen network architecture achie
 ve the minimax rates of convergence (up to logarit
 hmic factors) under a general composition assumpti
 on on the regression function. The framework inclu
 des many well-studied structural constraints such 
 as (generalized) additive models. While there is a
  lot of flexibility in the network architecture\, 
 the tuning parameter is the sparsity of the netwo
 rk. Specifically\, we consider large networks with
  the number of potential parameters being much bigger
  than the sample size. Interestingly\, the depth (n
 umber of layers) of the neural network architectur
 es plays an important role and our theory suggests
  that scaling the network depth with the logarithm
  of the sample size is natural.\n\nRelated Links:\n
 https://arxiv.org/abs/1708.06633 - Article
LOCATION:Seminar Room 1\, Newton Institute
CONTACT:INI IT
END:VEVENT
END:VCALENDAR