BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Natural gradient in deep neural networks - Alberto Bernacchia (Uni
 versity of Cambridge)
DTSTART:20181121T140000Z
DTEND:20181121T153000Z
UID:TALK115237@talks.cam.ac.uk
CONTACT:Robert Pinsler
DESCRIPTION:We introduce the natural gradient method for stochastic optimi
 zation\, and discuss whether and how this method can be applied to deep n
 eural networks. We motivate the natural gradient by showing that the perf
 ormance of stochastic gradient descent depends heavily on the choice of p
 arameterization\, and that it does not take into account the information g
 eometry of the model. We show that this geometry is described by the Fish
 er information metric\, and that steepest descent in the loss function un
 der this metric is realized by the natural gradient\, which is invariant t
 o reparameterization. We connect the natural gradient with second-order o
 ptimization methods and discuss possible applications to deep neural netw
 orks. In particular\, we present K-FAC\, a method that approximates the i
 nverse Fisher information matrix as block-diagonal over layers\, with eac
 h block Kronecker-factored. This allows connecting a variety of differen
 t methods under a unified framework (e.g. adaptive gradients\, batch norm
 alization\, whitening). We describe applications of K-FAC to both standar
 d and convolutional neural networks\, and compare with state-of-the-art m
 ethods.
LOCATION:Engineering Department\, CBL Room 438
END:VEVENT
END:VCALENDAR
