BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Computational Neuroscience
SUMMARY:The K-FAC method for neural network optimization -
James Martens\, Google DeepMind
DTSTART;TZID=Europe/London:20190314T140000
DTEND;TZID=Europe/London:20190314T150000
UID:TALK121438AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/121438
DESCRIPTION:Second-order optimization methods have the potenti
al to be much faster than first-order methods in t
he deterministic case\, or pre-asymptotically in t
he stochastic case. However\, traditional second-o
rder methods have proven ineffective or impractica
l for neural network training\, due in part to the e
xtremely high dimension of the parameter space. Kr
onecker-factored Approximate Curvature (K-FAC) is a seco
nd-order optimization method based on a tractable app
roximation to the Gauss-Newton/Fisher matrix that exp
loits the special structure of neural network traini
ng objectives. This approximation is neither low-ran
k nor diagonal\, but instead involves Kronecker produ
cts\, which allow for efficient estimation\, storage\, a
nd inversion of the curvature matrix. In this talk I wi
ll introduce the basic K-FAC method for standard MLP
s and then present some more recent work in this dire
ction\, including extensions to CNNs and RNNs\, both o
f which require new approximations to the Fisher. Fo
r these I will provide theoretically motivated argume
nts\, as well as empirical results which speak to thei
r efficacy in neural network optimization.
LOCATION:Cambridge University Engineering Department\, CBL\
, BE4-38 (http://learning.eng.cam.ac.uk/Public/Dir
ections)
CONTACT:Alberto Bernacchia
END:VEVENT
END:VCALENDAR