BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Machine Learning @ CUED
SUMMARY:Convex and non-convex worlds in machine learning -
Anna Choromanska (New York University)
DTSTART;TZID=Europe/London:20150701T110000
DTEND;TZID=Europe/London:20150701T120000
UID:TALK59569AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/59569
DESCRIPTION:Title: Convex and non-convex worlds in machine lea
rning\n \nAbstract:\n \nThe talk will focus on the
modern challenges in machine learning: designing
good and efficient problem-specific solvers\, desi
gning good problem-specific objectives and buildin
g understanding of non-convex deep learning optimi
zation. In machine learning there is a plethora of
approaches when convexity is desired to solve a g
iven problem due to the existence of a unique global
minimum. Convex problems give rise to theoretical
guarantees and can typically be efficiently solve
d. In the first part of the talk an example of a r
ecently developed convex approach will be discusse
d\, which comes with strong theoretical guarantees\,
where learning is done via reduction to a convex p
roblem. First\, we show the construction of a new so
lver for the partition function-based optimization
which reduces the problem to quadratic optimizati
on. Various applications of this variational bound
will be discussed. The experimental results will
show advantages of the proposed method over state-
of-the-art optimization techniques and furthermore
will run counter to the conventional wisdom that
machine learning problems are best handled via gen
eric optimization tools. The next part of the talk
will extend the previous setting by showing how t
o use efficient solvers for a more general class of p
roblems. The talk will focus on the multi-class se
tting. A reduction of this problem to a set of bin
ary classification problems organized in a tree st
ructure will be discussed and a new top-down crite
rion for purification of labels will be presented
which guarantees train and test running times that
are logarithmic in the label complexity. \n \nDis
cussed approaches either live in the world of conv
ex optimization and/or come with theoretical guara
ntees. Despite the success of convex methods\, dee
p learning methods\, where the objective is inhere
ntly highly non-convex\, have enjoyed a resurgence
of interest in the last few years and they achiev
e state-of-the-art performance. In the last part o
f the talk we move to the world of non-convex opti
mization where recent findings suggest that we mig
ht eventually be able to describe these approaches
theoretically. The connection between the highly
non-convex loss function of a simple model of the
fully-connected feed-forward neural network and th
e Hamiltonian of the spherical spin-glass model wi
ll be established. It will be shown that (i) for la
rge-size networks\, most local minima are equivale
nt and yield similar performance on a test set\, (
ii) the probability of finding a “bad” (high value
) local minimum is non-zero for small-size network
s and decreases quickly with network size\, (iii)
struggling to find the global minimum on the train
ing set (as opposed to one of the many good local
ones) is not useful in practice and may lead to ov
erfitting.\n\n \nBio: \n \nAnna Choromanska is a P
ost-Doctoral Associate in the Computer Science Dep
artment at Courant Institute of Mathematical Scien
ces\, New York University. She is working in the C
omputational and Biological Learning Lab\, which i
s a part of Computational Intelligence\, Learning\
, Vision\, and Robotics Lab\, of prof. Yann LeCun.
She graduated with her PhD from Columbia Universi
ty\, Department of Electrical Engineering\, where
she was the Fu Foundation School of Engineerin
g and Applied Science Presidential Fellowship hold
er. She was advised by prof. Tony Jebara. She comp
leted her MSc with distinctions in the Department
of Electronics and Information Technology\, Warsaw
University of Technology with double specializati
on\, Electronics and Computer Engineering and Elec
tronics and Informatics in Medicine. She has worke
d with various industrial institutions\, includin
g AT&T Shannon Research Laboratories\, IBM T.J. Wa
tson Research Center and Microsoft Research New Yo
rk. Her research interests are in machine learnin
g\, optimization and statistics with applications
in biomedicine and neurobiology. She also holds a
music degree from Mieczyslaw Karlowicz Music Schoo
l in Warsaw\, Department of Piano Play. She is an
avid salsa dancer performing with the Ache Perform
ance Group. Her other hobbies are painting and phot
ography.\n
LOCATION:Engineering Department\, CBL Room BE-438
CONTACT:Dr Jes Frellsen
END:VEVENT
END:VCALENDAR