BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//talks.cam.ac.uk//v3//EN
BEGIN:VTIMEZONE
TZID:Europe/London
BEGIN:DAYLIGHT
TZOFFSETFROM:+0000
TZOFFSETTO:+0100
TZNAME:BST
DTSTART:19700329T010000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0100
TZOFFSETTO:+0000
TZNAME:GMT
DTSTART:19701025T020000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
CATEGORIES:Applied and Computational Analysis
SUMMARY:How to climb a billion dimensional hill using a co
in and a compass and count the steps before depart
ure - Peter Richtarik (University of Edinburgh)
DTSTART;TZID=Europe/London:20120503T150000
DTEND;TZID=Europe/London:20120503T160000
UID:TALK36496AThttp://talks.cam.ac.uk
URL:http://talks.cam.ac.uk/talk/index/36496
DESCRIPTION:SUBTITLE: Parallel block coordinate descent method
s for huge-scale partially separable problems\n\nW
 ith growing digitization of the world\, it is incre
 asingly easy to collect monstrous amounts of data
. Often\, this data is analyzed using an optimizat
ion algorithm\, and leads to difficult huge-scale
problems with millions or billions of variables.\n
\nExisting optimization algorithms\, which are per
fectly suited for solving problems of medium size\
, such as polynomial-time interior-point methods\,
are often not useful in this new setting due to t
he bad dependence of their complexity on the probl
em dimension. Hence\, there is a pressing need to
devise and analyze new methods\, or adapt classica
l methods\, that would be capable of working in th
e emerging huge-scale setting. An entirely differe
nt approach to the scale problem is via accelerati
on of existing methods on parallel computing archi
tectures such as many-core computers and clusters
 thereof\, or systems based on graphics processin
g units (GPUs).\n\nIn this talk we describe a new
method that combines the two approaches outlined a
 bove. Our method both i) has a good dependence on th
 e problem dimension and ii) is parallel in nature
\, and hence is well-suited for solving certain st
ructured optimization problems of huge dimension.
In particular\, we show that randomized block coor
dinate descent methods\, such as those developed i
n [1\, 2]\, can be accelerated by parallelization
when applied to the problem of minimizing the sum
of a partially block separable smooth convex funct
ion and a simple block separable convex function.\
n\nMany problems of current interest in diverse co
mmunities (statistics\, optimal experimental desig
n\, machine learning\, mechanical engineering)\, c
an be cast in this form\, including least-squares\
 , L1-regularized least-squares (LASSO)\, group an
 d sparse group LASSO\, computing c- and A-optimal de
signs of statistical experiments\, training (spars
e) linear support vector machines (SVM) and truss
topology design (TTD).\n\nWe describe a generic pa
rallel randomized block coordinate descent algorit
hm (PR-BCD) and several variants thereof based on
the way parallelization is performed. In all cases
we prove iteration complexity results\, i.e.\, we
give bounds on the number of iterations sufficien
t to approximately solve the problem with high pro
bability. Our results generalize the intuitive obs
ervation that in the separable case the theoretica
l speedup caused by parallelization should be equa
l to the number of processors. We show that the sp
eedup increases with the number of processors and
with the degree of partial separability of the smo
oth component of the objective function. Our analy
 sis also covers the regime in which the number of bl
 ocks updated at each iteration is random\, which all
 ows for modelling situations with a variable numbe
 r of (busy or unreliable) processors.\n\nWe co
nclude with some encouraging computational results
applied to huge-scale LASSO\, SVM and TTD instanc
es.\n\nAll results are based on joint work with Ma
 rtin Takac (Edinburgh).\n\n[1] Yurii Nesterov. Ef
ficiency of coordinate descent methods on huge-sca
le optimization problems. CORE Discussion Paper #2
010/2\, February 2010.\n\n[2] Peter Richtarik and
Martin Takac. Iteration complexity of randomized b
lock-coordinate descent methods for minimizing a c
omposite function. After 1st revision in Mathemati
cal Programming A\, 2011.
LOCATION:MR14\, CMS
CONTACT:Carola-Bibiane Schoenlieb
END:VEVENT
END:VCALENDAR