BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Multi-Label Learning with Millions of Categories & Generalized Mul
 tiple Kernel Learning with a Million Kernels - Manik Varma\, Microsoft Res
 earch\, India
DTSTART:20120719T130000Z
DTEND:20120719T140000Z
UID:TALK38622@talks.cam.ac.uk
CONTACT:Microsoft Research Cambridge Talks Admins
DESCRIPTION:Multi-Label Learning with Millions of Categories\n\nOur object
 ive is to build an algorithm for classifying a data point into a set of la
 bels when the output space contains millions of categories. This is a rela
 tively novel setting in supervised learning and brings forth interesting c
 hallenges such as efficient training and prediction\, learning from only p
 ositively labeled data with missing and incorrect labels\, and handling l
 abel correlations. We propose a random-forest-based solution for jointly tack
 ling these issues. We develop a novel extension of random forests for mult
 i-label classification which can learn from positive data alone and can sc
 ale to large data sets. We generate real valued beliefs indicating the sta
 te of labels and adapt our classifier to train on these belief vectors so 
 as to compensate for missing and noisy labels. In addition\, we modify the
  random forest cost function to avoid overfitting in high dimensional feat
 ure spaces and learn short\, balanced trees. Finally\, we write highly ef
 ficient training routines which let us train on problems with more than a hun
 dred million data points\, sparse feature vectors with over a million dimen
 sions\, and over ten million categories. Extensive experiments reveal that ou
 r proposed solution is not only significantly better than other multi-labe
 l classification algorithms but also more than 10% better than the state-o
 f-the-art NLP-based techniques for suggesting bid phrases for online sea
 rch advertisers.\n\nGeneralized Multiple Kernel Learning with a Million Ke
 rnels\n\nMultiple Kernel Learning (MKL) aims to learn the kernel in an SVM
  from training data. Many MKL formulations have been proposed and some hav
 e proved effective in certain applications. Nevertheless\, as MKL is a nas
 cent field\, many more formulations need to be developed to generalize acr
 oss domains and meet the challenges of real world applications. However\, 
 each MKL formulation typically necessitates the development of a specializ
 ed optimization algorithm. The lack of an efficient\, general purpose opti
 mizer capable of handling a wide range of formulations presents a signific
 ant challenge to those looking to take MKL out of the lab and into the rea
 l world.\n\nThis problem was somewhat alleviated by the development of th
 e Generalized Multiple Kernel Learning (GMKL) formulation which admits fai
 rly general kernel parameterizations and regularizers subject to mild cons
 traints. However\, the projected gradient descent GMKL optimizer is ineff
 icient\, as computing the step size\, a reasonably accurate objective fu
 nction value\, and the gradient direction are all expensive. We overcome th
 ese limitations by developing a Spectral Projected Gradient (SPG) descent 
 optimizer which: a) takes into account second-order information in select
 ing step sizes\; b) employs a non-monotone step size selection criterion r
 equiring fewer function evaluations\; c) is robust to gradient noise\; an
 d d) can take quick steps when far away from the optimum.\n\nWe show that ou
 r proposed SPG-GMKL optimizer can be an order of magnitude faster than pro
 jected gradient descent on even small and medium sized datasets. In some c
 ases\, SPG-GMKL can even outperform state-of-the-art specialized optimizat
 ion algorithms developed for a single MKL formulation. Furthermore\, we de
 monstrate that SPG-GMKL can scale well beyond gradient descent to large pr
 oblems involving a million kernels or half a million data points. Our code
  and implementation are available publicly.
LOCATION:Large public lecture room\, Microsoft Research Ltd\, 7 J J Thomso
 n Avenue (Off Madingley Road)\, Cambridge
END:VEVENT
END:VCALENDAR
