BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Candidates vs. Noises Estimation for Large Multi-Class Classificat
 ion Problem - Tong Zhang (Rutgers\, The State University of New Jersey)
DTSTART:20180628T080000Z
DTEND:20180628T084500Z
UID:TALK107479@talks.cam.ac.uk
CONTACT:INI IT
DESCRIPTION:In practice\, there has been significant interest in multi-clas
 s classification problems where the number of classes is large. Computatio
 nally such applications require statistical methods with run time sublinea
 r in the number of classes. A number of methods such as Noise-Contrastive 
 Estimation (NCE) and variations have been proposed in recent years to addr
 ess this problem. However\, the existing methods are not statistically eff
 icient compared to multi-class logistic regression\, which is the maximum 
 likelihood estimate. In this talk\, I will describe a new method called Ca
 ndidates vs. Noises Estimation (CANE) that selects a small subset of candi
 date classes and samples the remaining classes. We show that CANE is alway
 s consistent and computationally efficient. Moreover\, the resulting estim
 ator has low statistical variance approaching that of the maximum likeliho
 od estimator\, when the observed label belongs to the selected candidates 
 with high probability. Extensive experimental results show that CANE achie
 ves better prediction accuracy than a number of state-of-the-art tree
  classifiers\, while gaining a significant speedup over standard mul
 ti-class logistic regression.
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR
