
Learning and Representing: a Jointly Optimal Approach


If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

This event may be recorded and made available internally or externally via http://research.microsoft.com. Microsoft will own the copyright of any recordings made. If you do not wish to have your image or voice recorded, please consider this before attending.

Data representations, and transformations of data representations, are fundamental to the design of effective machine learning methods. Previous research has established that expressing complex data objects such as documents or images as feature vectors reveals important structure, both in collections of data and in individual data items. For any particular application, however, one often does not know which features to use. Automatically discovering useful features as part of training has therefore been a long-standing goal of machine learning research. Unfortunately, the resulting training problem—simultaneously learning a feature representation and a data reconstruction model—has been deemed intractable in general.

In this talk I will demonstrate a fundamental reformulation of representation learning that enables the training problem to be solved both globally and efficiently, even when the feature representation and the data reconstruction model are learned simultaneously. This is a major advance over the current state of the art, where globally optimal representations cannot be guaranteed in general. I will show that this work has led to significant improvements in both generalization accuracy and training time over state-of-the-art methods.

In addition, we have taken a step towards scaling the method up to large datasets. Arguably, optimization underlies almost all branches of machine learning, and a major difficulty is the nonsmoothness of many objectives. I will show that a novel framework based on smoothing, pioneered by Nesterov, provably improves convergence rates. In particular, I will show applications of our idea to optimizing multivariate performance measures and to structured prediction. Empirical evaluation on some of the largest publicly available datasets from a variety of domains shows that our method learns the optimal model significantly faster than state-of-the-art solvers. Broader applications of the smoothing technique include graphical model inference and compressive sensing.
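To give a flavour of the smoothing idea (this is a generic illustration of Nesterov-style smoothing, not the speaker's specific method): a nonsmooth objective such as the hinge loss max(0, z) can be replaced by the differentiable surrogate mu * log(exp(0/mu) + exp(z/mu)), which upper-bounds the original and deviates from it by at most mu * log(2). The function names and the choice of mu below are illustrative assumptions.

```python
import numpy as np

def hinge(scores, y):
    # Nonsmooth hinge loss max(0, 1 - y * score).
    return np.maximum(0.0, 1.0 - y * scores)

def smoothed_hinge(scores, y, mu=0.1):
    # Softmax (log-sum-exp) smoothing of max(0, z) with parameter mu:
    #   mu * log(exp(0 / mu) + exp(z / mu))
    # computed stably by factoring out m = max(0, z).
    # The surrogate satisfies hinge <= smoothed <= hinge + mu * log(2),
    # so smaller mu gives a tighter but less smooth approximation.
    z = 1.0 - y * scores
    m = np.maximum(0.0, z)
    return mu * (m / mu + np.log(np.exp(-m / mu) + np.exp((z - m) / mu)))
```

Because the surrogate is differentiable everywhere, accelerated gradient methods apply to it directly, which is the source of the improved convergence rates for nonsmooth problems.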

This talk is part of the Microsoft Research Cambridge, public talks series.


© 2006-2019 Talks.cam, University of Cambridge.