
Scaling Deep Learning


If you have a question about this talk, please contact Microsoft Research Cambridge Talks Admins.

This event may be recorded and made available internally or externally. Microsoft will own the copyright of any recordings made. If you do not wish to have your image or voice recorded, please consider this before attending.

Deep learning has seen an explosion of interest in the past few years due to some high-profile success stories. On a single machine, GPUs are ideally suited to the computations required to train neural networks and have become the standard tool for this purpose. However, scaling neural network training beyond a single machine remains difficult, as the cost of synchronizing weight updates quickly dominates the cost of computation.
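To make the synchronization cost concrete, here is a minimal sketch (not from the talk) of data-parallel training in which every worker must exchange a full gradient each step. The model size and worker count are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: data-parallel SGD where each step synchronizes the full
# gradient across workers. Model size and worker count are illustrative.
n_params = 10_000_000       # weights in the model (assumption)
n_workers = 8
bytes_per_float = 4

# Each worker computes a local gradient, then all gradients are averaged
# (an all-reduce). Here we simulate that averaging on one machine with a
# small 100-element gradient standing in for the full parameter vector.
rng = np.random.default_rng(0)
local_grads = [rng.standard_normal(100) for _ in range(n_workers)]
avg_grad = np.mean(local_grads, axis=0)   # the synchronized update

# Per step, each worker must exchange on the order of the full model:
sync_bytes = n_params * bytes_per_float
print(f"~{sync_bytes / 1e6:.0f} MB of gradient traffic per worker per step")
```

For a 10-million-parameter model in 32-bit floats, that is roughly 40 MB of traffic per worker on every step, which is why communication quickly dominates once computation is spread over many machines.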

I will present two different approaches to dealing with this communication overhead. The first is based on the observation that the filters in a neural network tend to be smooth, and that the standard parameterization is wasteful in this case. I will show how we can take advantage of this structure to re-parameterize a neural network and reduce the communication overhead when the model is distributed over many machines. The second approach is a re-imagining of the standard neural network training algorithm based on recent advances in distributed parameter estimation in MRFs.
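The first idea can be illustrated with a toy sketch (my own, not the speaker's construction): if a k×k filter is smooth, it can be represented by a much smaller c×c grid of coefficients and reconstructed by interpolation, so only the coarse coefficients need to be stored and communicated. All sizes and the choice of bilinear interpolation are illustrative assumptions.

```python
import numpy as np

def upsample(coarse, k):
    """Bilinearly interpolate a c x c coefficient grid up to k x k."""
    c = coarse.shape[0]
    xs = np.linspace(0, c - 1, k)
    # interpolate along rows, then along columns
    rows = np.array([np.interp(xs, np.arange(c), coarse[i]) for i in range(c)])
    full = np.array([np.interp(xs, np.arange(c), rows[:, j]) for j in range(k)]).T
    return full

k, c = 11, 4                         # full filter size vs. coarse grid size
rng = np.random.default_rng(0)
coeffs = rng.standard_normal((c, c)) # parameters actually stored/communicated
filt = upsample(coeffs, k)           # smooth k x k filter used in the model

print(f"communicate {c * c} floats instead of {k * k} per filter")
```

Here each filter costs 16 floats to synchronize instead of 121; the saving compounds across every filter in the network, directly attacking the synchronization cost described above.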

This talk is part of the Microsoft Research Cambridge, public talks series.




© 2006-2022, University of Cambridge.