Scaling Deep Learning
- Speaker: Misha Denil, University of Oxford
- Date & Time: Friday 21 March 2014, 10:00 - 11:00
- Venue: Small Lecture Theatre, Microsoft Research Ltd, 21 Station Road, Cambridge, CB1 2FB
Abstract
Deep learning has seen an explosion of interest in the past few years due to some high-profile success stories. On a single machine, GPUs are ideally suited to the computations required to train neural networks and have become the standard tool for this task. However, scaling neural networks beyond a single machine remains difficult, as the cost of synchronizing weight updates quickly dominates the cost of computation.
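As a rough illustration of why synchronization can dominate (all numbers below are assumed for the sketch, not taken from the talk), compare the time a worker spends computing a training step against the time it spends shipping gradients for a single fully connected layer:

```python
# Back-of-envelope sketch with assumed hardware numbers (not from the talk):
# compare per-step compute time against gradient-synchronization time
# for one large fully connected layer.
layer_params = 1024 * 1024             # weights in a 1024x1024 layer
bytes_per_sync = 4 * layer_params      # float32 gradient sent each step

gpu_flops = 4e12                       # assumed GPU throughput (FLOP/s)
link_bandwidth = 1.25e9                # assumed 10 Gb/s link (bytes/s)

batch = 128
step_flops = 6 * batch * layer_params  # ~forward matmul plus backward passes

compute_ms = step_flops / gpu_flops * 1e3
network_ms = bytes_per_sync / link_bandwidth * 1e3
print(f"compute: {compute_ms:.2f} ms, sync: {network_ms:.2f} ms")
# Here sync takes over 15x longer than compute, so the network link,
# not the GPU, sets the training pace once weights cross machines.
```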
I will present two different approaches to dealing with this communication overhead. The first approach is based on the observation that filters in a neural network tend to be smooth, and that the standard parameterization is wasteful in this case. I will show how we can take advantage of this structure to re-parameterize a neural network so as to reduce the communication overhead when the model is distributed over many machines. The second approach is a re-imagining of the standard neural network training algorithm based on recent advances in distributed parameter estimation in Markov random fields (MRFs).
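A minimal sketch of the first idea, assuming a simple low-rank factorization as the re-parameterization (the talk's actual construction may differ): if a layer's filters are smooth, its n x m weight matrix is well approximated by a product of two thin factors, so machines can exchange k(n + m) values instead of nm:

```python
import numpy as np

# Hypothetical low-rank re-parameterization of one layer, W ~= U @ V.
# Smooth filters are well approximated at small rank k, so workers can
# synchronize the thin factors instead of the full weight matrix.
n, m, k = 1024, 1024, 64               # layer shape; assumed rank k << n, m

rng = np.random.default_rng(0)
U = rng.standard_normal((n, k))        # thin factor held on each worker
V = rng.standard_normal((k, m))        # thin factor held on each worker

dense = n * m                          # values a naive sync would send
factored = k * (n + m)                 # values the factored form sends
print(f"dense: {dense:,}  factored: {factored:,}  "
      f"({dense / factored:.0f}x less traffic)")

W = U @ V                              # full weights reconstructed locally
```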
Series
This talk is part of the Microsoft Research Cambridge, public talks series.
Included in Lists
- All Talks (aka the CURE list)
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Chris Davis' list
- Guy Emerson's list
- Interested Talks
- Microsoft Research Cambridge, public talks
- ndk22's list
- ob366-ai4er
- Optics for the Cloud
- personal list
- PMRFPS's
- rp587
- School of Technology
- Small Lecture Theatre, Microsoft Research Ltd, 21 Station Road, Cambridge, CB1 2FB
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.