Stochastic optimization and adaptive learning rates

If you have a question about this talk, please contact Yingzhen Li.

Stochastic optimization is prevalent in modern machine learning, and the main purpose of this talk is to understand why it works. We will first briefly recap the history of stochastic approximation methods, starting from the famous Robbins and Monro paper. Then we will introduce the cost-function minimization problem in the machine learning context and show how to prove the convergence of stochastic gradient descent to a local optimum. The proof proceeds in three steps: continuous gradient descent, discrete gradient descent, and stochastic gradient descent. However, the conditions on the learning rates used in the proof are sufficient but not necessary. So in the second part of the talk we will discuss popular adaptive learning-rate methods, and in particular we will give a short tutorial on online learning to build intuition for regret bounds. Finally, we will run a live demo comparing different learning-rate schemes.
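
As a rough illustration of the ideas above (not the speaker's own demo), here is a minimal sketch comparing a decaying step-size schedule satisfying the classical Robbins-Monro conditions (step sizes that sum to infinity while their squares sum to a finite value, e.g. a/(t+1)) with an AdaGrad-style adaptive learning rate, on a toy noisy quadratic objective. The objective, the function names, and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(w):
    """Stochastic gradient of f(w) = 0.5 * ||w||^2 with additive noise (toy example)."""
    return w + 0.1 * rng.standard_normal(w.shape)

def sgd_robbins_monro(w0, steps=2000, a=1.0):
    # Step sizes a/(t+1): they diverge in sum, but their squares have a finite sum.
    w = w0.copy()
    for t in range(steps):
        w -= (a / (t + 1)) * noisy_grad(w)
    return w

def sgd_adagrad(w0, steps=2000, eta=0.5, eps=1e-8):
    # AdaGrad: scale a fixed step size by the root of the accumulated squared gradients,
    # so each coordinate adapts its own effective learning rate.
    w = w0.copy()
    g2 = np.zeros_like(w)
    for _ in range(steps):
        g = noisy_grad(w)
        g2 += g ** 2
        w -= eta * g / (np.sqrt(g2) + eps)
    return w

w0 = np.array([5.0, -3.0])
print("Robbins-Monro schedule:", sgd_robbins_monro(w0))
print("AdaGrad:               ", sgd_adagrad(w0))
```

Both runs should end close to the minimizer at the origin; the point of the comparison is that the adaptive method needs no hand-tuned decay schedule.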

This talk is part of the Machine Learning Reading Group @ CUED series.
