

## Q-learning and Pontryagin's Minimum Principle

- Professor Sean Meyn (Director, Decision & Control Lab, CSL ECE UIUC)
- Wednesday 13 January 2010, 14:00-15:00
- Cambridge University Engineering Department, Lecture Theatre 6.
If you have a question about this talk, please contact Dr Ioannis Lestas.
Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It has proven to be effective for models with finite state and action space. This paper establishes connections between Q-learning and nonlinear control of continuous-time models with general state space and general action space. The main contributions are summarized as follows.
- The starting point is the observation that the “Q-function” appearing in Q-learning algorithms is an extension of the Hamiltonian that appears in the Minimum Principle. Based on this observation we introduce the steepest descent Q-learning (SDQ-learning) algorithm to obtain the optimal approximation of the Hamiltonian within a prescribed finite-dimensional function class.
- A transformation of the optimality equations is performed based on the adjoint of a resolvent operator. This is used to construct a consistent algorithm based on stochastic approximation that requires only causal filtering of the time-series data.
- Several examples are presented to illustrate the application of these techniques, including application to distributed control of multi-agent systems.
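
For readers new to the finite-state setting mentioned at the start of the abstract, the following is a minimal sketch of standard tabular Q-learning with a cost-minimizing update. It is not the SDQ-learning algorithm from the talk. The five-state chain, the unit cost, the discount factor `gamma`, the step size `alpha`, and the function names are all illustrative assumptions. The data are generated by a uniformly random (non-optimal) policy, and the greedy policy is read off the learned table afterwards.

```python
import random

# Hypothetical toy model (not from the talk): a chain of states 0..4,
# where state 0 is the goal and each step elsewhere costs 1.
N_STATES, LEFT, RIGHT = 5, 0, 1

def step(x, u):
    """Deterministic move on the chain, clipped at both ends."""
    return max(x - 1, 0) if u == LEFT else min(x + 1, N_STATES - 1)

def cost(x, u):
    """Unit cost per step everywhere except at the goal state 0."""
    return 0.0 if x == 0 else 1.0

def q_learning(n_steps=20000, gamma=0.9, alpha=0.5, seed=0):
    """Tabular Q-learning from data generated by a uniformly random policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    x = N_STATES - 1
    for _ in range(n_steps):
        u = rng.choice((LEFT, RIGHT))          # non-optimal behaviour policy
        x_next = step(x, u)
        # Cost-minimizing target: one-step cost plus discounted best future value.
        target = cost(x, u) + gamma * min(q[x_next])
        q[x][u] += alpha * (target - q[x][u])
        x = x_next
    return q

q = q_learning()
# Greedy (cost-minimizing) action in each state, read off the learned table.
policy = [min((LEFT, RIGHT), key=lambda u: q[x][u]) for x in range(N_STATES)]
print(policy)
```

In this toy model the learned greedy policy moves left toward the goal from every state, even though the data came from a policy that moves at random.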
This talk is part of the CUED Control Group Seminars series.

## This talk is included in these lists:
- All Talks (aka the CURE list)
- CUED Control Group Seminars
- Cambridge Big Data
- Cambridge University Engineering Department Talks
- Cambridge University Engineering Department, Lecture Theatre 6
- Centre for Smart Infrastructure & Construction
- Featured lists
- Information Engineering Division seminar list
- School of Technology
- Signal Processing and Communications Lab Seminars
- ndk22's list
- rp587
Note that ex-directory lists are not shown.