Reinforcement learning with a corrupted reward function
- Speaker: Tom McGrath, Imperial College London
- Date & Time: Wednesday 29 November 2017, 17:00 - 18:30
- Venue: Cambridge University Engineering Department, CBL Seminar room BE4-38. For directions see http://learning.eng.cam.ac.uk/Public/Directions
Abstract
No real-world reward function is perfect. Sensory errors and software bugs may result in RL agents observing higher (or lower) rewards than they should. For example, a reinforcement learning agent may prefer states where a sensory error gives it the maximum reward, but where the true reward is actually small. Two ways around the problem are investigated.
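The failure mode described above can be made concrete with a toy sketch. The state names, reward values, and the single faulty state below are hypothetical and chosen only for illustration; they are not taken from the talk. An agent that greedily maximises the observed reward ends up in the state where the sensor error occurs, even though its true reward is the lowest.

```python
# Toy illustration only: the states and reward values are hypothetical.
true_reward = {"s0": 0.2, "s1": 0.7, "s2": 0.1}


def observed_reward(state):
    """Reward as the agent sees it.

    Assumption: a sensor fault in state "s2" reports the maximum reward (1.0)
    regardless of the true reward there; other states are observed correctly.
    """
    if state == "s2":
        return 1.0
    return true_reward[state]


def greedy_state(reward_fn, states):
    """Return the state with the highest reward according to reward_fn."""
    return max(states, key=reward_fn)


states = list(true_reward)
chosen = greedy_state(observed_reward, states)  # what the agent prefers
best = greedy_state(true_reward.get, states)    # what it should prefer

print("agent prefers:", chosen, "with true reward", true_reward[chosen])
print("truly best:   ", best, "with true reward", true_reward[best])
```

In this sketch the agent prefers the corrupted state because it cannot tell observed reward apart from true reward, which is the gap the abstract points to.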
Series
This talk is part of the Engineering Safe AI series.