Misleading meta-objectives and hidden incentives for distributional shift
- Speaker: Paolo Bova (University of Cambridge)
- Date & Time: Wednesday 08 May 2019, 17:00-19:00
- Venue: Engineering Department, CBL Seminar room BE4-38
Abstract
This week: “Misleading meta-objectives and hidden incentives for distributional shift.” David Krueger, Tegan Maharaj, Shane Legg and Jan Leike. [Paper] [BibTeX]
The authors aim to show that meta-learning can create hidden incentives for agents to change the task itself rather than solve the task we give them. An example is an agent that predicts when a person wants coffee: after learning that the person has coffee in the morning, it learns to wake them when they try to sleep in, because this seemingly suboptimal policy (waking the human) produces better predictions. The paper's experiments show that meta-learning agents trained with Population-Based Training (PBT) learn non-myopic behaviour even when their reward is myopic. The authors also demonstrate a method for eliminating this non-myopic behaviour in these agents, which they call Environment Swapping.
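The mechanism can be illustrated with a toy sketch (not the authors' code; `ToyEnv`, `pbt_generation`, and the `theta` parameter are hypothetical). The idea is that PBT's selection step rewards agents for cumulative performance in a stateful environment, so an agent can profit from distributional shift it induced earlier; environment swapping reassigns environments between agents each generation, breaking that link.

```python
import random


class ToyEnv:
    """Hypothetical stateful environment: an agent's past behaviour can
    shift the state it is later evaluated on (auto-induced shift)."""

    def __init__(self):
        self.state = 0.0

    def evaluate(self, agent):
        # Reward depends on the agent's parameter and the accumulated
        # state, so shaping the state pays off for whoever comes next.
        reward = agent["theta"] + self.state
        self.state += 0.1 * agent["theta"]  # agent shifts its own env
        return reward


def pbt_generation(agents, envs, swap=False):
    """One toy PBT generation over paired (agent, environment) slots.

    With swap=True ("environment swapping"), environments are shuffled
    across agents first, so no agent is evaluated on the distributional
    shift it induced in its own environment.
    """
    if swap:
        random.shuffle(envs)  # break the agent-environment pairing
    scores = [env.evaluate(a) for a, env in zip(agents, envs)]
    # Exploit step: clone the best agent's parameters over the worst's.
    best = max(range(len(agents)), key=scores.__getitem__)
    worst = min(range(len(agents)), key=scores.__getitem__)
    agents[worst] = dict(agents[best])
    return scores
```

Running several generations without swapping lets agents that shaped their environment's state win selection; with swapping, only the per-episode (myopic) reward drives selection.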
As always, there will be free pizza. The first half hour is for stragglers to finish reading.
Invite your friends to join the mailing list (https://lists.cam.ac.uk/mailman/listinfo/eng-safe-ai), the Facebook group (https://www.facebook.com/groups/1070763633063871) or the talks.cam page (https://talks.cam.ac.uk/show/index/80932). Details about the next meeting, the week's topic and other events will be advertised in these places.
Series: This talk is part of the Engineering Safe AI series.
Included in Lists
- Cambridge talks
- Chris Davis' list
- Engineering Department, CBL Seminar room BE4-38
- Engineering Safe AI
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.