Misleading meta-objectives and hidden incentives for distributional shift
- Speaker: Paolo Bova (University of Cambridge)
- Date & Time: Wednesday 08 May 2019, 17:00-19:00
- Venue: Engineering Department, CBL Seminar room BE4-38
Abstract
This week: “Misleading meta-objectives and hidden incentives for distributional shift.” David Krueger, Tegan Maharaj, Shane Legg and Jan Leike. [Paper] [BibTeX]
The authors aim to show that meta-learning can create hidden incentives for agents to change the task itself rather than solve the task we give them. An example is an agent that predicts when a person wants coffee: after learning that the person has coffee in the morning, it learns to wake them when they try to sleep in, because this seemingly suboptimal policy (waking the human) produces better predictions. The paper's experiments show that meta-learning agents trained with Population-Based Training (PBT) learn non-myopic behaviour even when their reward is myopic. The authors also demonstrate a method for eliminating this non-myopic behaviour in these agents, which they call Environment Swapping.
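The mechanism can be illustrated with a toy sketch (not the authors' code; `ToyEnv`, `pbt_generation`, and the `theta` parameter are hypothetical). The idea is that PBT's selection step rewards agents for cumulative performance in a stateful environment, so an agent can profit from distributional shift it induced earlier; environment swapping reassigns environments between agents each generation, breaking that link.

```python
import random


class ToyEnv:
    """Hypothetical stateful environment: an agent's past behaviour can
    shift the state it is later evaluated on (auto-induced shift)."""

    def __init__(self):
        self.state = 0.0

    def evaluate(self, agent):
        # Reward depends on the agent's parameter and the accumulated
        # state, so shaping the state pays off for whoever comes next.
        reward = agent["theta"] + self.state
        self.state += 0.1 * agent["theta"]  # agent shifts its own env
        return reward


def pbt_generation(agents, envs, swap=False):
    """One toy PBT generation over paired (agent, environment) slots.

    With swap=True ("environment swapping"), environments are shuffled
    across agents first, so no agent is evaluated on the distributional
    shift it induced in its own environment.
    """
    if swap:
        random.shuffle(envs)  # break the agent-environment pairing
    scores = [env.evaluate(a) for a, env in zip(agents, envs)]
    # Exploit step: clone the best agent's parameters over the worst's.
    best = max(range(len(agents)), key=scores.__getitem__)
    worst = min(range(len(agents)), key=scores.__getitem__)
    agents[worst] = dict(agents[best])
    return scores
```

Running several generations without swapping lets agents that shaped their environment's state win selection; with swapping, only the per-episode (myopic) reward drives selection.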
As always, there will be free pizza. The first half hour is for stragglers to finish reading.
Invite your friends to join the mailing list (https://lists.cam.ac.uk/mailman/listinfo/eng-safe-ai), the Facebook group (https://www.facebook.com/groups/1070763633063871) or the talks.cam page (https://talks.cam.ac.uk/show/index/80932). Details about the next meeting, the week's topic and other events will be advertised in these places.
Series: This talk is part of the Engineering Safe AI series.
Included in Lists
- Cambridge talks
- Chris Davis' list
- Engineering Department, CBL Seminar room BE4-38
- Engineering Safe AI
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.