University of Cambridge > Talks.cam > Engineering Safe AI > AI Safety via Debate

AI Safety via Debate

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact AdriĆ  Garriga Alonso.

The seminar will be on OpenAI’s new AI safety model, https://blog.openai.com/debate/ .

How can we augment humans so that they can effectively supervise advanced AI systems? One way is to take advantage of the AI itself to help with the supervision, asking the AI (or a separate AI) to point out flaws in any proposed action. To achieve this, we reframe the learning problem as a game played between two agents, where the agents have an argument with each other and the human judges the exchange. Even if the agents have a more advanced understanding of the problem than the human, the human may be able to judge which agent has the better argument (similar to expert witnesses arguing to convince a jury).

Bring a laptop if you can, so we can play the cat/dog game! https://debate-game.openai.com/

This talk is part of the Engineering Safe AI series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2024 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity