Incomplete Contracting and AI Alignment

If you have a question about this talk, please contact Adrià Garriga Alonso.

This week, Paolo Bova will lead the discussion on “Incomplete Contracting and AI Alignment”. Link: https://arxiv.org/pdf/1804.04268.pdf

The authors argue that designing the reward function for a learning agent can be framed as writing a contract. However, sources of incompleteness lead to misalignment between the agent’s reward function and the values of the human designer. Incomplete Contract Theory helps economists reason about contracts in the face of bounded rationality, planned renegotiation, and strategic behaviour, all of which have analogues in the AI case. Incomplete Contract Theory therefore offers insight into tackling the incompleteness of AI reward “contracts”.

The paper even argues that Incomplete Contract Theory is relevant to superintelligent AI. Finally, the authors draw on an insight from legal scholarship, the idea of a ‘normative social order’, which they argue is a prerequisite for AI systems to absorb human values the way humans do.

This week we’re mixing up the format: we’ll use the questions below to better understand the paper and what we should take away from it:

Comprehension questions:

1. What are the main sources of incompleteness in designing contracts? (Section 3)

2. How does incomplete contract theory address these problems? Which AI applications do the authors mention that use this theory? (Try to describe at least one; Section 4.)

3. What extra challenges are posed by ‘strongly strategic’ AI? Which solutions do the authors suggest in these cases? (Try to describe at least one; Section 4.)

4. Describe the concept of a ‘normative social order’ (Section 5). Why do the authors believe it matters for AI safety?

We will also split into smaller groups at times, each focusing on part of the paper and then teaching that content back to the whole group.

There will be free pizza. At 17:00, we will start reading the paper, mostly individually. At 17:30, the discussion leader will start going through the paper, making sure everyone understands, and encouraging discussion about its contents and implications.

Even if you think you cannot contribute to the conversation, you should give it a try. Last year we had several people from non-computer-y backgrounds, and others who hadn’t thought about alignment before, who ended up being essential. If you have already read the paper in your own time, you can arrive at 17:30 for the discussion.

A basic understanding of machine learning is helpful, but detailed knowledge of the latest techniques is not required. Each session will begin with a brief recap of the background knowledge needed. The goal of this series is to help people learn about existing work in AI safety research and, eventually, to contribute to the field.

Invite your friends to join the mailing list (https://lists.cam.ac.uk/mailman/listinfo/eng-safe-ai), the Facebook group (https://www.facebook.com/groups/1070763633063871) or the talks.cam page (https://talks.cam.ac.uk/show/index/80932). Details about the next meeting, the week’s topic and other events will be advertised in these places.

This talk is part of the Engineering Safe AI series.
