BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Counterargument to CIRL\, and Safely Interruptible Agents - Adrià
  Garriga Alonso (University of Cambridge)
DTSTART:20171206T170000Z
DTEND:20171206T183000Z
UID:TALK96817@talks.cam.ac.uk
CONTACT:Adrià Garriga Alonso
DESCRIPTION:Cooperative Inverse Reinforcement Learning (CIRL) is a game wi
 th a robot R and human H\, in which R tries to maximise H's reward while n
 ot knowing it. R is incentivised to shut down on H's suggestion\, since th
 at provides information about H's reward function. However\, Carey (20
 17) shows that\, if R and H do not share the same prior for the reward\, R
  may remain incorrigible. Carey then makes a case for _forced_ interruptib
 ility. We will talk about Carey's examples and the strength of the case fo
 r forced interruptibility.\n\nOrseau and Armstrong (2016) provide a formal
  notion of satisfactory learning under forced interruptions. Then they sho
 w how Q-learning satisfies it\, and SARSA and AIXI-with-exploration can be
  modified to satisfy it. We will go over the proof outlines and discuss th
 eir implications for corrigibility.\n\nReading list:\n\nRyan Carey. 2017. 
 "Incorrigibility in the CIRL Framework." arXiv:1709.06275 [cs.AI].\n\nLaur
 ent Orseau and Stuart Armstrong. 2016. "Safely Interruptible Agents." Pape
 r presented at the 32nd Conference on Uncertainty in Artificial Intelligen
 ce.\n\nSlides: https://valuealignment.ml/talks/2017-12-06-interruptibility
 .pdf
LOCATION:Cambridge University Engineering Department\, CBL Seminar room B
 E4-38. For directions see http://learning.eng.cam.ac.uk/Public/Directions
END:VEVENT
END:VCALENDAR
