BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Computational Neuroscience Journal Club - Changmin Yu (Gatsby Com
 putational Neuroscience Unit\, UCL\, London\, UK)
DTSTART:20240214T140000Z
DTEND:20240214T160000Z
UID:TALK212311@talks.cam.ac.uk
CONTACT:Puria Radmard
DESCRIPTION:Please join us for our Computational Neuroscience journal club
  on Wednesday 14th February at 2pm UK time in the CBL seminar room\, or on
 line on zoom.\n\nThe title is ‘Distributional Reinforcement Learning’\
 , presented by Changmin Yu and Puria Radmard.\n\nSummary:\n\nIn traditiona
 l reinforcement learning algorithms such as temporal difference learning\,
  the value function maps states to the expected total future return. In d
 istributional reinforcement learning\, this is extended to include the mul
 tiplicity of rewards\, by mapping states to full distributions of returns.
  In this session\, Changmin and Puria will start with an introduction to b
 oth traditional [1] and distributional [2\, 3] reinforcement learning. Dab
 ney et al.\, 2020 [4]\, show the distributional nature of value representa
 tion in VTA dopaminergic neurons\, and the simple changes to classical TD 
 learning that can bring about distributional value representations. Recent
 discoveries showed that midbrain dopaminergic neurons exhibit distributi
 onal value coding\, suggesting that the underlying mechanisms in these ne
 urons follow a distributional rather than a classical expectation-based r
 einforcement learning regime. Prefrontal cortex neurons have been shown t
 o be significantly involved in decision-making and reward-guided learnin
 g\, and are anatomically connected to dopaminergic neurons. Muller et al.
  2024 [5] present new analyses of existing data from primate prefrontal n
 eurons in decision-making tasks\, showing that\, similar to what was foun
 d in rodent dopamine neurons [4]\, PFC neurons exhibit highly diverse pro
 files in
  optimism with respect to value coding\, and in asymmetric scaling relativ
 e to positive versus negative RPEs. Moreover\, in a task with dynamic rewa
 rd structure\, the authors show diversity in the rate of learning associat
 ed with positive and negative RPEs\, hinting at the computational nature o
 f distributional RL in the PFC for decision-making.\n\n[1] Dayan\, P. and 
 Abbott\, L.F. (2001) Theoretical Neuroscience: Computational and Mathemati
 cal Modeling of Neural Systems. The MIT Press\, Cambridge.\n[2] Bellemare\
 , Marc G.\, Will Dabney\, and Rémi Munos. "A distributional perspective o
 n reinforcement learning." In International conference on machine learning
 \, pp. 449-458. PMLR\, 2017. \n[3] Dabney\, Will\, Mark Rowland\, Marc Bel
 lemare\, and Rémi Munos. "Distributional reinforcement learning with quan
 tile regression." In Proceedings of the AAAI Conference on Artificial Inte
 lligence\, vol. 32\, no. 1. 2018. \n[4] Dabney\, W.\, Kurth-Nelson\, Z.\, 
 Uchida\, N. et al. A distributional code for value in dopamine-based reinf
 orcement learning. Nature 577\, 671–675 (2020).\n[5] Muller\, T.H.\, But
 ler\, J.L.\, Veselic\, S. et al. Distributional reinforcement learning in 
 prefrontal cortex. Nat Neurosci (2024).\n
LOCATION:CBL Seminar Room\, Engineering Department\, 4th floor Baker build
 ing
END:VEVENT
END:VCALENDAR
