Adrià Garriga Alonso
| Name: | Adrià Garriga Alonso |
| --- | --- |
| Affiliation: | University of Cambridge |
| E-mail: | (only provided to users who are logged into talks.cam) |
| Last login: | 6 Jun 2022, 2:03 a.m. |
Public lists managed by Adrià Garriga Alonso
Talks given by Adrià Garriga Alonso
Note that this only lists talks that are listed through talks.cam. Furthermore, this facility only works if the speaker's e-mail address was specified when the talk was entered; for most talks it was not.
- Neural Tangent Kernel
- Goals vs Utility Functions
- Ambitious Value Learning
- Embedded Agency
- Comprehensive AI Services
- Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning
- Measuring and avoiding side effects using relative reachability
- Approaches to avoiding negative side effects
- Logical Induction: a computable approach to logical non-omniscience
- Last term summary + discussion of topic importance
- Counterargument to CIRL, and Safely Interruptible Agents
Talks organised by Adrià Garriga Alonso
This list is based on what was entered into the 'organiser' field of a talk. It does not necessarily mean that Adrià Garriga Alonso actually organised the talk; they may only have been responsible for entering it into the talks.cam system.
- Two Approximate Sampling Methods for Bayesian Deep Learning
- Can Machines Read our Minds?
- How useful is quantilization for mitigating specification-gaming?
- Misleading meta-objectives and hidden incentives for distributional shift
- Causal Reasoning from Meta-reinforcement Learning
- Inverse Game Theory
- Goals vs Utility Functions
- Who do we want to control human-level AI?
- Bayesian Theory of Mind: Modeling Joint Belief-Desire Attribution
- Ambitious Value Learning
- Machine Theory of Mind
- Embedded Agency
- Observation and Intervention Incentives in Causal Influence Diagrams: Towards an Understanding of Powerful Machine Learning Systems
- Comprehensive AI Services
- Incomplete Contracting and AI Alignment
- The Algorithmic Foundations of Differential Privacy (Chapters 1 and 2)
- Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning
- Measuring and avoiding side effects using relative reachability
- Interpretable Machine Learning
- Scaling inverse reinforcement learning for human-compatible AI
- Motivation for this group, Goodhart's Law
- Approaches to avoiding negative side effects
- AI Safety via Debate
- AI Safety Gridworlds: Is my agent 'safe'?
- Logical Induction: a computable approach to logical non-omniscience
- Decision Boundary Geometries and Robustness of Neural Networks
- Decision Theory for AI safety
- Safe Exploration in Reinforcement Learning
- Amplification and dialogue as mechanisms for safe advanced AI
- Last term summary + discussion of topic importance
- Counterargument to CIRL, and Safely Interruptible Agents
- Reinforcement learning with a corrupted reward function
- Deep Reinforcement Learning from Human Preferences
- An introduction to adversarial attacks and defences
