BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Talks.cam//talks.cam.ac.uk//
X-WR-CALNAME:Talks.cam
BEGIN:VEVENT
SUMMARY:Optimal Sequential Inference and Decision Making: From Uniform A/B
  Testing to Gradient-Based Bandits - Patrick Rebeschini (University of Oxf
 ord)
DTSTART:20251110T105000Z
DTEND:20251110T113000Z
UID:TALK238420@talks.cam.ac.uk
DESCRIPTION:Sequential data collection and decision making lie at the core
  of modern statistical learning\, from anytime-valid A/B testing in online
  experiments to adaptive multi-armed bandit problems. This talk presents r
 ecent advances in the design and analysis of optimal algorithms for these 
 two settings. In the first part\, I will introduce a framework for anytime
 -valid\, variance-adaptive inference in monotonic processes--such as cumul
 ative distribution functions--that builds on the coin-betting paradigm of 
 game-theoretic statistics and integrates PAC-Bayesian principles to yield 
 tight hypothesis tests that are uniform not only in time but also in space
 . In the second part\, I will focus on stochastic gradient bandits\, a fun
 damental policy-gradient approach to online decision making\, and present 
 theoretical results showing how the learning rate governs the algorithm's
  regret\, revealing sharp thresholds that separate logarithmic and p
 olynomial regimes and depend on the (unknown) sub-optimality gap.\n(Based 
 on joint work with E. Clerico and H. E. Flynn\, and with D. Baudry\, E. Jo
 hnson\, S. Vary\, and C. Pike-Burke)
LOCATION:Seminar Room 1\, Newton Institute
END:VEVENT
END:VCALENDAR