Generalisation for Adaptive Data Analysis
- Speaker: Thomas Steinke (IBM Almaden Research Center)
- Date & Time: Tuesday 22 November 2016, 15:30 - 16:30
- Venue: Seminar Room 2, Newton Institute
Abstract
Adaptivity is an important aspect of data analysis: the choice of questions to ask about a dataset is often informed by previous use of the same dataset. However, statistical validity is typically only guaranteed in a non-adaptive model, in which the questions must be specified before the dataset is collected. A recent line of work initiated by Dwork et al. (STOC 2015) provides a formal model for studying the power of adaptive data analysis. This talk will show that there are sophisticated techniques, using tools from information theory and differential privacy, that enable us to ensure that adaptive data analysis provides statistically valid answers that generalise to the overall population from which the dataset was drawn. This talk will also discuss how adaptive data analysis is inherently more powerful than non-adaptive data analysis: there is an exponential separation between the number of adaptive queries needed to overfit a dataset and the number of non-adaptive queries needed.
Series
This talk is part of the Isaac Newton Institute Seminar Series series.
Included in Lists
- All CMS events
- bld31
- Cambridge Centre for Data-Driven Discovery (C2D3)
- Cambridge talks
- Chris Davis' list
- dh539
- Featured lists
- INI info aggregator
- Interested Talks
- Isaac Newton Institute Seminar Series
- ndk22's list
- ob366-ai4er
- rp587
- School of Physical Sciences
- Seminar Room 2, Newton Institute
- Trust & Technology Initiative - interesting events
- yk449
Note: Ex-directory lists are not shown.