Evaluating Data Linkage: Creating longitudinal synthetic data to provide a gold-standard linked dataset
- Speaker: Tom Dalton (University of St Andrews)
- Date & Time: Thursday 20 October 2016, 15:30 - 16:30
- Venue: Seminar Room 2, Newton Institute
Abstract
When performing probabilistic data linkage on real-world data we do not know the true linkage; if we did, there would be no need to link. The success of a linkage approach is therefore difficult to evaluate. Small hand-linked datasets are often used as a ‘gold standard’ against which to evaluate the linkage approach. However, errors in the hand-linkage and the limited size and number of these datasets do not allow for robust evaluation. This research focuses on the creation of longitudinal synthetic datasets for the domain of population reconstruction. In this talk I will cover the previous and current models we have created to achieve this, and detail how we: define the desired behaviour in the model to avoid clashes between input distributions; verify the statistical correctness of the population; and initialise the model so that the starting population meets the temporal requirements of the desired behaviour. To conclude, I will outline the model’s intended use for linkage evaluation and its other potential uses, and take questions.
Series
This talk is part of the Isaac Newton Institute Seminar Series.