University of Cambridge > Talks.cam > Isaac Newton Institute Seminar Series > Advanced Techniques for Privacy-Preserving Linking of Multiple Large Databases

Advanced Techniques for Privacy-Preserving Linking of Multiple Large Databases

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact info@newton.ac.uk.

DLAW02 - Data Linkage: Techniques, Challenges and Applications

Co-author: Peter Christen (The Australian National University)

In the era of Big Data the collection of person-specific data disseminated in diverse databases provides enormous opportunities for businesses and governments by exploiting data linked across these databases. Linked data empowers quality analysis and decision making that is not possible on individual databases. Therefore, linking databases is increasingly being required in many application areas, including healthcare, government services, crime and fraud detection, national security, and business applications. Linking data from different databases requires comparison of quasi-identifiers (QIDs), such as names and addresses. These QIDs are personal identifying attributes that contain sensitive and confidential information about the entities represented in these databases. The exchange or sharing of QIDs across organisations for linkage is often prohibited due to laws and business policies. Privacy-preserving record linkage (PPRL) has been an active research area over the past two decades addressing this problem through the development of techniques that facilitate the linkage on masked (encoded) records such that no private or confidential information needs to be revealed.

Most of the work in PPRL thus far has concentrated on linking two databases only. Linking multiple databases has only recently received more attention as it is being required in a variety of application areas. We have developed several advanced techniques for practical PPRL of multiple large databases addressing the scalability, linkage quality, and privacy challenges. Our approaches perform linkage on masked records using Bloom filter encoding, which is a widely used masking technique for PPRL . In this talk, we will first highlight the challenges of PPRL of multiple databases, then describe our developed approaches, and then discuss future research directions required to leverage the huge potential that linked data from multiple databases can provide for businesses and government services.

This talk is part of the Isaac Newton Institute Seminar Series series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2017 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity