Big data integration: challenges and new approaches

If you have a question about this talk, please contact INI IT.

DLAW02 - Data linkage: techniques, challenges and applications

Data integration is a key challenge for Big Data applications, which need to semantically enrich and combine large sets of heterogeneous data for enhanced analysis. In many cases there is also a need to deal with a very large number of data sources, e.g., product offers from many e-commerce websites. We will discuss approaches to the key data integration tasks of (large-scale) entity resolution and schema matching. In particular, we discuss parallel blocking and entity resolution on Hadoop platforms, together with load balancing techniques to deal with data skew. We also discuss challenges and recent approaches for holistic data integration of many data sources, e.g., to create knowledge graphs or to make use of huge collections of web tables.
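As a rough illustration of the blocking idea mentioned in the abstract, the following single-machine Python sketch groups records by a blocking key and compares only records that share a block. It is not the speaker's method and does not use Hadoop: the record layout, the `blocking_key` rule (first three letters of the title), the similarity threshold and all function names are assumptions made for this example. The `pair_counts` helper shows why skew matters: the number of comparisons in a block grows quadratically with its size, so a few large blocks can dominate the work.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first three letters of the lowercased title.
    return record["title"].strip().lower()[:3]


def build_blocks(records):
    # Group records so that only records sharing a key are compared.
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    return blocks


def pair_counts(blocks):
    # A block with n members needs n * (n - 1) / 2 comparisons, so a few
    # large blocks (data skew) can account for most of the work.
    return {key: len(members) * (len(members) - 1) // 2
            for key, members in blocks.items()}


def match_pairs(blocks, threshold=0.8):
    # Compare all pairs inside each block with a simple string similarity.
    # The threshold is an assumed value chosen for this example.
    matches = []
    for members in blocks.values():
        for a, b in combinations(members, 2):
            score = SequenceMatcher(
                None, a["title"].lower(), b["title"].lower()
            ).ratio()
            if score >= threshold:
                matches.append((a["id"], b["id"]))
    return matches


# Made-up product offers, used only to exercise the sketch.
records = [
    {"id": 1, "title": "Canon EOS 80D Camera"},
    {"id": 2, "title": "Canon EOS 80D DSLR Camera"},
    {"id": 3, "title": "Canon PowerShot SX540"},
    {"id": 4, "title": "Nikon D3500 Camera"},
]

blocks = build_blocks(records)
print(pair_counts(blocks))
print(match_pairs(blocks))
```

In a parallel setting, the per-block comparison counts could in principle be used to split oversized blocks across several workers, which is one way to think about load balancing under skew; the sketch above only computes the counts and does not attempt any redistribution.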

This talk is part of the Isaac Newton Institute Seminar Series.


© 2006-2021 Talks.cam, University of Cambridge.