University of Cambridge > Talks.cam > Isaac Newton Institute Seminar Series > The xyz algorithm for fast interaction search in high-dimensional data

The xyz algorithm for fast interaction search in high-dimensional data

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact info@newton.ac.uk.

STSW01 - Theoretical and algorithmic underpinnings of Big Data

When performing regression on a dataset with $p$ variables, it is often of interest to go beyond using main effects and include interactions as products between individual variables. However, if the number of variables $p$ is large, as is common in genomic datasets, the computational cost of searching through $O(p2)$ interactions can be prohibitive. In this talk I will introduce a new randomised algorithm called xyz that is able to discover interactions with high probability and under mild conditions has a runtime that is subquadratic in $p$. The underlying idea is to transform interaction search into a much simpler close pairs of points problem. We will see how strong interactions can be discovered in almost linear time, whilst finding weaker interactions requires $O(pu)$ operations for $1<u<2$ depending on their strength. An application of xyz to a genome-wide association study shows how more than $10^{11}$ interactions can be screened in minutes using a standard laptop. This is joint work with Gian Thanei and Nicolai Meinshausen (ETH Zurich).

This talk is part of the Isaac Newton Institute Seminar Series series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.

 

© 2006-2018 Talks.cam, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity