# Optimal rates for independence testing via U-statistic permutation tests

Independence testing is one of the most well-studied problems in statistics, and the use of procedures such as the chi-squared test is ubiquitous in the sciences. While tests have traditionally been calibrated through asymptotic theory, permutation tests are growing in popularity due to their simplicity and exact Type I error control. In this talk I will present new, finite-sample results on the power of a class of permutation tests, which show that their power is optimal in many interesting settings, including those with discrete, continuous, and functional data. A simulation study shows that our test for discrete data can significantly outperform the chi-squared test for natural data-generating distributions.

Defining a natural measure of dependence $D(f)$ to be the squared $L_2$-distance between a joint density $f$ and the product of its marginals, we first show that there is generally no valid test of independence that is uniformly consistent against alternatives of the form $\{f : D(f) \geq \rho^2\}$. Motivated by this observation, we restrict attention to alternatives that satisfy additional Sobolev-type smoothness constraints, and consider as a test statistic a U-statistic estimator of $D(f)$. Using novel techniques for studying the behaviour of U-statistics calculated on permuted data sets, we prove that our tests can be minimax optimal. Finally, based on new normal approximations in the Wasserstein distance for such permuted statistics, we also provide an approximation to the power function of our permutation test in a canonical example, which offers several additional insights.
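To fix ideas, the general recipe above (estimate the squared $L_2$-distance between the joint distribution and the product of its marginals, then calibrate by permutation) can be sketched for discrete data. This is an illustrative sketch only: it uses a simple plug-in estimate of $D(f)$ rather than the U-statistic estimator analysed in the talk, and the function names (`dependence_stat`, `permutation_pvalue`) are my own.

```python
import numpy as np

def dependence_stat(x, y):
    """Plug-in estimate of sum_{a,b} (p(a,b) - p(a)p(b))^2 for discrete data.
    (An illustrative stand-in for the U-statistic estimator of D(f).)"""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1.0)   # contingency table of counts
    joint /= len(x)                          # empirical joint pmf
    px = joint.sum(axis=1, keepdims=True)    # marginal of x
    py = joint.sum(axis=0, keepdims=True)    # marginal of y
    return float(((joint - px * py) ** 2).sum())

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value: permuting y breaks any dependence, so the
    permuted statistics are draws from the null distribution, giving
    exact Type I error control at any sample size."""
    rng = np.random.default_rng(seed)
    obs = dependence_stat(x, y)
    count = sum(
        dependence_stat(x, rng.permutation(y)) >= obs
        for _ in range(n_perm)
    )
    return (1 + count) / (1 + n_perm)
```

For strongly dependent data the observed statistic dominates almost every permuted one and the p-value is near $1/(B+1)$; under independence the p-value is (super-)uniform, which is the exact error-control property the abstract refers to.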

This is joint work with Ioannis Kontoyiannis and Richard Samworth.

This talk is part of the Statistics series.