We study problems in distribution property testing: Given sample access to one or more unknown discrete distributions, we want to determine whether they have some global property or are ε-far from having the property in ℓ1 distance. We provide a simple and general approach for obtaining upper bounds in this setting by reducing ℓ1-testing to ℓ2-testing. Our reduction yields optimal ℓ1-testers using a standard ℓ2-tester as a black box.
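To make the black-box ingredient concrete, here is a minimal Python sketch of one well-known ℓ2 closeness statistic. It is an illustrative assumption, not necessarily the ℓ2-tester used in this work. The function names and the parameter m are hypothetical. The sketch assumes Poissonized sampling: the count of element i is drawn as Poisson(m·p_i) from the first distribution and, independently, Poisson(m·q_i) from the second. Under that assumption, the statistic Z has expectation m²·‖p − q‖₂².

```python
def l2_closeness_statistic(x_counts, y_counts):
    """Return Z = sum_i ((X_i - Y_i)^2 - X_i - Y_i).

    x_counts[i] and y_counts[i] are the number of times domain element i
    was seen in the samples from the two unknown distributions.
    Assumption: counts come from Poissonized sampling, so that
    E[Z] = m^2 * ||p - q||_2^2.
    """
    if len(x_counts) != len(y_counts):
        raise ValueError("count vectors must cover the same domain")
    return sum((x - y) ** 2 - x - y for x, y in zip(x_counts, y_counts))


def estimate_squared_l2_distance(x_counts, y_counts, m):
    """Rescale Z by m^2 to estimate ||p - q||_2^2.

    m is the (hypothetical) expected number of samples per distribution.
    """
    return l2_closeness_statistic(x_counts, y_counts) / (m * m)
```

A tester built on this statistic would accept when Z falls below a threshold and reject otherwise; the threshold depends on ε and m and is deliberately left out of the sketch.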
Using our framework, we obtain optimal testers for a wide variety of ℓ1 distribution testing problems, including the following: identity testing to a fixed distribution, closeness testing between two unknown distributions (with equal/unequal sample sizes), independence testing (in any number of dimensions), closeness testing for collections of distributions, and testing k-flatness. For several of these problems, we give the first optimal tester in the literature. Moreover, our testers are significantly simpler to state and analyze than those of previous approaches.
As our second main contribution, we provide a direct general approach for proving distribution testing lower bounds, by bounding the mutual information. Our lower bound approach is not restricted to symmetric properties, and we use it to prove tight lower bounds for all the aforementioned problems.
Joint work with Daniel Kane.