Classical results from random matrix theory state that if a random (Wigner or Wishart) matrix is perturbed by a planted rank-1 signal (or "spike"), this spike affects the top eigenvalue if and only if the size of the spike exceeds a particular threshold. In particular, above this threshold the top eigenvalue is a reliable test to distinguish the spiked and un-spiked models. We consider the question of whether this eigenvalue test is optimal, or whether there is some other statistical test that can succeed below the eigenvalue threshold.
We answer these types of questions using a simple and widely-applicable technique associated with the statistical notion of "contiguity." Namely, by computing a particular second moment, one can show that two models are contiguous and therefore cannot be reliably distinguished.
Our results include:
i) For the Gaussian Wigner ensemble, we show that PCA achieves the optimal detection threshold for a variety of natural priors for the spike.
ii) For any non-Gaussian Wigner ensemble, we show that PCA is always sub-optimal for detection (contrary to the notion of universality from random matrix theory). However, a variant of PCA achieves the optimal threshold (for natural priors) by pre-transforming the matrix entrywise.
iii) For the Wishart ensemble we show that in some regimes, inefficient algorithms can succeed below the PCA threshold, whereas no known efficient algorithm achieves this.