Abstract: There is a deep analogy between statistical inference, where we try to fit a model to data or (even better) understand the posterior distribution of models given the data, and statistical physics, where we define a probability distribution in terms of some energy function. Many concepts from physics, such as energy landscapes, partition functions, free energy, the cavity method, and phase transitions, can be usefully carried over to machine learning and computer science. At the very least, these techniques are a source of conjectures that have stimulated new work in machine learning, computer science, and probability theory; at their best, they offer strong intuitions about the structure of the problem and its possible solutions.
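To make the analogy concrete, here is one standard way to write the dictionary between the two settings. The symbols (x for a physical state, \theta for model parameters, D for data, \beta for inverse temperature) are notation assumed for this sketch rather than taken from the talk.

```latex
\begin{align*}
  % Physics: Boltzmann distribution over states x with energy E(x)
  P(x) &= \frac{e^{-\beta E(x)}}{Z},
  & Z &= \sum_{x} e^{-\beta E(x)} \\
  % Inference: Bayes' rule for the posterior over models \theta given data D
  P(\theta \mid D) &= \frac{P(D \mid \theta)\, P(\theta)}{P(D)},
  & P(D) &= \sum_{\theta} P(D \mid \theta)\, P(\theta)
\end{align*}
```

Setting \beta = 1 and E(\theta) = -\log [P(D \mid \theta) P(\theta)] turns the posterior into a Boltzmann distribution whose partition function Z is the total probability of the data P(D), so the free energy -\log Z is the negative log-evidence.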
One recent success of this flow of ideas is the discovery of a sharp phase transition in community detection in sparse graphs. This transition was first conjectured by Decelle et al. in the physics literature, and was then made rigorous in beautiful work by Mossel, Neeman, and Sly, and by Massoulié. Analogous transitions exist in many other inference problems, where our ability to find patterns in data jumps suddenly as a function of how noisy or how sparse the data are.
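As an illustration, in the simplest setting of two equal-sized groups in a sparse stochastic block model, where nodes in the same group are connected with probability c_in/n and nodes in different groups with probability c_out/n, the transition sits at (c_in - c_out)^2 = 2(c_in + c_out). The sketch below just checks which side of that threshold a pair of parameters falls on; the function name and the parameterization are assumptions made for this example, not notation from the talk.

```python
def is_detectable(c_in: float, c_out: float) -> bool:
    """Check the two-group detectability condition.

    c_in  : n times the within-group edge probability (assumed parameterization)
    c_out : n times the between-group edge probability

    Returns True when (c_in - c_out)^2 > 2 * (c_in + c_out), i.e. when the
    communities can be recovered better than chance in the large-n limit.
    """
    if c_in < 0 or c_out < 0:
        raise ValueError("c_in and c_out must be non-negative")
    return (c_in - c_out) ** 2 > 2 * (c_in + c_out)


if __name__ == "__main__":
    # Strong community structure: above the threshold.
    print(is_detectable(5.0, 1.0))
    # Weak community structure at the same average degree: below the threshold.
    print(is_detectable(3.5, 2.5))
```

Both example calls use the same average degree, (c_in + c_out)/2 = 3; only the contrast between within-group and between-group connectivity changes, which is what drives the transition.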
I will discuss why and how the detectability transition occurs, review what is now known rigorously, and present a number of open questions that cry out for proofs. Perhaps more importantly, I will take full advantage of the excellent format of this seminar to dwell on the analogy between physics and inference, and discuss how a thermodynamic point of view helps us find structure in data, test whether this structure is statistically significant, and choose between competing models.
This talk is based on joint work with many people, including Aurélien Decelle, Florent Krzakala, Lenka Zdeborová, and Pan Zhang.