Efficient Learning Algorithms under (Heavy) Contamination

Thursday, April 16, 2026 - 4:00pm to 5:00pm
Location: 
32-G882 (Hewlett)
Speaker: 
Kostas Stavropoulos
Biography: 
https://www.kstavrop.com/
In this talk, I will present a series of new results in supervised learning from contaminated datasets, based on a general outlier removal algorithm inspired by recent work on learning with distribution shift.
 
We show that any function class that can be approximated by low-degree polynomials with respect to a hypercontractive distribution can be efficiently learned under bounded contamination (also known as nasty noise). This resolves a longstanding gap between the complexity of agnostic learning and learning with contamination, overturning the widespread belief that low-degree approximators imply tolerance only to label noise.
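To make the contamination model concrete, here is a minimal sketch (not the algorithm from the talk) of bounded contamination, or nasty noise: an adversary who sees the whole sample may replace up to an eta-fraction of (example, label) pairs with arbitrary ones. The halfspace target, the Gaussian marginal, and the toy values of n, d, and eta are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy parameters (assumed for illustration): sample size, dimension,
# and contamination rate eta.
n, d, eta = 1000, 5, 0.1

# Clean data: Gaussian examples labeled by a halfspace sign(<w*, x>).
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_star)

# Nasty noise: the adversary inspects the sample and replaces up to an
# eta-fraction of pairs with arbitrary points AND arbitrary labels.
k = int(eta * n)
idx = rng.choice(n, size=k, replace=False)
X[idx] = 10.0 * rng.normal(size=(k, d))  # adversarial feature points
y[idx] = -np.sign(X[idx] @ w_star)       # adversarially flipped labels

# The learner receives only the contaminated (X, y): at most eta*n of the
# pairs differ from a clean i.i.d. sample.
print(k)  # number of corrupted examples
```

Note that, unlike label-only (agnostic) noise, the adversary here corrupts the feature points as well, which is what makes outlier removal necessary.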
 
Moreover, for any function class that admits the (stronger) notion of sandwiching approximators, we obtain near-optimal learning guarantees even under heavy additive contamination, where the adversarially added examples may make up far more than half of the training set. Prior related work applied only to regression, and only in the list-decodable setting.
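For contrast, a minimal sketch of the heavy additive contamination model (again an illustration of the setting, not of the result): the adversary does not touch the clean sample but appends arbitrary examples, and the additions may outnumber the clean data. The 4:1 ratio and the data distributions below are assumed toy choices.

```python
import numpy as np

rng = np.random.default_rng(1)

n, d = 200, 3  # clean sample size and dimension (toy values)

# Clean examples drawn i.i.d. from the target distribution.
X_clean = rng.normal(size=(n, d))
y_clean = np.sign(X_clean @ np.ones(d))

# Heavy additive contamination: the adversary only ADDS examples, but the
# additions may dominate -- here 4x as many adversarial points as clean ones,
# placed in a shifted cluster with constant labels.
m = 4 * n
X_bad = rng.normal(size=(m, d)) + 5.0
y_bad = -np.ones(m)

X = np.vstack([X_clean, X_bad])
y = np.concatenate([y_clean, y_bad])

frac_adversarial = m / (n + m)
print(round(frac_adversarial, 2))  # fraction of adversarial points, well above 1/2
```

Because more than half of the training set can be adversarial, a simple majority-based filter cannot identify the clean points, which is why such settings were previously handled only via list-decodable guarantees.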
 
These results significantly advance our understanding of efficient supervised learning under contamination, a setting that has been much less studied than its unsupervised counterpart.