Peter Bartlett (University of California, Berkeley)
Occasion: The Multifaceted Complexity of Machine Learning
Date: April 12, 2021
Abstract: Deep learning methodology has revealed some major surprises from the perspective of statistical complexity: even without any explicit effort to control model complexity, these methods find prediction rules that give a near-perfect fit to noisy training data and yet exhibit excellent prediction performance in practice. We investigate this phenomenon of ‘benign overfitting’ in the setting of linear prediction and give a characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization shows that overparameterization is essential: the number of directions in parameter space that are unimportant for prediction must be large compared to the sample size. We discuss implications for deep networks, for robustness to adversarial examples, and for the rich variety of possible behaviors of excess risk as a function of dimension, and we describe extensions to ridge regression and barriers to analyzing benign overfitting based on model-dependent generalization bounds.
Joint work with Phil Long, Gábor Lugosi, and Alex Tsigler.
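
For readers unfamiliar with the term, the "minimum norm interpolating prediction rule" named in the abstract can be illustrated with a small sketch. This is not code from the talk. It assumes NumPy, an overparameterized design (more features than samples), and synthetic Gaussian data with hypothetical names (`min_norm_interpolator`, `theta_star`) chosen purely for illustration. It shows only that the rule fits noisy training labels exactly; whether the overfitting is benign depends on the covariance structure characterized in the talk, which this toy isotropic example does not attempt to reproduce.

```python
import numpy as np

def min_norm_interpolator(X, y):
    """Return the minimum l2-norm theta satisfying X @ theta = y.

    Assumes X has more columns than rows and full row rank, so exact
    interpolation is possible. The Moore-Penrose pseudoinverse selects
    the smallest-norm solution among all interpolating ones.
    """
    return np.linalg.pinv(X) @ y

# Synthetic data (illustrative assumption): n samples, d >> n features.
rng = np.random.default_rng(0)
n, d = 20, 200
X = rng.standard_normal((n, d))
theta_star = np.zeros(d)
theta_star[0] = 1.0                      # hypothetical "true" parameter
y = X @ theta_star + 0.1 * rng.standard_normal(n)  # noisy labels

theta_hat = min_norm_interpolator(X, y)

# The fitted rule reproduces the noisy training labels (up to rounding).
train_residual = np.max(np.abs(X @ theta_hat - y))

# Any other interpolator differs from theta_hat by a null-space vector
# of X and therefore has norm at least as large.
v = rng.standard_normal(d)
null_component = v - np.linalg.pinv(X) @ (X @ v)
other_interpolator = theta_hat + null_component
```

Here `other_interpolator` also fits the training data exactly but has a larger norm than `theta_hat`, which is what singles out the minimum norm rule among the many interpolating solutions available when d exceeds n.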