Revisiting Optimism and Model Complexity in the Wake of Overparameterized Machine Learning
Speaker: Ryan Tibshirani, University of California, Berkeley
Abstract: Common practice in modern machine learning involves fitting a large number of parameters relative to the number of observations. These overparameterized models can exhibit surprising generalization behavior, e.g., “double descent” in the prediction error curve when plotted against the raw number of model parameters, or another simplistic notion of complexity. In this paper, we revisit model complexity from first principles, by first reinterpreting and then extending the classical statistical concept of (effective) degrees of freedom. Whereas the classical definition is connected to fixed-X prediction error (in which prediction error is defined by averaging over the same, nonrandom covariate points as those used during training), our extension of degrees of freedom is connected to random-X prediction error (in which prediction error is averaged over a new, random sample from the covariate distribution). The random-X setting more naturally embodies modern machine learning problems, where highly complex models, even those complex enough to interpolate the training data, can still lead to desirable generalization performance under appropriate conditions. We demonstrate the utility of our proposed complexity measures through a mix of conceptual arguments, theory, and experiments, and illustrate how they can be used to interpret and compare arbitrary prediction models.
This is joint work with Pratik Patil and Jin-Hong Du.
Bio: Ryan Tibshirani is a professor in the Department of Statistics at the University of California, Berkeley. His research interests lie broadly in statistics, machine learning (ML), and optimization, and more specifically in high-dimensional statistics, nonparametric estimation, distribution-free inference, convex optimization, and numerical methods, with applied interests in tracking and forecasting epidemics. Ryan completed both a BS in Mathematics and a PhD in Statistics at Stanford University.
Tibshirani received the COPSS Presidents' Award in 2023. Given jointly by the world's leading statistical societies, the award recognizes outstanding contributions to statistics by a statistician under the age of 40. He was also elected a fellow of the Institute of Mathematical Statistics (IMS) in 2022 and is an Amazon Scholar.