The High-Dimensional Asymptotics of Principal Components Regression
We study principal components regression (PCR) in an asymptotic high-dimensional setting, where the number of data points is proportional to the dimension. We derive exact limiting formulas for estimation and prediction risk, which depend in a complicated manner on the eigenvalues of the population covariance, the alignment between the population PCs and the true signal, and the number of selected PCs. A key challenge in the high-dimensional setting is that the sample covariance is an inconsistent estimator of its population counterpart, so sample PCs may fail to fully capture latent low-dimensional structure in the data. We demonstrate this point through several case studies, including that of a spiked covariance model.
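For readers new to PCR, the following is a minimal illustrative sketch in Python, not the speaker's code or analysis. It fits PCR on simulated data from a single-spike covariance model with the dimension proportional to the sample size, then measures how well the top sample PC aligns with the population spike direction. All names and values (pcr_fit, n, p, k, spike, the choice of signal beta, unit noise variance) are assumptions made for illustration.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Regress y on the top-k sample principal components of X.

    Assumes the columns of X are already mean zero (no centering is done).
    """
    # Right singular vectors of X are eigenvectors of the sample covariance X^T X / n.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V_k = Vt[:k].T                      # p x k matrix of top-k sample PCs
    Z = X @ V_k                         # n x k matrix of PC scores
    gamma, *_ = np.linalg.lstsq(Z, y, rcond=None)
    # Map the k score coefficients back to the original p coordinates.
    return V_k @ gamma, V_k

# Hypothetical simulation settings: p / n = 0.5, one spike along the first axis.
rng = np.random.default_rng(0)
n, p, k = 400, 200, 1
spike = 5.0
v = np.zeros(p)
v[0] = 1.0                              # population spike direction

# Rows of X have covariance I + spike * v v^T.
X = rng.standard_normal((n, p))
X[:, 0] *= np.sqrt(1.0 + spike)

beta = 2.0 * v                          # true signal, here aligned with the spike
y = X @ beta + rng.standard_normal(n)

beta_hat, V_k = pcr_fit(X, y, k)

# Squared cosine between the top sample PC and the population spike direction.
alignment = float((V_k[:, 0] @ v) ** 2)
# Squared estimation error of the PCR coefficient vector.
est_risk = float(np.sum((beta_hat - beta) ** 2))

print(f"alignment = {alignment:.3f}, estimation error = {est_risk:.3f}")
```

Varying spike, k, or the ratio p / n in this sketch gives a rough empirical feel for how the alignment between sample and population PCs affects the error of the fitted coefficients.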
https://wustl.zoom.us/j/96191891476?pwd=2tz9ytsePEazPJeSUGvFIkTCbaUbl3.1
Meeting ID: 961 9189 1476
Passcode: 193995