Motivating Question
The field of machine learning is most often associated with building predictive models, not inferential models. Specifically, the goal is to build a model that produces good predictions of our response variable. If we have access to a bunch of potential predictors, how can we decide which model to build?
Model Selection Methods
Variable selection: Identify a subset of predictors to use in our model of the response. Methods: best subset selection, backward stepwise selection, forward stepwise selection.
Shrinkage / regularization: Shrink / regularize the coefficients of all predictors toward or to 0. Methods: LASSO, ridge regression, elastic net (a combination of LASSO and ridge).
Dimension reduction: Combine the predictors into a smaller set of new predictors. Methods: principal components regression.
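To make the variable selection idea concrete, here is a minimal sketch of forward stepwise selection in plain Python. Everything in it is an assumption made for illustration: the simulated data (four predictors named x1 to x4, of which only x1 and x3 actually drive the response), the helper names `fit_rss` and `forward_stepwise`, and the use of the residual sum of squares (RSS) on the training data as the criterion for picking the next predictor. It is not tied to any particular library's implementation.

```python
import random

def fit_rss(columns, y):
    """Fit y on an intercept plus the given predictor columns by least squares; return the RSS."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    M = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    beta = [M[a][k] / M[a][a] for a in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def forward_stepwise(X, y, max_steps):
    """X maps predictor name -> list of values.

    Start from the intercept-only model and, at each step, add the single
    predictor that lowers the training RSS the most.
    """
    chosen, path = [], []
    remaining = list(X)
    for _ in range(min(max_steps, len(remaining))):
        best = min(remaining,
                   key=lambda name: fit_rss([X[c] for c in chosen + [name]], y))
        chosen.append(best)
        remaining.remove(best)
        path.append(fit_rss([X[c] for c in chosen], y))
    return chosen, path

# Simulated data (an assumption for this sketch): only x1 and x3 matter.
random.seed(0)
n = 200
X = {name: [random.gauss(0, 1) for _ in range(n)] for name in ["x1", "x2", "x3", "x4"]}
y = [3 * X["x1"][i] - 2 * X["x3"][i] + random.gauss(0, 0.5) for i in range(n)]

chosen, path = forward_stepwise(X, y, max_steps=4)
for size, (name, rss) in enumerate(zip(chosen, path), start=1):
    print(f"{size} predictor(s): added {name}, training RSS = {rss:.1f}")
```

The training RSS can only go down as predictors are added, so it tells us the order in which to add predictors but not where to stop. In practice, the models along the path would be compared with a measure of out-of-sample error, such as cross-validated prediction error.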