STAT 155 Review

COMPREHENSIVE REVIEW

A comprehensive STAT 155 review is provided by Prof. Johnson’s Spring 2022 STAT 155 manual here and the STAT 155 Notes created by Profs. Grinde, Heggeseth, and Myint here.



QUICK REVIEW

Let \(y\) be a response variable with a set of \(k\) explanatory variables \(x = (x_{1}, x_{2}, ..., x_{k})\). Then the population linear regression model is

\[\begin{split} y & = f(x) + \varepsilon = \beta_0 + \beta_1 x_{1} + \beta_2 x_{2} + \cdots + \beta_k x_{k} + \varepsilon \\ \end{split}\]
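To make the roles of \(f(x)\) and \(\varepsilon\) concrete, here is a minimal Python sketch that simulates responses from a population model of this form. The coefficient values, the predictor values, and the choice of normally distributed noise with mean 0 are all assumptions made for illustration; the model above does not specify them.

```python
import random


def simulate_response(beta, x_row, noise_sd, rng):
    """Draw one y from y = beta_0 + beta_1*x_1 + ... + beta_k*x_k + epsilon."""
    # Deterministic part f(x): intercept plus the weighted predictors.
    mean = beta[0] + sum(b * x for b, x in zip(beta[1:], x_row))
    # Random part epsilon. Assumption: normal noise centered at 0.
    epsilon = rng.gauss(0.0, noise_sd)
    return mean + epsilon


rng = random.Random(155)        # fixed seed so the draws are reproducible
beta = [10.0, 3.0, -2.0]        # hypothetical (beta_0, beta_1, beta_2)
x_row = [1.5, 4.0]              # hypothetical (x_1, x_2) for one subject

# Five subjects with identical x still get different y because of epsilon.
draws = [simulate_response(beta, x_row, 1.0, rng) for _ in range(5)]
print(draws)
```

Every draw shares the same \(f(x)\); the spread among them comes entirely from \(\varepsilon\).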

NOTES:




Fitting the Model

Once we have a population model in mind, we can “fit the model” (i.e. estimate the \(\beta\) population coefficients) using sample data:

\[\begin{split} y & = \hat{f}(x) + \varepsilon \\ & = \hat{\beta}_0 + \hat{\beta}_1 x_{1} + \hat{\beta}_2 x_{2} + \cdots + \hat{\beta}_k x_{k} + \varepsilon \\ \end{split}\]


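To this end, collect a sample of data on \(n\) subjects. Use subscripts to denote the data for subject \(i\): \(y_i\) is the response and \(x_{ij}\) is the value of explanatory variable \(j\). Then the predicted response and residual (prediction error) for subject \(i\) are

\[\begin{split} \hat{y}_i & = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \cdots + \hat{\beta}_k x_{ik} \\ y_i - \hat{y}_i & = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \hat{\beta}_2 x_{i2} + \cdots + \hat{\beta}_k x_{ik}) \\ \end{split}\]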
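As a small illustration, the Python sketch below computes a predicted response (the estimated intercept plus each estimated coefficient times its predictor) and the corresponding residual (observed minus predicted) for each subject. The coefficient estimates and the three-subject data set are hypothetical values chosen so the arithmetic is easy to check by hand.

```python
def predict(beta_hat, x_row):
    """Predicted response: beta_hat[0] + sum over j of beta_hat[j] * x_ij."""
    intercept, slopes = beta_hat[0], beta_hat[1:]
    return intercept + sum(b * x for b, x in zip(slopes, x_row))


def residual(y_i, beta_hat, x_row):
    """Residual (prediction error): observed response minus predicted response."""
    return y_i - predict(beta_hat, x_row)


# Hypothetical estimates (beta_hat_0, beta_hat_1, beta_hat_2) and data.
beta_hat = [1.0, 2.0, -0.5]
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 4.0]]   # row i holds (x_i1, x_i2)
y = [2.5, 4.0, 5.5]                        # observed responses y_i

y_hat = [predict(beta_hat, row) for row in X]
residuals = [residual(y_i, beta_hat, row) for y_i, row in zip(y, X)]

print(y_hat)      # [2.0, 5.0, 5.0]
print(residuals)  # [0.5, -1.0, 0.5]
```

A positive residual means the model under-predicted that subject's response; a negative residual means it over-predicted.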



Least Squares Criterion

Estimate \((\beta_0, \beta_1, ..., \beta_k)\) by \((\hat{\beta}_0, \hat{\beta}_1, ..., \hat{\beta}_k)\) that minimize the sum of squared residuals:

\[\sum_{i=1}^n(y_i - \hat{y}_i)^2 = (y_1-\hat{y}_1)^2 + (y_2-\hat{y}_2)^2 + \cdots + (y_n-\hat{y}_n)^2\]
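The least squares criterion can be sketched in plain Python. One standard way to find the minimizing coefficients, which these notes do not spell out, is to solve the normal equations \((D^\top D)\hat{\beta} = D^\top y\), where \(D\) is the data matrix with a leading column of 1s for the intercept. The helper names and the tiny data set below are hypothetical, and this is an illustration rather than how statistical software fits models in practice.

```python
def solve_linear_system(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every row below the pivot row.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def least_squares(X, y):
    """Return [beta_hat_0, ..., beta_hat_k] minimizing the sum of squared residuals."""
    D = [[1.0] + list(row) for row in X]   # prepend 1 for the intercept
    p = len(D[0])
    DtD = [[sum(d[a] * d[b] for d in D) for b in range(p)] for a in range(p)]
    Dty = [sum(d[a] * y_i for d, y_i in zip(D, y)) for a in range(p)]
    return solve_linear_system(DtD, Dty)


def sum_squared_residuals(beta_hat, X, y):
    """Sum over subjects of (y_i - y_hat_i)^2 for the given coefficients."""
    total = 0.0
    for row, y_i in zip(X, y):
        y_hat_i = beta_hat[0] + sum(b * x for b, x in zip(beta_hat[1:], row))
        total += (y_i - y_hat_i) ** 2
    return total


# Hypothetical data: one predictor, four subjects.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [3.1, 4.9, 7.2, 8.8]

beta_hat = least_squares(X, y)
ssr_fit = sum_squared_residuals(beta_hat, X, y)
ssr_other = sum_squared_residuals([1.0, 2.0], X, y)  # some other candidate line

print(beta_hat, ssr_fit, ssr_other)
```

Comparing `ssr_fit` with `ssr_other` shows the criterion at work: any other choice of intercept and slope gives a sum of squared residuals at least as large as the least squares fit.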