Learning Log 5

Today we covered the concept of Multiple Linear Regression. This takes simple linear regression a step further by removing the limit on the number of predictor variables you can have. The equation for multiple linear regression has the intercept (beta 0) followed by each predictor accompanied by its regression parameter (beta 1, beta 2, …, beta k), for up to k predictor variables.
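Written out, the model looks like this (the error term at the end is my addition, since the full model includes it even though I only listed the betas above):

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon
```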

We then covered the Sum of Squared Residuals (SSE) and Mean Squared Error (MSE) for this new model, along with the Multiple Coefficient of Determination (R^2). R^2 is found by dividing the explained variation by the total variation. The R^2 value of a model never decreases with the addition of predictors, since a new predictor can only keep the explained variation the same or increase it.
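Here is a minimal R sketch (the data are simulated by me, not from class) showing how these quantities can be computed by hand and checked against what lm() reports:

```r
set.seed(42)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)   # made-up data: one predictor plus noise
fit <- lm(y ~ x)

sse <- sum(resid(fit)^2)               # sum of squared residuals (SSE)
mse <- sse / df.residual(fit)          # MSE = SSE / (n - k - 1)
r2  <- 1 - sse / sum((y - mean(y))^2)  # explained variation / total variation

c(SSE = sse, MSE = mse, R2 = r2)
summary(fit)$r.squared                 # should match the R^2 computed by hand
```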

When covering the F-statistic, the question was asked whether one good predictor gives a smaller F-value than one good predictor plus 20 bad predictors. It was concluded that the F-value for the single good predictor would actually be larger: the bad predictors add almost nothing to the explained variation, but they increase df1 (so the explained variation gets divided over more degrees of freedom) and decrease df2. This leaves the model with 21 predictors with the smaller F-value.
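A quick simulation in R (my own made-up data, not the class example) can show this:

```r
set.seed(1)
n  <- 100
x1 <- rnorm(n)                          # the one good predictor
y  <- 2 + 3 * x1 + rnorm(n)             # response depends only on x1
noise <- matrix(rnorm(n * 20), n, 20)   # 20 bad (pure noise) predictors
colnames(noise) <- paste0("z", 1:20)
dat <- data.frame(y, x1, noise)

fit1  <- lm(y ~ x1, data = dat)   # model with 1 predictor
fit21 <- lm(y ~ .,  data = dat)   # model with all 21 predictors

summary(fit1)$fstatistic    # large F on (1, 98) degrees of freedom
summary(fit21)$fstatistic   # much smaller F on (21, 78) degrees of freedom
```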

Finally we talked about indicator functions, which are basically the way to handle qualitative predictors. With a two-category predictor, one category is assigned the value 0 and the other the value 1. This gives the model a way to differentiate between the two options of the qualitative variable.
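A small made-up example of how R builds that 0/1 indicator behind the scenes when you give it a factor:

```r
dat <- data.frame(
  y     = c(4.1, 5.3, 6.0, 7.2),            # invented response values
  group = factor(c("A", "A", "B", "B"))     # two-category qualitative predictor
)

model.matrix(y ~ group, data = dat)  # the "groupB" column is the 0/1 indicator
lm(y ~ group, data = dat)            # groupB coefficient = shift for category B
```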

The most helpful example for me was the zebra mussels example. It was a good introduction to reading R output and understanding what the values mean.