Presentation 8.7 - Data 606

8.7 Baby weights, Part IV.

Exercise 8.3 considers a model that predicts a newborn’s weight using several predictors (gestation length, parity, age of mother, height of mother, weight of mother, smoking status of mother). The table below shows the adjusted R-squared for the full model as well as the adjusted R-squared values for all models we evaluate in the first step of the backward elimination process.

Which, if any, variable should be removed from the model first?

The Answer:

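library(knitr)

# Models compared in the first step of backward elimination
Model <- c("Full_model", "No_gestation", "No_parity", "No_age", "No_height", "No_weight", "No_smoking_status")
# Adjusted R-squared of each model, in the same order as Model
R_squared <- c(0.2541, 0.1031, 0.2492, 0.2547, 0.2311, 0.2536, 0.2072)

df <- data.frame(Model, R_squared)

# Sort from highest to lowest adjusted R-squared and print as a table
kable(df[order(df$R_squared, decreasing = TRUE), ])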

	Model	R_squared
4	No_age	0.2547
1	Full_model	0.2541
6	No_weight	0.2536
3	No_parity	0.2492
5	No_height	0.2311
7	No_smoking_status	0.2072
2	No_gestation	0.1031

Six of the seven models have an adjusted \(R^2\) between 20% and 25.5% (\(20\% < R^2_{adj} < 25.5\%\)); the model without gestation is the exception at about 10%, so dropping gestation length would cost the model the most.
In backward elimination, each reduced model is compared with the full model, and the variable whose removal gives the largest increase in \(R^2_{adj}\) is dropped. If no reduced model beats the full model, nothing is removed.
Only the “No age” model (\(R^2_{adj} = 0.2547\)) has a higher adjusted \(R^2\) than the full model (\(R^2_{adj} = 0.2541\)), so the mother’s age is the variable that should be removed from the model first.
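
The selection rule can also be checked in code. The following is a minimal sketch, not part of the original answer: the names adj_r2, full, reduced, best, drop_candidate and gain are introduced here for illustration, and the values are copied from the table above. It picks the reduced model with the largest adjusted \(R^2\) and reports a variable only if that model beats the full model.

```r
# Adjusted R-squared values copied from the table above
# (all object names in this sketch are illustrative assumptions).
adj_r2 <- c(Full_model = 0.2541, No_gestation = 0.1031, No_parity = 0.2492,
            No_age = 0.2547, No_height = 0.2311, No_weight = 0.2536,
            No_smoking_status = 0.2072)

full <- adj_r2[["Full_model"]]
reduced <- adj_r2[names(adj_r2) != "Full_model"]

# Reduced model with the largest adjusted R-squared
best <- names(reduced)[which.max(reduced)]

# Drop a variable only if its removal improves on the full model
drop_candidate <- if (reduced[[best]] > full) sub("^No_", "", best) else NA_character_

# Size of the improvement over the full model
gain <- reduced[[best]] - full

print(drop_candidate)
print(round(gain, 4))
```

With these values the sketch selects age, with a gain of 0.0006 in adjusted \(R^2\). The improvement is small, so the next elimination step would need the adjusted \(R^2\) values of the models refit without age, which are not given here.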

Ohannes (Hovig) Ohannessian

5/1/2018
