  1. Write out the form of the model. Also identify which of the variables are positively associated when controlling for other variables.

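$$\log\left(\frac{\hat{p}}{1 - \hat{p}}\right) = 33.5095 - 1.4207 \times \text{sex\_male} - 0.2787 \times \text{skull\_width} + 0.5687 \times \text{total\_length} - 1.8057 \times \text{tail\_length}$$

Here $\hat{p}$ is the estimated probability that a possum is from Victoria. Only total_length has a positive coefficient, so it is the only variable positively associated with a possum being from Victoria when the other variables are held fixed. sex_male, skull_width, and tail_length all have negative coefficients.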
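
The direction of each association can be read from the coefficient signs in the model above. The following is a minimal sketch in R, assuming the coefficients are typed in by hand from the fitted model; the names `coefs`, `slopes`, `positive_predictors`, and `odds_ratios` are illustrative and not part of the original analysis.

```r
# Coefficients copied by hand from the reduced model written out above
coefs <- c(
  intercept    = 33.5095,
  sex_male     = -1.4207,
  skull_width  = -0.2787,
  total_length = 0.5687,
  tail_length  = -1.8057
)

# Drop the intercept, since it is not a predictor
slopes <- coefs[names(coefs) != "intercept"]

# A positive coefficient raises the log-odds of being from Victoria,
# holding the other predictors fixed
positive_predictors <- names(slopes)[slopes > 0]
positive_predictors

# exp(coefficient) is the multiplicative change in the odds
# for a one-unit increase in that predictor
odds_ratios <- exp(slopes)
round(odds_ratios, 3)
```

An odds ratio above 1 corresponds to a positive association and one below 1 to a negative association, so the two views should always agree.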

  2. Suppose we see a brushtail possum at a zoo in the US, and a sign says the possum had been captured in the wild in Australia, but it doesn’t say which part of Australia. However, the sign does indicate that the possum is male, its skull is about 63 mm wide, its tail is 37 cm long, and its total length is 83 cm. What is the reduced model’s computed probability that this possum is from Victoria? How confident are you in the model’s accuracy of this probability calculation?
sex_male <- 1
skull_width <- 63
tail_length <- 37
total_length <- 83

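# Linear predictor (log-odds); named log_odds to avoid masking base::log()
log_odds <- 33.5095 - 1.4207 * sex_male - 0.2787 * skull_width + 0.5687 * total_length - 1.8057 * tail_length
log_odds
## [1] -5.0781
# Inverse logit: convert the log-odds to a probability
p <- exp(log_odds) / (1 + exp(log_odds))
p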
## [1] 0.006193144
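
As a cross-check, the same calculation can be sketched with named vectors and base R's `plogis()`, which computes the inverse logit `exp(x) / (1 + exp(x))`. This is only an illustration under the assumption that the coefficients are entered by hand; the names `possum`, `slopes`, and `p_victoria` are hypothetical and not from the original analysis.

```r
# Measurements taken from the zoo sign
possum <- c(sex_male = 1, skull_width = 63, total_length = 83, tail_length = 37)

# Coefficients copied by hand from the reduced model
intercept <- 33.5095
slopes <- c(
  sex_male     = -1.4207,
  skull_width  = -0.2787,
  total_length = 0.5687,
  tail_length  = -1.8057
)

# Linear predictor: intercept plus sum of coefficient * value, matched by name
log_odds <- intercept + sum(slopes * possum[names(slopes)])

# plogis() is the logistic CDF, i.e. the inverse of the logit link
p_victoria <- plogis(log_odds)
p_victoria
```

Matching the measurements to the coefficients by name guards against pairing a value with the wrong coefficient, which is easy to do when the formula is typed out term by term.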

The model gives a very low probability (about 0.0062, or 0.6%) that the possum is from Victoria. However, the model was fit to a random sample of wild possums, and we don’t know the age at which this possum was captured. If it was captured young, its growth may have been affected by life in captivity, which would make the model a poor predictor for it. It is also possible that something about this possum made it especially easy to capture or more desirable to the trapper, in which case it may not be representative of the sampled population. Because the probability is so low, I would still tend to believe that the possum is not from Victoria, but these additional factors limit my confidence in the exact figure.