van Emmerik Quiz 2

QUESTION 1 According to the boxplot, the least flawed diamonds (IF, internally flawless) have the lowest prices, and the very slightly imperfect diamonds have the highest typical price.

QUESTION 2 mod1=lm(price~clarity, diam)
a. VS2 has the highest predicted price (2694.8+3163.4=$5858.2), and the internally flawless (IF) diamonds have the lowest predicted price, because IF is the reference group and its prediction is just the intercept ($2694.8).
b. This is surprising because one would assume that a flawless diamond would have a higher price than a flawed diamond.
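
As a rough illustration of why the reference group's prediction is just the intercept, the sketch below fits the same kind of one-factor model to a tiny made-up data set. The `toy` data frame and its prices are invented for illustration and are not the Diamonds data. With a single categorical predictor, each group's predicted value is simply that group's mean.

```r
# Toy data, invented for illustration only (not the Diamonds data).
toy <- data.frame(
  clarity = factor(c("IF", "IF", "IF", "VS2", "VS2", "VS2")),
  price   = c(1000, 1200, 1400, 3000, 3500, 4000)
)

# "IF" sorts first alphabetically, so R uses it as the reference level.
fit <- lm(price ~ clarity, data = toy)
b   <- coef(fit)

# Reference group: intercept only. Other group: intercept + its coefficient.
pred_IF  <- unname(b["(Intercept)"])
pred_VS2 <- unname(b["(Intercept)"] + b["clarityVS2"])

# Group means, for comparison with the two predictions above.
group_means <- tapply(toy$price, toy$clarity, mean)
print(c(IF = pred_IF, VS2 = pred_VS2))
print(group_means)
```

The same arithmetic applied to the coefficients quoted in the answer gives 2694.8 + 3163.4 = 5858.2 for VS2.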

QUESTION 3 mod2=lm(price~clarity+carat,diam)
a) No. The intercept is different, and the clarity coefficients are not the same either (they are negative now!).
b) Holding clarity constant, each additional carat increases the predicted price of a diamond by $12226.4.
c) Holding carat constant, a VS2 diamond is on average $1561.9 cheaper than an internally flawless (IF) diamond.
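
Since a full carat is a large step for a diamond, it can help to restate the carat slope on a smaller scale. A minimal sketch of that arithmetic, using the carat coefficient quoted in part b (the 0.1-carat step is an arbitrary choice made here for illustration):

```r
carat_slope <- 12226.4   # carat coefficient quoted in part b
step        <- 0.1       # hypothetical step size, chosen for illustration

# Predicted price change for a 0.1-carat increase, holding clarity constant.
change_per_step <- carat_slope * step
print(change_per_step)
```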

QUESTION 4 When controlling for carat, the internally flawless (IF) diamond has the highest expected price: for a 1-carat diamond, 12226.4-1851.2=$10375.2. A VS2 diamond would have the lowest predicted value: 10375.2-1561.9=$8813.3.
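
The arithmetic in this answer can be sketched as a small prediction function. This is only an illustration under stated assumptions: treating -1851.2 as the mod2 intercept is inferred from the numbers in the answer rather than read from model output, the coefficients for the other clarity grades are not reported and so are left out, and `predict_price` is a hypothetical helper name.

```r
# Coefficients as quoted in the answers.
# Assumption: -1851.2 is the mod2 intercept (inferred from the arithmetic).
b_intercept <- -1851.2
b_carat     <- 12226.4
b_VS2       <- -1561.9

# Hypothetical helper: predicted price for an IF (default) or VS2 diamond.
predict_price <- function(carat, vs2 = FALSE) {
  b_intercept + b_carat * carat + b_VS2 * as.numeric(vs2)
}

price_IF_1ct  <- predict_price(1)              # IF, 1 carat
price_VS2_1ct <- predict_price(1, vs2 = TRUE)  # VS2, 1 carat
print(c(IF = price_IF_1ct, VS2 = price_VS2_1ct))
```

Because the model has no interaction term, the gap between the two grades is the same at every carat value.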

QUESTION 5 Simpson's paradox occurs when the direction of a coefficient on an explanatory variable reverses depending on which other variables are included in the model. In this case, larger-carat diamonds sell at a higher price (people want big diamonds) even though they are more often flawed, so without carat in the model the flawed clarities look more expensive. When accounting for carat, the internally flawless diamonds have the higher expected price.
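
The sign flip can be reproduced on a small made-up data set. In the sketch below, the `toy_sp` data frame and the price formula are invented for illustration (they are not the Diamonds data): the flawless diamonds are all small, the VS2 diamonds are all large, and price rises with carat but drops for VS2.

```r
# Made-up data: IF diamonds are small, VS2 diamonds are large.
toy_sp <- data.frame(
  clarity = factor(rep(c("IF", "VS2"), each = 3)),
  carat   = c(0.2, 0.3, 0.4, 0.8, 0.9, 1.0)
)

# Invented price rule: +10000 per carat, -1500 for VS2, no noise.
toy_sp$price <- 1000 + 10000 * toy_sp$carat - 1500 * (toy_sp$clarity == "VS2")

# VS2 coefficient without and with carat in the model.
without_carat <- unname(coef(lm(price ~ clarity, data = toy_sp))["clarityVS2"])
with_carat    <- unname(coef(lm(price ~ clarity + carat, data = toy_sp))["clarityVS2"])

print(c(without_carat = without_carat, with_carat = with_carat))
```

Without carat, VS2 looks more expensive than IF because it is standing in for size. Once carat is in the model, the VS2 coefficient turns negative.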

# Load the diamonds data and compare price distributions across clarity grades
diam = read.csv("http://www.macalester.edu/~ajohns24/data/Diamonds.csv")
boxplot(price ~ clarity, diam)

[Figure: boxplot of price by clarity]

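# mod1: price by clarity alone; mod2: price by clarity, adjusting for carat
mod1 = lm(price ~ clarity, diam)
mod2 = lm(price ~ clarity + carat, diam)
# Cross-tabulate carat by clarity to see how diamond size varies across clarity grades
table(diam$carat, diam$clarity)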
##       
##        IF VS1 VS2 VVS1 VVS2
##   0.18  3   0   0    1    2
##   0.19  4   0   0    2    2
##   0.2   1   2   1    0    0
##   0.21  3   1   0    0    0
##   0.22  1   0   0    0    0
##   0.23  3   0   0    0    0
##   0.24  1   0   0    0    0
##   0.25  4   0   0    0    0
##   0.26  2   0   0    1    1
##   0.27  2   0   0    0    0
##   0.28  1   0   0    0    0
##   0.29  2   0   0    0    0
##   0.3   1   2   1    2    3
##   0.31  1   4   1    1    2
##   0.32  1   1   1    0    0
##   0.33  1   0   2    0    0
##   0.34  0   4   2    1    1
##   0.35  0   3   1    1    1
##   0.36  0   1   0    0    1
##   0.37  0   1   1    0    0
##   0.4   1   2   0    0    0
##   0.41  0   1   0    1    1
##   0.43  0   0   0    0    1
##   0.45  0   1   0    0    0
##   0.46  0   0   0    0    1
##   0.47  0   0   0    0    1
##   0.48  0   1   0    0    1
##   0.5   1   5   0    3    1
##   0.51  0   1   0    3    2
##   0.52  1   2   2    1    1
##   0.53  0   2   0    1    3
##   0.54  0   1   0    1    0
##   0.55  1   1   0    0    4
##   0.56  0   3   1    0    3
##   0.57  0   0   1    2    1
##   0.58  1   0   0    3    0
##   0.59  0   0   0    0    1
##   0.6   1   2   1    1    0
##   0.61  0   0   0    0    1
##   0.62  0   0   0    1    1
##   0.63  1   0   0    0    1
##   0.64  0   0   0    1    1
##   0.65  0   0   0    0    1
##   0.66  0   0   0    2    0
##   0.7   0   4   4    4    5
##   0.71  1   4   1    0    4
##   0.72  0   1   1    2    0
##   0.73  0   3   2    2    0
##   0.74  0   0   1    1    1
##   0.75  0   1   0    0    2
##   0.76  1   0   0    0    1
##   0.77  0   0   0    1    0
##   0.78  0   1   0    0    1
##   0.8   1   2   1    0    3
##   0.81  1   2   1    1    0
##   0.82  0   0   2    0    1
##   0.83  0   0   1    0    0
##   0.84  0   0   1    0    0
##   0.85  0   0   2    2    0
##   0.86  0   0   1    0    1
##   0.89  0   1   0    0    0
##   0.9   0   1   0    0    1
##   1     1   8   9    6    8
##   1.01  0   9   6    4    5
##   1.02  0   1   2    0    2
##   1.03  0   1   0    0    0
##   1.04  1   1   0    0    0
##   1.05  0   0   0    0    1
##   1.06  0   0   2    0    1
##   1.07  0   0   0    0    1
##   1.09  0   0   0    0    1
##   1.1   0   0   1    0    0