Lab 2 Brittany Landorf

1) The boxplot shows an unexpected relationship between price and clarity: the internally flawless (IF) diamonds have the lowest typical price, while the very slightly imperfect diamonds have the highest.

2) A) Internally flawless (IF) diamonds have the lowest predicted price, $2,694.80 (the intercept). VS2 diamonds have the highest predicted price, about $5,858 (2,695 + 3,163 using the rounded coefficients). B) These results are surprising because the more flawed diamonds have a higher estimated price, whereas one would expect IF diamonds to have the highest price.
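The arithmetic behind answer 2 can be sketched in a few lines. This is only an illustration: the numbers are typed in by hand from the rounded summary(mod) output in this report, and the variable names (intercept, offset, predicted) are made up for the example.

```r
# Coefficients copied by hand from the rounded summary(mod) output.
# IF is the baseline level, so its offset from the intercept is 0.
intercept <- 2695
offset <- c(IF = 0, VS1 = 2362, VS2 = 3163, VVS1 = 2873, VVS2 = 2662)

# With one categorical predictor, each group's prediction is
# intercept + that group's coefficient.
predicted <- intercept + offset
print(predicted)

lowest  <- names(which.min(predicted))
highest <- names(which.max(predicted))
cat("Lowest predicted price: ", lowest,  "\n")
cat("Highest predicted price:", highest, "\n")
```

Because the model has no other predictors, these predictions are the average price within each clarity group.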

3) A) The intercept and the clarity coefficients are different from the previous model. The clarity coefficients are now negative rather than positive, and the intercept is negative as well. B) Holding clarity constant, each additional carat is associated with a $12,226.40 increase in the predicted price of a diamond. C) Holding carat constant, a VS2 diamond is predicted to cost $1,561.90 less than an internally flawless (IF) diamond of the same size.

4) When controlling for carat (holding it constant), IF has the highest expected price. For a 1-carat diamond: 12,226.40 - 1,851.20 = $10,375.20. The lowest expected price is for VS2: 12,226.40 - 1,851.20 - 1,561.90 = $8,813.30.
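The same comparison can be made for all five clarity levels at once. The sketch below assumes the rounded coefficients from the summary(mod2) output, so results differ from the hand calculation by a few cents. predict_price is a hypothetical helper written for this example, not a function from mosaic or base R.

```r
# Coefficients copied by hand from the rounded summary(mod2) output.
b0      <- -1851
b_carat <- 12226
offset2 <- c(IF = 0, VS1 = -1001, VS2 = -1562, VVS1 = -404, VVS2 = -959)

# Hypothetical helper: predicted price for a given clarity level and carat.
predict_price <- function(clarity, carat) {
  unname(b0 + offset2[clarity] + b_carat * carat)
}

# Predicted prices for a 1-carat diamond at every clarity level
one_carat <- sapply(names(offset2), predict_price, carat = 1)
print(sort(one_carat, decreasing = TRUE))
```

Since mod2 has no interaction between clarity and carat, the ordering of the clarity levels is the same at any carat value; only the price level shifts.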

5) The relationship between price and clarity changes once carat is considered, which shows that size is related to both the price and the clarity of diamonds. Although IF diamonds cost more than other diamonds of the same size, they are generally smaller. Larger diamonds tend to be more flawed, but their size makes them more likely to have a high price. As the table shows (table(diam$carat, diam$clarity)), the smaller diamonds are more likely to be internally flawless and the larger diamonds are more likely to be flawed. Without taking carat into account, the relationship between clarity and average price appears reversed.
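A small made-up example shows how this reversal can happen. The toy data frame below is synthetic and is not the Diamonds data: its prices are built so that, at any given size, VS2 costs $1,500 less than IF, but the VS2 stones are much larger. All names and numbers are assumptions for illustration.

```r
# Synthetic data (NOT the Diamonds data): small IF stones, large VS2 stones.
toy <- data.frame(
  clarity = factor(rep(c("IF", "VS2"), each = 4)),
  carat   = c(0.2, 0.2, 0.3, 0.3, 0.9, 0.9, 1.0, 1.0)
)

# Price rises $12,000 per carat; VS2 is $1,500 cheaper at the same size.
toy$price <- 12000 * toy$carat - 1500 * (toy$clarity == "VS2") - 1800

# Ignoring carat: the VS2 coefficient is positive (VS2 looks more expensive).
marginal <- coef(lm(price ~ clarity, data = toy))

# Controlling for carat: the VS2 coefficient is negative, as it was built.
adjusted <- coef(lm(price ~ clarity + carat, data = toy))

print(marginal)
print(adjusted)
```

The sign of the clarityVS2 coefficient flips between the two fits, which is the same pattern seen between mod and mod2.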

library(mosaic)

## Loading required package: grid
## Loading required package: lattice
## 
## Attaching package: 'mosaic'
## 
## The following objects are masked from 'package:stats':
## 
## D, IQR, binom.test, cor, cov, fivenum, median, prop.test, sd, t.test, var
## 
## The following objects are masked from 'package:base':
## 
## max, mean, min, print, prod, range, sample, sum

diam = read.csv("http://www.macalester.edu/~ajohns24/data/Diamonds.csv")
boxplot(price ~ clarity, diam)

[Figure: boxplots of price by clarity]

mod = lm(price ~ clarity, diam)
summary(mod)

## 
## Call:
## lm(formula = price ~ clarity, data = diam)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -5220  -1940   -991   2063  11218 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     2695        494    5.45  1.0e-07 ***
## clarityVS1      2362        614    3.85  0.00015 ***
## clarityVS2      3163        668    4.73  3.4e-06 ***
## clarityVVS1     2873        671    4.28  2.5e-05 ***
## clarityVVS2     2662        618    4.31  2.2e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3280 on 303 degrees of freedom
## Multiple R-squared:  0.0843, Adjusted R-squared:  0.0722 
## F-statistic: 6.97 on 4 and 303 DF,  p-value: 2.22e-05

mod2 = lm(price ~ clarity + carat, diam)
summary(mod2)

## 
## Call:
## lm(formula = price ~ clarity + carat, data = diam)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -1962   -584    -63    435   5914 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -1851        178  -10.43  < 2e-16 ***
## clarityVS1     -1001        203   -4.93  1.3e-06 ***
## clarityVS2     -1562        228   -6.84  4.3e-11 ***
## clarityVVS1     -404        220   -1.84    0.067 .  
## clarityVVS2     -959        206   -4.66  4.8e-06 ***
## carat          12226        232   52.66  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1030 on 302 degrees of freedom
## Multiple R-squared:  0.91,   Adjusted R-squared:  0.909 
## F-statistic:  611 on 5 and 302 DF,  p-value: <2e-16

table(diam$carat, diam$clarity)

##       
##        IF VS1 VS2 VVS1 VVS2
##   0.18  3   0   0    1    2
##   0.19  4   0   0    2    2
##   0.2   1   2   1    0    0
##   0.21  3   1   0    0    0
##   0.22  1   0   0    0    0
##   0.23  3   0   0    0    0
##   0.24  1   0   0    0    0
##   0.25  4   0   0    0    0
##   0.26  2   0   0    1    1
##   0.27  2   0   0    0    0
##   0.28  1   0   0    0    0
##   0.29  2   0   0    0    0
##   0.3   1   2   1    2    3
##   0.31  1   4   1    1    2
##   0.32  1   1   1    0    0
##   0.33  1   0   2    0    0
##   0.34  0   4   2    1    1
##   0.35  0   3   1    1    1
##   0.36  0   1   0    0    1
##   0.37  0   1   1    0    0
##   0.4   1   2   0    0    0
##   0.41  0   1   0    1    1
##   0.43  0   0   0    0    1
##   0.45  0   1   0    0    0
##   0.46  0   0   0    0    1
##   0.47  0   0   0    0    1
##   0.48  0   1   0    0    1
##   0.5   1   5   0    3    1
##   0.51  0   1   0    3    2
##   0.52  1   2   2    1    1
##   0.53  0   2   0    1    3
##   0.54  0   1   0    1    0
##   0.55  1   1   0    0    4
##   0.56  0   3   1    0    3
##   0.57  0   0   1    2    1
##   0.58  1   0   0    3    0
##   0.59  0   0   0    0    1
##   0.6   1   2   1    1    0
##   0.61  0   0   0    0    1
##   0.62  0   0   0    1    1
##   0.63  1   0   0    0    1
##   0.64  0   0   0    1    1
##   0.65  0   0   0    0    1
##   0.66  0   0   0    2    0
##   0.7   0   4   4    4    5
##   0.71  1   4   1    0    4
##   0.72  0   1   1    2    0
##   0.73  0   3   2    2    0
##   0.74  0   0   1    1    1
##   0.75  0   1   0    0    2
##   0.76  1   0   0    0    1
##   0.77  0   0   0    1    0
##   0.78  0   1   0    0    1
##   0.8   1   2   1    0    3
##   0.81  1   2   1    1    0
##   0.82  0   0   2    0    1
##   0.83  0   0   1    0    0
##   0.84  0   0   1    0    0
##   0.85  0   0   2    2    0
##   0.86  0   0   1    0    1
##   0.89  0   1   0    0    0
##   0.9   0   1   0    0    1
##   1     1   8   9    6    8
##   1.01  0   9   6    4    5
##   1.02  0   1   2    0    2
##   1.03  0   1   0    0    0
##   1.04  1   1   0    0    0
##   1.05  0   0   0    0    1
##   1.06  0   0   2    0    1
##   1.07  0   0   0    0    1
##   1.09  0   0   0    0    1
##   1.1   0   0   1    0    0