library(s20x)
emu.df = read.table('emu.txt', header = TRUE)
summary(emu.df)
     height          weight
 Min.   :1.009   Min.   : 27.02
 1st Qu.:1.255   1st Qu.: 37.34
 Median :1.519   Median : 45.01
 Mean   :1.451   Mean   : 47.40
 3rd Qu.:1.594   3rd Qu.: 50.83
 Max.   :1.831   Max.   :108.86
plot(weight ~ height, data = emu.df)
abline(a=-15.33, b=43.23, col="red")
The mean emu height in this sample is 1.45 meters, and the mean weight is approximately 47.4 kilograms. There is a moderately strong linear relationship between the height and the weight of an emu, and the scatter is relatively constant. There is one unusual value, with a weight of over 100 kilograms, and a height of about 1.6 meters.
emu.fit = lm(weight ~ height, data = emu.df)
plot(emu.fit, which=1)
modcheck(emu.fit)
summary(emu.fit)
Call:
lm(formula = weight ~ height, data = emu.df)
Residuals:
    Min      1Q  Median      3Q     Max
-12.022  -8.309  -3.633   3.284  53.970
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -15.33      23.05  -0.665   0.5155
height         43.23      15.69   2.756   0.0141 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 15.49 on 16 degrees of freedom
Multiple R-squared: 0.3219, Adjusted R-squared: 0.2796
F-statistic: 7.597 on 1 and 16 DF, p-value: 0.01405
confint(emu.fit)
                 2.5 %   97.5 %
(Intercept) -64.194057 33.53649
height        9.980616 76.48495
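As a hedged cross-check, the slope interval above can be rebuilt by hand as estimate ± t × standard error, using the rounded values printed by summary(emu.fit) and 16 residual degrees of freedom. The variable names are illustrative only, and because the inputs are rounded the result agrees with confint() only to within a couple of hundredths.

```r
# Rounded values copied from the summary(emu.fit) output
slope_est <- 43.23   # estimated slope for height
slope_se  <- 15.69   # standard error of the slope
resid_df  <- 16      # residual degrees of freedom

# Two-sided 95% critical value from the t distribution
t_crit <- qt(0.975, df = resid_df)

# Confidence interval: estimate -/+ t * SE
slope_ci <- slope_est + c(-1, 1) * t_crit * slope_se
round(slope_ci, 2)
```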
plot(weight ~ height, main="Emu height vs weight", data = emu.df)
abline(emu.fit)
# Check for effect of dropping observation 11
emu.fit2 <- lm(weight ~ height, data = emu.df[-11, ])
summary(emu.fit2)
Call:
lm(formula = weight ~ height, data = emu.df[-11, ])
Residuals:
    Min      1Q  Median      3Q     Max
-10.769  -4.606  -1.195   4.491  13.390
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   -3.386      9.907  -0.342 0.737215
height        32.741      6.786   4.825 0.000223 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.591 on 15 degrees of freedom
Multiple R-squared: 0.6082, Adjusted R-squared: 0.582
F-statistic: 23.28 on 1 and 15 DF, p-value: 0.0002227
predheight.df = data.frame(height = 1.5)
predict(emu.fit, predheight.df, interval = "prediction")
       fit      lwr      upr
1 49.52039 15.74109 83.29969
Because the scatter plot showed a roughly linear trend with fairly constant scatter, a simple linear model was fitted. We assume the observations were collected independently. The equality of variance and normality assumptions appear to be met. Observation 11 had a Cook's distance greater than 0.4 in the Cook's distance plot, but when it was removed the estimated slope changed by less than one standard error (from 43.23 to 32.74, against a standard error of 15.69), so the point was judged not to have undue influence and was kept in the final analysis. Our final model is \(weight_i = \beta_0 + \beta_1 \times height_i + \epsilon_i\), where \(\epsilon_i \overset{\mathrm{iid}}{\sim} N(0, \sigma^2)\).
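The influence check described above can be sketched as simple arithmetic on the two fitted models. This is a minimal sketch that uses the rounded coefficients printed in the two summary outputs; the variable names are illustrative and not part of the original analysis.

```r
# Rounded values copied from summary(emu.fit) and summary(emu.fit2)
slope_all     <- 43.23    # slope with all observations
se_all        <- 15.69    # standard error of that slope
slope_dropped <- 32.741   # slope with observation 11 removed

# Change in the slope, measured in standard errors of the full-data fit
change_in_se <- (slope_all - slope_dropped) / se_all
round(change_in_se, 2)

# TRUE when the change is smaller than one standard error
change_in_se < 1
```

On these rounded values the slope moves by roughly two thirds of a standard error, which is the basis for keeping observation 11.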
The aim was to investigate the relationship between the height and weight of emus, and to predict the weight of a specific emu. We found evidence of a positive linear relationship between the height and weight of an emu (p-value = 0.014). We estimate that each additional meter of height is associated with an increase in mean weight of between 10.0 and 76.5 kilograms. For an emu with a height of 1.5 meters, we predict a weight of between 15.7 and 83.3 kilograms. Our model explains 32% of the variation in emu weight.
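As a rough check on the figures quoted here, the point prediction and the R-squared value can be recovered from the rounded summary output alone. This is a sketch under that assumption: the names are illustrative, and the relation R² = F / (F + residual df) applies because the model has a single predictor.

```r
# Rounded values copied from the summary(emu.fit) output
intercept <- -15.33
slope     <- 43.23
f_stat    <- 7.597   # F-statistic on 1 and 16 degrees of freedom
resid_df  <- 16

# Point prediction for an emu 1.5 meters tall
height_new  <- 1.5
weight_pred <- intercept + slope * height_new
weight_pred

# R-squared recovered from the F-statistic (single-predictor model)
r_squared <- f_stat / (f_stat + resid_df)
round(r_squared, 3)
```

The prediction agrees with the fitted value of 49.52 reported by predict() up to rounding of the coefficients, and the recovered R-squared matches the Multiple R-squared of 0.3219 in the summary output.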