This data set, from Statistical Methods in Biology (Second Edition), describes the relationship between altitude and tail length. The data points were digitized with Plot Digitizer 2.6.8 and saved to a CSV file.
We can use R to calculate the slope b and intercept a of the regression line directly from the least-squares formulas. This is simple because R carries out arithmetic on whole vectors at once.
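For reference, the least-squares estimates can be written as follows, using the same names as the R code (x̄ and ȳ are the sample means, b the slope, a the intercept). This is the standard textbook form; the book's own notation may differ.

```latex
\[
b = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
\]
```

The numerator and denominator of b correspond to sum(n) and sum(d) in the calculation that follows.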
x <- data$Altitude
y <- data$Tail
xbar <- mean(x)
ybar <- mean(y)
xbar
[1] 1.37244
ybar
[1] 41.53142
x1 <- x - xbar
y1 <- y - ybar
n <- x1 * y1
d <- x1^2
sum(n)
[1] 127.0051
sum(d)
[1] 35.41282
b <- sum(n) / sum(d)
b
[1] 3.586416
a <- ybar - b * xbar
a
[1] 36.60928
Alternatively, you can use R's built-in linear model function, lm(). It returns a model object that contains the residuals, and a full set of diagnostic plots can be drawn from it.
model <- lm(Tail ~ Altitude, data)
summary(model)
Call:
lm(formula = Tail ~ Altitude, data = data)

Residuals:
    Min      1Q  Median      3Q     Max
-9.9688 -2.4378 -0.0019  2.4307  9.3799

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  36.6093     0.9946  36.809  < 2e-16 ***
Altitude      3.5864     0.6434   5.574 4.64e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.829 on 68 degrees of freedom
Multiple R-squared:  0.3136,  Adjusted R-squared:  0.3035
F-statistic: 31.07 on 1 and 68 DF,  p-value: 4.64e-07
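The estimates in the summary (36.6093 and 3.5864) agree with the hand calculation. The same check can be scripted rather than compared by eye. The sketch below is only an illustration: it uses simulated stand-in data (the data frame sim and its values are assumptions, not the book's measurements), so the numbers it produces will differ from those shown here.

```r
# Simulated stand-in data: these values are NOT the book's measurements.
set.seed(1)
sim <- data.frame(Altitude = runif(70, 0, 3))
sim$Tail <- 36 + 3.5 * sim$Altitude + rnorm(70, sd = 4)

# Slope and intercept from the sums of cross-products and squares
x <- sim$Altitude
y <- sim$Tail
b <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a <- mean(y) - b * mean(x)

# The same estimates from lm(); unname() drops the coefficient labels
fit <- lm(Tail ~ Altitude, data = sim)
all.equal(unname(coef(fit)), c(a, b))
```

all.equal() compares within a small numerical tolerance, which is the appropriate test here because lm() reaches the same answer by a different numerical route.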
plot(model, 1)
plot(model, 2)
plot(model, 3)
plot(model, 4)
plot(model, 5)
plot(model, 6)
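The numbers 1 to 6 select, in order: residuals against fitted values, the normal Q-Q plot, the scale-location plot, Cook's distance, residuals against leverage, and Cook's distance against leverage. If you prefer to see them side by side, the which argument of plot() accepts a vector. The sketch below assumes simulated stand-in data (sim is hypothetical, not the book's data) so that it runs on its own.

```r
# Simulated stand-in data: these values are NOT the book's measurements.
set.seed(1)
sim <- data.frame(Altitude = runif(70, 0, 3))
sim$Tail <- 36 + 3.5 * sim$Altitude + rnorm(70, sd = 4)
fit <- lm(Tail ~ Altitude, data = sim)

# Draw all six diagnostic plots in a 2 x 3 grid, then restore the layout
old_par <- par(mfrow = c(2, 3))
plot(fit, which = 1:6)
par(old_par)
```

Saving and restoring the par() settings keeps later plots from being drawn into the same grid.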
More Diagnostic Plots
You can also draw extra diagnostic plots if you wish, for example plots of the residuals, the standardised residuals, and the leverage.
plot(model$residuals, main="Plot of the Residuals", ylab="Residual (mm)")
res <- model$residuals
m1 <- mean(res)
s1 <- sd(res)
std_res <- (res - m1) / s1
plot(std_res, main="Plot of the Standardised Residuals", ylab="Standard Deviations")
lev <- hatvalues(model)
plot(lev, main="Plot of the Leverage", ylab="Leverage")
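The standardised residuals plotted above are plain z-scores: every residual is divided by the same standard deviation. R also provides rstandard(), which divides each residual by its own estimated standard deviation, so points with high leverage are scaled differently. The two versions are usually close but not identical. The sketch below shows the relationship on simulated stand-in data (sim and the names z_res and by_hand are assumptions for illustration, not part of the original analysis).

```r
# Simulated stand-in data: these values are NOT the book's measurements.
set.seed(1)
sim <- data.frame(Altitude = runif(70, 0, 3))
sim$Tail <- 36 + 3.5 * sim$Altitude + rnorm(70, sd = 4)
fit <- lm(Tail ~ Altitude, data = sim)

# z-score version: one common standard deviation for all residuals
res <- residuals(fit)
z_res <- (res - mean(res)) / sd(res)

# rstandard() scales residual i by s * sqrt(1 - h_i), where s is the
# residual standard error and h_i is the leverage of observation i
s <- summary(fit)$sigma
h <- hatvalues(fit)
by_hand <- res / (s * sqrt(1 - h))
all.equal(by_hand, rstandard(fit))

# Largest gap between the two versions
max(abs(z_res - rstandard(fit)))
```

Part of the gap comes from the divisor: sd() uses n - 1, whereas the residual standard error uses the residual degrees of freedom, n - 2 for a single predictor.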