A researcher was interested to see whether the bird density at a particular site can be explained by the total amount of foliage at that site. Data was collected at a random sample of 17 different California oak woodland sites in spring, during the bird breeding season.
The data is stored in the file Birds.csv and contains the variables:
| Variable | Description |
|---|---|
| Foliage | An approximate measure of the total amount of foliage at a site. The units are called f.p. units (foliage profile units). The higher the f.p., the greater the amount of foliage. |
| Density | A measure of the bird population density. It is simply the number of pairs of birds per hectare. |
We wish to investigate the relationship between bird density and the total amount of foliage at California oak woodland sites.
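library(s20x)  # assumption: modelcheck() is the diagnostic helper from the s20x package
Birds.df=read.csv("Birds.csv", header=TRUE)
# Scatter plot of the response (Density) against the explanatory variable (Foliage)
plot(Density~Foliage, main="Bird density versus Amount of Foliage",data=Birds.df)
# Fit the simple linear regression, then check its assumptions
Birds.lm=lm(Density~Foliage,data=Birds.df)
modelcheck(Birds.lm)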
summary(Birds.lm)
##
## Call:
## lm(formula = Density ~ Foliage, data = Birds.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1379 -1.3054 0.2054 1.1470 2.9159
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.6178 1.5568 1.039 0.315
## Foliage 0.2560 0.0469 5.458 6.6e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.664 on 15 degrees of freedom
## Multiple R-squared: 0.6651, Adjusted R-squared: 0.6428
## F-statistic: 29.79 on 1 and 15 DF, p-value: 6.601e-05
confint(Birds.lm)
## 2.5 % 97.5 %
## (Intercept) -1.7003172 4.9359358
## Foliage 0.1560233 0.3559404
confint(Birds.lm)*5
## 2.5 % 97.5 %
## (Intercept) -8.5015861 24.679679
## Foliage 0.7801166 1.779702
plot(Density~Foliage, main="Bird density versus Amount of Foliage",data=Birds.df)
abline(Birds.lm, col = "blue")
Since the scatter plot shows a linear relationship, we have fitted a simple linear regression model to our data. The sites are a random sample, so we can assume the observations are independent of each other. The residuals show patternless scatter with fairly constant variability, so there are no problems with the linearity or equal-variance assumptions. The normality checks don’t show any major problems, and the Cook’s distance plot doesn’t reveal any unduly influential points. Overall, the model assumptions appear to be satisfied.
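The checks described above can also be sketched with base R functions, which may help when a single diagnostic helper is not available. This is only an illustration: the data below are simulated stand-ins (not the real Birds.csv values, and the Foliage range is an arbitrary assumption), and the names `sim.df`, `sim.lm` and `cooks` are hypothetical.

```r
# Simulated stand-in data (NOT the real Birds.csv values): 17 sites with a
# roughly linear trend, used only so that this sketch runs on its own.
set.seed(1)
sim.df <- data.frame(Foliage = runif(17, min = 10, max = 50))
sim.df$Density <- 1.6 + 0.26 * sim.df$Foliage + rnorm(17, sd = 1.7)
sim.lm <- lm(Density ~ Foliage, data = sim.df)

# Residuals versus fitted values: look for patternless scatter and constant spread
plot(fitted(sim.lm), resid(sim.lm), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Normality of the residuals: points should lie close to the reference line
qqnorm(resid(sim.lm))
qqline(resid(sim.lm))

# Influence: one Cook's distance per site; unusually large values flag
# observations that have a strong effect on the fitted line
cooks <- cooks.distance(sim.lm)
plot(cooks, type = "h", xlab = "Observation", ylab = "Cook's distance")
```

Replacing `sim.lm` with the fitted model for the real data would give the corresponding plots for the bird study.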
Our model is:
\(Density_i=\beta_0 + \beta_1 \times Foliage_i+\epsilon_i\) where \(\epsilon_i \sim iid ~ N(0,\sigma^2)\)
Our model explains 67% of the variation in the response variable (Multiple R-squared = 0.6651).
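The summary output reports two R-squared figures, and a little arithmetic shows how they relate. The sketch below is illustrative only: it uses numbers copied from the printed output, the standard adjustment formula, and the fact that in a simple linear regression R-squared can be recovered from the F-statistic. The variable names are our own, not part of the original analysis.

```r
# Values copied from the summary(Birds.lm) output
r2       <- 0.6651  # Multiple R-squared: proportion of variation explained
f_stat   <- 29.79   # F-statistic on 1 and 15 degrees of freedom
n        <- 17      # number of sites
p        <- 1       # number of explanatory variables
df_resid <- n - p - 1

# With one explanatory variable, R-squared = F / (F + residual df)
r2_from_f <- f_stat / (f_stat + df_resid)
round(r2_from_f, 4)  # 0.6651

# Adjusted R-squared penalises the number of explanatory variables
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round(adj_r2, 4)     # 0.6428
```

The Multiple R-squared (about 67%) is the proportion of variation in Density explained by the model; the adjusted figure (about 64%) is the same quantity after a penalty for model size.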
We are interested in whether the total amount of foliage at California oak woodland sites can be used to explain bird density.
There is strong evidence that the total amount of foliage at California oak woodland sites can be used to explain bird density (p-value < 0.001). For a 5 unit increase in foliage profile units, we estimate that the mean density increases by between 0.78 and 1.78 pairs of birds per hectare.
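To see where these numbers come from, the interval for the slope can be rebuilt by hand from the estimate and standard error in the summary output, then scaled to a 5 unit change. This is a minimal sketch using values copied from the printed output; the variable names are our own, and small differences from `confint()` are due to rounding of the printed values.

```r
# Values copied from the Foliage row of the summary(Birds.lm) output
estimate  <- 0.2560  # estimated slope
std_error <- 0.0469  # standard error of the slope
t_value   <- 5.458   # t statistic for the slope
df_resid  <- 15      # residual degrees of freedom

# Two-sided p-value for the slope from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)
p_value < 0.001           # TRUE

# 95% confidence interval: estimate +/- t multiplier * standard error
t_mult <- qt(0.975, df = df_resid)
ci_per_unit <- estimate + c(-1, 1) * t_mult * std_error
round(ci_per_unit, 3)     # 0.156 0.356

# The effect of a 5 unit increase is 5 times the per-unit effect,
# so both interval limits are simply multiplied by 5
ci_per_5_units <- 5 * ci_per_unit
round(ci_per_5_units, 2)  # 0.78 1.78
```

Scaling the interval is valid here because multiplying the slope by a constant multiplies both of its confidence limits by the same constant. The same scaling should not be read into the intercept row of `confint(Birds.lm)*5`, which has no useful interpretation.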
1.3 Comment on the plots
Looking at this plot, there is a clear linear relationship where, as Foliage increases, Density also increases. The scatter appears to be fairly constant and there do not appear to be any unusual data points.