1 Question 1

A researcher was interested to see whether the bird density at a particular site can be explained by the total amount of foliage at that site. Data was collected at a random sample of 17 different California oak woodland sites in spring, during the bird breeding season.

The data is stored in the file Birds.csv and contains the variables:

Variable   Description
Foliage    An approximate measure of the total amount of foliage at a site. The units are called f.p. units (foliage profile units). The higher the f.p., the greater the amount of foliage.
Density    A measure of the bird population density. It is simply the number of pairs of birds per hectare.

1.1 Question of interest/goal of the study

We wish to investigate the relationship between bird density and the total amount of foliage at California oak woodland sites.

1.2 Read in and inspect the data:

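# Read the data, then plot the response (Density) against the explanatory variable (Foliage)
Birds.df <- read.csv("Birds.csv", header = TRUE)
plot(Density ~ Foliage, main = "Bird density versus Amount of Foliage", data = Birds.df)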

1.3 Comment on the plots

The plot shows a positive, roughly linear association between Foliage and Density: as the amount of foliage increases, bird density also tends to increase.

1.4 Fit an appropriate linear model, including model checks and relevant output.

library(s20x)  # provides modelcheck()
Birds.lm <- lm(Density ~ Foliage, data = Birds.df)
modelcheck(Birds.lm)

summary(Birds.lm)
## 
## Call:
## lm(formula = Density ~ Foliage, data = Birds.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1379 -1.3054  0.2054  1.1470  2.9159 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   1.6178     1.5568   1.039    0.315    
## Foliage       0.2560     0.0469   5.458  6.6e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.664 on 15 degrees of freedom
## Multiple R-squared:  0.6651, Adjusted R-squared:  0.6428 
## F-statistic: 29.79 on 1 and 15 DF,  p-value: 6.601e-05
confint(Birds.lm)
##                  2.5 %    97.5 %
## (Intercept) -1.7003172 4.9359358
## Foliage      0.1560233 0.3559404
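
The Foliage row of the output above can be reproduced by hand, which helps show where the t value, p-value and confidence interval come from. The sketch below is illustrative only: it retypes the rounded estimate and standard error from the summary table, so it agrees with R's own output only up to rounding, and the variable names are not part of the original analysis.

```r
# Values copied (rounded) from the summary(Birds.lm) output
est <- 0.2560   # slope estimate for Foliage
se  <- 0.0469   # standard error of the slope
df  <- 15       # residual degrees of freedom (17 sites - 2 coefficients)

# t statistic for the null hypothesis that the slope is zero
t_value <- est / se

# Two-sided p-value from the t distribution with 15 degrees of freedom
p_value <- 2 * pt(-abs(t_value), df)

# 95% confidence interval: estimate +/- t multiplier * standard error
t_mult <- qt(0.975, df)
ci <- est + c(-1, 1) * t_mult * se

round(t_value, 3)   # compare with the "t value" column
signif(p_value, 2)  # compare with the "Pr(>|t|)" column
round(ci, 3)        # compare with the confint() row for Foliage
```

Any small discrepancy with the printed output comes from using the rounded estimate and standard error rather than the values stored in the model object.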

1.5 Create a scatter plot with the fitted line from your model superimposed over it.

plot(Density~Foliage, main="Bird density versus Amount of Foliage",data=Birds.df)

# Superimpose the fitted least-squares line from Birds.lm on the scatter plot
abline(Birds.lm)

1.6 Method and Assumption Checks

Since the scatter plot shows a linear relationship, we have fitted a simple linear regression model to our data. We have a random sample of sites, so we assume the observations are independent of each other. The residuals show patternless scatter with fairly constant variability, so there are no problems with the equal-variance assumption. The normality checks don’t show any major problems, and the Cook’s distance plot doesn’t reveal any unduly influential points. Overall, all the model assumptions are satisfied.
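
The checks described above can also be run one at a time with base R functions. The sketch below is a rough illustration, not the analysis itself: because Birds.csv is not reproduced here, it uses simulated stand-in data (the `sim.df` values, the seed and the object names are made up), and the 0.4 cut-off for Cook's distance is an assumed rule of thumb.

```r
# Simulated stand-in data (NOT the real bird data), same size and variable names
set.seed(1)
sim.df <- data.frame(Foliage = runif(17, min = 10, max = 50))
sim.df$Density <- 1.6 + 0.26 * sim.df$Foliage + rnorm(17, sd = 1.7)

sim.lm <- lm(Density ~ Foliage, data = sim.df)

# 1. Residuals versus fitted values: look for patternless scatter
#    with roughly constant spread around zero
plot(fitted(sim.lm), resid(sim.lm),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# 2. Normality of residuals: points should lie close to the reference line
qqnorm(resid(sim.lm))
qqline(resid(sim.lm))

# 3. Influence: Cook's distance for each site
cooks <- cooks.distance(sim.lm)
plot(cooks, type = "h", xlab = "Observation", ylab = "Cook's distance")
any(cooks > 0.4)  # assumed rule-of-thumb cut-off for an unduly influential point
```

Independence is not something these plots can confirm; it rests on how the sites were sampled.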

1.6.1 Complete the equation below:

Our model is:

\(Density_i = \beta_0 + \beta_1 \times Foliage_i + \epsilon_i\) where \(\epsilon_i \sim iid ~ N(0,\sigma^2)\). The estimated coefficients are \(\hat{\beta}_0 = 1.6178\) and \(\hat{\beta}_1 = 0.2560\).
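
To see how the fitted equation is used, the expected density at a given foliage level is the estimated intercept plus the estimated slope times that foliage value. The sketch below uses the rounded estimates from the output. The foliage value of 30 f.p. units is a hypothetical input chosen for illustration, and it is an assumption that it lies within the range of the observed data.

```r
# Rounded coefficient estimates from summary(Birds.lm)
b0 <- 1.6178   # intercept
b1 <- 0.2560   # slope for Foliage

# Hypothetical foliage value (assumed to be inside the observed range)
foliage_new <- 30

# Expected bird density (pairs per hectare) on the fitted line
density_hat <- b0 + b1 * foliage_new
density_hat   # 9.2978
```

With the model object available, `predict(Birds.lm, newdata = data.frame(Foliage = 30))` gives the same quantity from the unrounded coefficients.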

1.6.2 Complete the statement

Our model explains 66.51% of the variation in the response variable.
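
The 66.51% figure is the Multiple R-squared from the summary output. As a cross-check, in simple linear regression it can be recovered from the F-statistic and its degrees of freedom, and the adjusted value follows from it. The sketch below only retypes rounded numbers from the output, so agreement is to about four decimal places, and the variable names are illustrative.

```r
# Values copied from the last lines of the summary(Birds.lm) output
f_stat <- 29.79  # F-statistic
df1 <- 1         # numerator degrees of freedom
df2 <- 15        # residual degrees of freedom
n <- 17          # number of sites

# Since F = (R^2 / df1) / ((1 - R^2) / df2), solving for R^2 gives:
r2 <- (f_stat * df1) / (f_stat * df1 + df2)

# Adjusted R-squared penalises for the number of fitted coefficients
adj_r2 <- 1 - (1 - r2) * (n - 1) / df2

round(c(r2, adj_r2), 4)  # compare with 0.6651 and 0.6428 in the output
```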

1.7 Executive Summary

We are interested in whether the total amount of foliage at California oak woodland sites can be used to explain bird density.

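We have strong evidence of a positive linear relationship between the amount of foliage and bird density (p-value < 0.001). We estimate that each additional f.p. unit of foliage is associated with an increase in expected bird density of between 0.16 and 0.36 pairs of birds per hectare (point estimate 0.256). The model explains approximately 67% of the variation in bird density (multiple R-squared = 0.6651, adjusted R-squared = 0.6428), so foliage accounts for a substantial share of the variation, although about a third remains unexplained.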