In this exercise we use the forest bird data first analysed by Loyn (1987).
Loyn RH (1987) Effects of patch area and habitat on bird abundances, species numbers and tree health in fragmented Victorian forests. In: Nature Conservation: the role of remnants of native vegetation (Saunders DA, Arnold GW, Burbidge AA, Hopkins AJM, eds), Surrey Beatty & Sons, Chipping Norton, NSW, pp. 65-77
The data contain forest bird densities (ABUND) measured in 56 forest patches in south-eastern Victoria, Australia. The research question was how bird density relates to six habitat variables: size of the forest patch (AREA), distance to the nearest patch (DIST), distance to the nearest larger patch (LDIST), mean altitude of the patch (ALT), year of isolation by clearing (YR.ISOL), and an index of stock grazing history (GRAZE; 1 = light, 5 = intensive).
We start by importing data into R.
setwd("C:/Liem/GEOG515/Spring15/Labs/")
# Read in data
birddata <- read.table("Loyn_birdsData.txt", header=TRUE, sep="\t")
birddata
## Site ABUND AREA DIST LDIST YR.ISOL GRAZE ALT
## 1 1 5.3 0.1 39 39 1968 2 160
## 2 2 2.0 0.5 234 234 1920 5 60
## 3 3 1.5 0.5 104 311 1900 5 140
## 4 4 17.1 1.0 66 66 1966 3 160
## 5 5 13.8 1.0 246 246 1918 5 140
## 6 6 14.1 1.0 234 285 1965 3 130
## 7 7 3.8 1.0 467 467 1955 5 90
## 8 8 2.2 1.0 284 1829 1920 5 60
## 9 9 3.3 1.0 156 156 1965 4 130
## 10 10 3.0 1.0 311 571 1900 5 130
## 11 11 27.6 2.0 66 332 1926 3 210
## 12 12 1.8 2.0 93 93 1890 5 160
## 13 13 21.2 2.0 39 39 1973 2 210
## 14 14 14.6 2.0 402 402 1972 1 210
## 15 15 8.0 2.0 259 259 1900 5 120
## 16 16 3.5 2.0 130 623 1900 5 145
## 17 17 29.0 3.0 26 26 1962 3 110
## 18 18 2.9 3.0 26 26 1965 3 140
## 19 19 24.3 4.0 40 40 1960 3 190
## 20 20 19.4 4.0 259 259 1953 2 90
## 21 21 24.4 4.0 234 519 1973 2 220
## 22 22 5.0 4.0 26 2205 1923 5 120
## 23 23 15.8 5.0 39 39 1965 3 130
## 24 24 25.3 5.0 372 372 1967 1 100
## 25 25 19.5 6.0 93 226 1890 3 170
## 26 26 20.9 6.0 159 1009 1960 3 150
## 27 27 16.3 7.0 285 882 1965 3 130
## 28 28 18.8 7.0 133 133 1960 4 210
## 29 29 19.9 8.0 266 266 1973 4 120
## 30 30 13.0 9.0 350 1868 1910 5 90
## 31 31 6.8 10.0 337 519 1962 3 110
## 32 32 21.7 11.0 27 27 1960 4 175
## 33 33 27.8 12.0 159 159 1963 4 110
## 34 34 26.8 12.0 133 133 1960 4 200
## 35 35 16.6 13.0 146 146 1968 2 190
## 36 36 30.4 15.0 146 398 1966 3 120
## 37 37 11.5 17.0 389 2595 1920 5 100
## 38 38 26.0 18.0 40 3188 1966 2 190
## 39 39 25.7 19.0 266 1302 1973 4 150
## 40 40 12.7 22.0 311 2672 1918 5 90
## 41 41 23.5 26.0 597 597 1963 1 140
## 42 42 24.9 29.0 545 770 1965 3 130
## 43 43 29.0 32.0 208 208 1974 1 190
## 44 44 28.3 34.0 66 345 1965 1 110
## 45 45 28.3 40.0 285 178 1962 2 120
## 46 46 32.0 44.0 93 93 1960 3 190
## 47 47 37.7 48.0 259 1297 1928 3 120
## 48 48 39.6 49.0 1427 1557 1972 1 180
## 49 49 29.6 50.0 398 1461 1967 1 140
## 50 50 31.0 57.0 467 467 1963 1 165
## 51 51 34.4 96.0 39 519 1976 2 175
## 52 52 27.3 108.0 402 4426 1964 1 70
## 53 53 30.5 134.0 467 2213 1964 1 160
## 54 54 33.0 144.0 146 319 1940 1 190
## 55 55 29.5 973.0 337 1323 1970 1 190
## 56 56 30.9 1771.0 332 332 1933 1 260
# Create a copy of birddata
birddata2 <- birddata
Now we do some data exploration before running a regression model on the dataset.
# The function factor is used to encode a vector as a factor (i.e., 'category' or 'enumerated type')
birddata2$fGRAZE <- factor(birddata2$GRAZE)
# The par( ) function is used to create a matrix of nrows x ncols plots.
# Option mfrow=c(nrows, ncols) fills in the matrix by rows while mfcol fills in by columns.
par(mfrow = c(4, 2), mar = c(2, 2, 2, 1))
dotchart(birddata2$ABUND, main = "ABUND", group = birddata2$fGRAZE)
dotchart(birddata2$AREA, main = "AREA", group = birddata2$fGRAZE)
dotchart(birddata2$DIST, main = "DIST", group = birddata2$fGRAZE)
dotchart(birddata2$LDIST, main = "LDIST", group = birddata2$fGRAZE)
dotchart(birddata2$YR.ISOL, main = "YR.ISOL", group = birddata2$fGRAZE)
dotchart(birddata2$ALT, main = "ALT", group = birddata2$fGRAZE)
dotchart(birddata2$GRAZE, main = "GRAZE", group = birddata2$fGRAZE)
# Optional: draw an empty panel. You only need it if you want a blank slot somewhere in the grid.
plot(0, 0, type = "n", axes = FALSE)
# Note that you can mix and match different types of graphs in a combined "par" layout.
Now we make histograms. This time we use a for loop.
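# Store the column names (the vector is not called "names", to avoid reusing the name of the base function)
var_names <- names(birddata2)
# par() returns the previous settings, so they can be restored after the loop
old_par <- par(mfrow = c(4, 2), mar = c(2, 2, 2, 1))
# Columns 2 to 8 are ABUND, AREA, DIST, LDIST, YR.ISOL, GRAZE, and ALT
for (name in var_names[2:8]) {
  hist(birddata2[, name], prob = TRUE, breaks = 10, border = "gray",
       xlab = name, main = paste("Histogram of", name))
}
par(old_par)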
The dot charts and histograms show a few very large values of AREA, DIST, and LDIST, so we apply a log (base 10) transformation to these three variables.
birddata2$L.AREA <- log10(birddata2$AREA)
birddata2$L.DIST <- log10(birddata2$DIST)
birddata2$L.LDIST <- log10(birddata2$LDIST)
birddata2
## Site ABUND AREA DIST LDIST YR.ISOL GRAZE ALT fGRAZE L.AREA
## 1 1 5.3 0.1 39 39 1968 2 160 2 -1.0000000
## 2 2 2.0 0.5 234 234 1920 5 60 5 -0.3010300
## 3 3 1.5 0.5 104 311 1900 5 140 5 -0.3010300
## 4 4 17.1 1.0 66 66 1966 3 160 3 0.0000000
## 5 5 13.8 1.0 246 246 1918 5 140 5 0.0000000
## 6 6 14.1 1.0 234 285 1965 3 130 3 0.0000000
## 7 7 3.8 1.0 467 467 1955 5 90 5 0.0000000
## 8 8 2.2 1.0 284 1829 1920 5 60 5 0.0000000
## 9 9 3.3 1.0 156 156 1965 4 130 4 0.0000000
## 10 10 3.0 1.0 311 571 1900 5 130 5 0.0000000
## 11 11 27.6 2.0 66 332 1926 3 210 3 0.3010300
## 12 12 1.8 2.0 93 93 1890 5 160 5 0.3010300
## 13 13 21.2 2.0 39 39 1973 2 210 2 0.3010300
## 14 14 14.6 2.0 402 402 1972 1 210 1 0.3010300
## 15 15 8.0 2.0 259 259 1900 5 120 5 0.3010300
## 16 16 3.5 2.0 130 623 1900 5 145 5 0.3010300
## 17 17 29.0 3.0 26 26 1962 3 110 3 0.4771213
## 18 18 2.9 3.0 26 26 1965 3 140 3 0.4771213
## 19 19 24.3 4.0 40 40 1960 3 190 3 0.6020600
## 20 20 19.4 4.0 259 259 1953 2 90 2 0.6020600
## 21 21 24.4 4.0 234 519 1973 2 220 2 0.6020600
## 22 22 5.0 4.0 26 2205 1923 5 120 5 0.6020600
## 23 23 15.8 5.0 39 39 1965 3 130 3 0.6989700
## 24 24 25.3 5.0 372 372 1967 1 100 1 0.6989700
## 25 25 19.5 6.0 93 226 1890 3 170 3 0.7781513
## 26 26 20.9 6.0 159 1009 1960 3 150 3 0.7781513
## 27 27 16.3 7.0 285 882 1965 3 130 3 0.8450980
## 28 28 18.8 7.0 133 133 1960 4 210 4 0.8450980
## 29 29 19.9 8.0 266 266 1973 4 120 4 0.9030900
## 30 30 13.0 9.0 350 1868 1910 5 90 5 0.9542425
## 31 31 6.8 10.0 337 519 1962 3 110 3 1.0000000
## 32 32 21.7 11.0 27 27 1960 4 175 4 1.0413927
## 33 33 27.8 12.0 159 159 1963 4 110 4 1.0791812
## 34 34 26.8 12.0 133 133 1960 4 200 4 1.0791812
## 35 35 16.6 13.0 146 146 1968 2 190 2 1.1139434
## 36 36 30.4 15.0 146 398 1966 3 120 3 1.1760913
## 37 37 11.5 17.0 389 2595 1920 5 100 5 1.2304489
## 38 38 26.0 18.0 40 3188 1966 2 190 2 1.2552725
## 39 39 25.7 19.0 266 1302 1973 4 150 4 1.2787536
## 40 40 12.7 22.0 311 2672 1918 5 90 5 1.3424227
## 41 41 23.5 26.0 597 597 1963 1 140 1 1.4149733
## 42 42 24.9 29.0 545 770 1965 3 130 3 1.4623980
## 43 43 29.0 32.0 208 208 1974 1 190 1 1.5051500
## 44 44 28.3 34.0 66 345 1965 1 110 1 1.5314789
## 45 45 28.3 40.0 285 178 1962 2 120 2 1.6020600
## 46 46 32.0 44.0 93 93 1960 3 190 3 1.6434527
## 47 47 37.7 48.0 259 1297 1928 3 120 3 1.6812412
## 48 48 39.6 49.0 1427 1557 1972 1 180 1 1.6901961
## 49 49 29.6 50.0 398 1461 1967 1 140 1 1.6989700
## 50 50 31.0 57.0 467 467 1963 1 165 1 1.7558749
## 51 51 34.4 96.0 39 519 1976 2 175 2 1.9822712
## 52 52 27.3 108.0 402 4426 1964 1 70 1 2.0334238
## 53 53 30.5 134.0 467 2213 1964 1 160 1 2.1271048
## 54 54 33.0 144.0 146 319 1940 1 190 1 2.1583625
## 55 55 29.5 973.0 337 1323 1970 1 190 1 2.9881128
## 56 56 30.9 1771.0 332 332 1933 1 260 1 3.2482186
## L.DIST L.LDIST
## 1 1.591065 1.591065
## 2 2.369216 2.369216
## 3 2.017033 2.492760
## 4 1.819544 1.819544
## 5 2.390935 2.390935
## 6 2.369216 2.454845
## 7 2.669317 2.669317
## 8 2.453318 3.262214
## 9 2.193125 2.193125
## 10 2.492760 2.756636
## 11 1.819544 2.521138
## 12 1.968483 1.968483
## 13 1.591065 1.591065
## 14 2.604226 2.604226
## 15 2.413300 2.413300
## 16 2.113943 2.794488
## 17 1.414973 1.414973
## 18 1.414973 1.414973
## 19 1.602060 1.602060
## 20 2.413300 2.413300
## 21 2.369216 2.715167
## 22 1.414973 3.343409
## 23 1.591065 1.591065
## 24 2.570543 2.570543
## 25 1.968483 2.354108
## 26 2.201397 3.003891
## 27 2.454845 2.945469
## 28 2.123852 2.123852
## 29 2.424882 2.424882
## 30 2.544068 3.271377
## 31 2.527630 2.715167
## 32 1.431364 1.431364
## 33 2.201397 2.201397
## 34 2.123852 2.123852
## 35 2.164353 2.164353
## 36 2.164353 2.599883
## 37 2.589950 3.414137
## 38 1.602060 3.503518
## 39 2.424882 3.114611
## 40 2.492760 3.426836
## 41 2.775974 2.775974
## 42 2.736397 2.886491
## 43 2.318063 2.318063
## 44 1.819544 2.537819
## 45 2.454845 2.250420
## 46 1.968483 1.968483
## 47 2.413300 3.112940
## 48 3.154424 3.192289
## 49 2.599883 3.164650
## 50 2.669317 2.669317
## 51 1.591065 2.715167
## 52 2.604226 3.646011
## 53 2.669317 3.344981
## 54 2.164353 2.503791
## 55 2.527630 3.121560
## 56 2.521138 2.521138
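The effect of the transformation can be seen on a handful of AREA values copied from the table above. This is only an illustrative sketch; the vector names `area` and `l_area` are made up for the example.

```r
# Seven AREA values taken from the printed data (smallest, a few typical, two largest)
area <- c(0.1, 1, 4, 22, 57, 973, 1771)

# log10 is only defined for positive values, so check before transforming
stopifnot(all(area > 0))
l_area <- log10(area)

# On the raw scale the two largest patches dominate the range;
# on the log10 scale the values are spread far more evenly
range(area)
range(l_area)
round(l_area, 3)
```

On the log scale each unit corresponds to a tenfold change in patch area, which is how the L.AREA coefficient in the models below should be read.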
Now we check for multicollinearity using pairwise scatterplots and variance inflation factors (VIF).
# First we bring in the lattice library
library(lattice)
# Then create pairwise scatterplots with the splom function
splom(birddata2[c(2,6,7,10:12)], main="Loyn's Bird Data")
To test for multicollinearity we use the vif function in the car package. The function takes a fitted regression model, so we fit the model with lm and nest that call inside vif.
library(car)
vif(lm(ABUND ~ AREA + DIST+LDIST+YR.ISOL+ALT, data=birddata2))
## AREA DIST LDIST YR.ISOL ALT
## 1.250418 1.161419 1.236252 1.099856 1.434811
vif(lm(ABUND ~ L.AREA + L.DIST+L.LDIST+YR.ISOL+ALT, data=birddata2))
## L.AREA L.DIST L.LDIST YR.ISOL ALT
## 1.622200 1.622396 2.008157 1.201719 1.347805
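For a numeric predictor, the VIF equals 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below computes this by hand on a twelve-row subset copied from the table above (sites 1, 5, 10, ..., 55), so its values will differ from the full-data output. The names `loyn_sub`, `predictors`, and `vif_by_hand` are made up for the example.

```r
# Twelve rows copied from the printed data (sites 1, 5, 10, ..., 55)
loyn_sub <- data.frame(
  AREA    = c(0.1, 1, 1, 2, 4, 6, 9, 13, 22, 40, 57, 973),
  DIST    = c(39, 246, 311, 259, 259, 93, 350, 146, 311, 285, 467, 337),
  YR.ISOL = c(1968, 1918, 1900, 1900, 1953, 1890, 1910, 1968, 1918, 1962, 1963, 1970),
  ALT     = c(160, 140, 130, 120, 90, 170, 90, 190, 90, 120, 165, 190)
)
loyn_sub$L.AREA <- log10(loyn_sub$AREA)
loyn_sub$L.DIST <- log10(loyn_sub$DIST)

predictors <- c("L.AREA", "L.DIST", "YR.ISOL", "ALT")

# Regress each predictor on the remaining ones and convert R^2 to a VIF
vif_by_hand <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = loyn_sub))$r.squared
  1 / (1 - r2)
})
vif_by_hand
```

A VIF of 1 means a predictor is uncorrelated with the others. The values in the output above are all close to or below 2, which suggests collinearity is not a serious concern here.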
Now we start with a regression model with no interaction.
Model1 <- lm(ABUND ~ L.AREA + L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE, data = birddata2)
summary(Model1)
##
## Call:
## lm(formula = ABUND ~ L.AREA + L.DIST + L.LDIST + YR.ISOL + ALT +
## fGRAZE, data = birddata2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.8992 -2.7245 -0.2772 2.7052 11.2811
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.68025 115.16348 0.319 0.7515
## L.AREA 6.83303 1.50330 4.545 3.97e-05 ***
## L.DIST 0.33286 2.74778 0.121 0.9041
## L.LDIST 0.79765 2.13759 0.373 0.7107
## YR.ISOL -0.01277 0.05803 -0.220 0.8267
## ALT 0.01070 0.02390 0.448 0.6565
## fGRAZE2 0.52851 3.25221 0.163 0.8716
## fGRAZE3 0.06601 2.95871 0.022 0.9823
## fGRAZE4 -1.24877 3.19838 -0.390 0.6980
## fGRAZE5 -12.47309 4.77827 -2.610 0.0122 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.105 on 46 degrees of freedom
## Multiple R-squared: 0.7295, Adjusted R-squared: 0.6766
## F-statistic: 13.78 on 9 and 46 DF, p-value: 2.115e-10
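The rows fGRAZE2 to fGRAZE5 appear because R expands a factor into indicator (dummy) columns, using the first level as the baseline under its default treatment contrasts. A small sketch with one observation per grazing level shows the coding; the names `graze` and `X` are made up for the example.

```r
# One observation for each grazing level, coded as a factor
graze <- factor(c(1, 2, 3, 4, 5))

# model.matrix() shows the design matrix that lm() builds internally
X <- model.matrix(~ graze)
X

# Level 1 has zeros in every indicator column, so it is absorbed into the intercept
X[1, ]
```

Each fGRAZE coefficient is therefore a difference from grazing level 1. In the summary above, the fGRAZE5 estimate of about -12.47 is the estimated difference in ABUND between level 5 and level 1 patches with the other predictors held fixed.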
The function drop1 drops one term at a time and compares each reduced model with the original one. A factor such as fGRAZE is dropped as a whole, which is why it uses 4 degrees of freedom.
drop1(Model1, test="F")
## Single term deletions
##
## Model:
## ABUND ~ L.AREA + L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 1714.4 211.60
## L.AREA 1 770.01 2484.4 230.38 20.6603 3.97e-05 ***
## L.DIST 1 0.55 1715.0 209.62 0.0147 0.90411
## L.LDIST 1 5.19 1719.6 209.77 0.1392 0.71075
## YR.ISOL 1 1.81 1716.2 209.66 0.0485 0.82675
## ALT 1 7.47 1721.9 209.85 0.2004 0.65650
## fGRAZE 4 413.50 2127.9 215.70 2.7736 0.03799 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
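And now the ANOVA table. For a single lm object, anova reports sequential sums of squares: each term is tested after the terms listed before it.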
anova(Model1)
## Analysis of Variance Table
##
## Response: ABUND
## Df Sum Sq Mean Sq F value Pr(>F)
## L.AREA 1 3471.0 3471.0 93.1303 1.247e-12 ***
## L.DIST 1 65.5 65.5 1.7568 0.191565
## L.LDIST 1 136.5 136.5 3.6630 0.061868 .
## YR.ISOL 1 458.8 458.8 12.3109 0.001019 **
## ALT 1 78.2 78.2 2.0979 0.154281
## fGRAZE 4 413.5 103.4 2.7736 0.037992 *
## Residuals 46 1714.4 37.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
What happens if we change the order of the independent variables? Let’s try.
Model2 <- lm(ABUND ~ L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE + L.AREA , data = birddata2)
anova(Model2)
## Analysis of Variance Table
##
## Response: ABUND
## Df Sum Sq Mean Sq F value Pr(>F)
## L.DIST 1 101.78 101.78 2.7309 0.1052
## L.LDIST 1 17.26 17.26 0.4632 0.4995
## YR.ISOL 1 1746.94 1746.94 46.8723 1.550e-08 ***
## ALT 1 730.18 730.18 19.5914 5.853e-05 ***
## fGRAZE 4 1257.32 314.33 8.4338 3.436e-05 ***
## L.AREA 1 770.01 770.01 20.6603 3.970e-05 ***
## Residuals 46 1714.43 37.27
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
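The two tables differ because the sums of squares are sequential. In the output above, L.AREA accounts for 3471.0 when entered first but only 770.01 when entered last, and the latter matches its drop1 value. The sketch below reproduces this pattern on a twelve-row subset copied from the data (sites 1, 5, 10, ..., 55); the names `d`, `m_first`, and `m_last` are made up for the example.

```r
# Twelve rows copied from the printed data (sites 1, 5, 10, ..., 55)
d <- data.frame(
  ABUND   = c(5.3, 13.8, 3.0, 8.0, 19.4, 19.5, 13.0, 16.6, 12.7, 28.3, 31.0, 29.5),
  AREA    = c(0.1, 1, 1, 2, 4, 6, 9, 13, 22, 40, 57, 973),
  YR.ISOL = c(1968, 1918, 1900, 1900, 1953, 1890, 1910, 1968, 1918, 1962, 1963, 1970),
  ALT     = c(160, 140, 130, 120, 90, 170, 90, 190, 90, 120, 165, 190)
)
d$L.AREA <- log10(d$AREA)

# Same predictors, different order
m_first <- lm(ABUND ~ L.AREA + YR.ISOL + ALT, data = d)
m_last  <- lm(ABUND ~ YR.ISOL + ALT + L.AREA, data = d)

# Sequential sum of squares for L.AREA depends on its position
ss_first <- anova(m_first)["L.AREA", "Sum Sq"]
ss_last  <- anova(m_last)["L.AREA", "Sum Sq"]

# Entered last, it equals the drop1 (marginal) sum of squares
ss_drop <- drop1(m_last)["L.AREA", "Sum of Sq"]
c(first = ss_first, last = ss_last, drop1 = ss_drop)
```

The fitted coefficients and the residual sum of squares are the same for both orderings; only the way the explained variation is divided among the terms changes.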
Note that the anova function can also be used to compare models that are nested. See the example below.
Model2 <- lm(ABUND ~ L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE + L.AREA , data = birddata2)
Model3 <- lm(ABUND ~ L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE , data = birddata2)
anova(Model2, Model3)
## Analysis of Variance Table
##
## Model 1: ABUND ~ L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE + L.AREA
## Model 2: ABUND ~ L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 46 1714.4
## 2 47 2484.4 -1 -770.01 20.66 3.97e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
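The F value of 20.66 for L.AREA is the same one reported by drop1, and it is the square of the t value in the model summary (4.545² ≈ 20.66). The sketch below checks that these three quantities agree on a twelve-row subset copied from the data (sites 1, 5, 10, ..., 55); the names `d`, `full`, and `reduced` are made up for the example.

```r
# Twelve rows copied from the printed data (sites 1, 5, 10, ..., 55)
d <- data.frame(
  ABUND   = c(5.3, 13.8, 3.0, 8.0, 19.4, 19.5, 13.0, 16.6, 12.7, 28.3, 31.0, 29.5),
  AREA    = c(0.1, 1, 1, 2, 4, 6, 9, 13, 22, 40, 57, 973),
  YR.ISOL = c(1968, 1918, 1900, 1900, 1953, 1890, 1910, 1968, 1918, 1962, 1963, 1970),
  ALT     = c(160, 140, 130, 120, 90, 170, 90, 190, 90, 120, 165, 190)
)
d$L.AREA <- log10(d$AREA)

full    <- lm(ABUND ~ YR.ISOL + ALT + L.AREA, data = d)
reduced <- update(full, . ~ . - L.AREA)   # same model without L.AREA

# Three routes to the same test of L.AREA
f_nested <- anova(reduced, full)$F[2]
t_area   <- summary(full)$coefficients["L.AREA", "t value"]
f_drop1  <- drop1(full, test = "F")["L.AREA", "F value"]
c(nested = f_nested, t_squared = t_area^2, drop1 = f_drop1)
```

The equality with the squared t value only holds for terms with one degree of freedom. For a factor such as fGRAZE, the nested-model F test is the appropriate overall test.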
Other useful functions for exploring a fitted regression model:
# model coefficients
coefficients(Model1)
## (Intercept) L.AREA L.DIST L.LDIST YR.ISOL
## 36.68024697 6.83302533 0.33285894 0.79764733 -0.01277482
## ALT fGRAZE2 fGRAZE3 fGRAZE4 fGRAZE5
## 0.01069974 0.52851030 0.06600516 -1.24877258 -12.47309154
# Confidence intervals for model parameters
confint(Model1, level=0.95)
## 2.5 % 97.5 %
## (Intercept) -195.13181426 268.49230820
## L.AREA 3.80704567 9.85900498
## L.DIST -5.19812812 5.86384601
## L.LDIST -3.50510535 5.10040001
## YR.ISOL -0.12959311 0.10404348
## ALT -0.03741179 0.05881127
## fGRAZE2 -6.01784196 7.07486256
## fGRAZE3 -5.88956125 6.02157156
## fGRAZE4 -7.68677417 5.18922901
## fGRAZE5 -22.09124184 -2.85494124
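Each interval is the estimate plus or minus a t quantile times the standard error, with the residual degrees of freedom of the model. The sketch below reproduces the L.AREA interval from the rounded numbers in the summary table; the variable names are made up for the example.

```r
# Values read from summary(Model1): estimate and standard error of L.AREA,
# and the residual degrees of freedom
est <- 6.83303
se  <- 1.50330
df  <- 46

# Two-sided 95% interval: estimate +/- t quantile * standard error
t_crit <- qt(0.975, df)
ci <- est + c(-1, 1) * t_crit * se
ci
```

The result agrees with the confint row for L.AREA (3.807 to 9.859) up to rounding of the inputs.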
# predicted values
fitted(Model1)
## 1 2 3 4 5 6
## 8.7455613 0.9429609 2.0357546 15.3999285 3.9059894 15.7814203
## 7 8 9 10 11 12
## 3.2130463 3.7401981 14.1992685 4.3545325 19.0624786 6.0570399
## 13 14 15 16 17 18
## 18.1066454 18.7362953 6.0041701 6.4760741 17.7188526 18.0015204
## 19 20 21 22 23 24
## 19.6655939 20.0646600 21.4262406 8.1768919 19.6094934 20.3042527
## 25 26 27 28 29 30
## 22.2709088 21.7585018 21.9758437 20.8153849 20.4229121 10.7468176
## 31 32 33 34 35 36
## 22.6991502 20.9993183 21.3942482 22.3078839 24.1592881 23.7454082
## 37 38 39 40 41 42
## 12.7425369 26.0315609 23.8609842 13.4039877 25.9080522 26.2405433
## 43 44 45 46 47 48
## 26.4010243 25.8492754 26.9876123 27.1956999 28.1746565 28.5597148
## 49 50 51 52 53 54
## 28.0489220 28.3843575 30.0784314 30.0089974 31.3936496 31.3957667
## 55 56
## 37.2959093 39.8137829
# residuals
residuals(Model1)
## 1 2 3 4 5
## -3.44556134 1.05703912 -0.53575459 1.70007154 9.89401063
## 6 7 8 9 10
## -1.68142026 0.58695369 -1.54019807 -10.89926850 -1.35453250
## 11 12 13 14 15
## 8.53752140 -4.25703987 3.09335463 -4.13629526 1.99582986
## 16 17 18 19 20
## -2.97607410 11.28114745 -15.10152043 4.63440611 -0.66465999
## 21 22 23 24 25
## 2.97375944 -3.17689191 -3.80949338 4.99574733 -2.77090876
## 26 27 28 29 30
## -0.85850178 -5.67584365 -2.01538491 -0.52291207 2.25318238
## 31 32 33 34 35
## -15.89915023 0.70068175 6.40575182 4.49211606 -7.55928808
## 36 37 38 39 40
## 6.65459178 -1.24253691 -0.03156094 1.83901585 -0.70398774
## 41 42 43 44 45
## -2.40805216 -1.34054335 2.59897567 2.45072459 1.31238766
## 46 47 48 49 50
## 4.80430007 9.52534350 11.04028520 1.55107803 2.61564252
## 51 52 53 54 55
## 4.32156861 -2.70899736 -0.89364955 1.60423326 -7.79590934
## 56
## -8.91378293
# covariance matrix for model parameters
vcov(Model1)
## (Intercept) L.AREA L.DIST L.LDIST
## (Intercept) 13262.6270322 -19.816772116 -0.536247386 -2.482103163
## L.AREA -19.8167721 2.259901487 0.172215155 -1.437004670
## L.DIST -0.5362474 0.172215155 7.550275251 -2.653045631
## L.LDIST -2.4821032 -1.437004670 -2.653045631 4.569306245
## YR.ISOL -6.6654990 0.010434299 -0.006506969 -0.001485992
## ALT -0.8290191 -0.006403749 0.008521705 0.010483738
## fGRAZE2 14.2581387 1.611932956 3.147871521 -0.797950611
## fGRAZE3 -96.8958777 1.516804020 2.617627346 -0.095280348
## fGRAZE4 -11.8844816 1.144354235 1.656256020 0.504648847
## fGRAZE5 -401.8443547 3.439860553 2.061439542 -2.096983787
## YR.ISOL ALT fGRAZE2 fGRAZE3
## (Intercept) -6.6654989839 -0.8290190965 14.258138666 -96.89587767
## L.AREA 0.0104342993 -0.0064037488 1.611932956 1.51680402
## L.DIST -0.0065069694 0.0085217046 3.147871521 2.61762735
## L.LDIST -0.0014859922 0.0104837382 -0.797950611 -0.09528035
## YR.ISOL 0.0033680556 0.0003548699 -0.012908148 0.04196216
## ALT 0.0003548699 0.0005712890 -0.002366617 0.01604805
## fGRAZE2 -0.0129081481 -0.0023666173 10.576847486 5.46120395
## fGRAZE3 0.0419621626 0.0160480525 5.461203950 8.75394155
## fGRAZE4 0.0001441308 0.0065299361 5.109685619 5.21858702
## fGRAZE5 0.1972738220 0.0395840712 5.151353987 8.54687330
## fGRAZE4 fGRAZE5
## (Intercept) -1.188448e+01 -401.84435473
## L.AREA 1.144354e+00 3.43986055
## L.DIST 1.656256e+00 2.06143954
## L.LDIST 5.046488e-01 -2.09698379
## YR.ISOL 1.441308e-04 0.19727382
## ALT 6.529936e-03 0.03958407
## fGRAZE2 5.109686e+00 5.15135399
## fGRAZE3 5.218587e+00 8.54687330
## fGRAZE4 1.022962e+01 5.25994277
## fGRAZE5 5.259943e+00 22.83182455
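The diagonal of this matrix holds the variances of the coefficient estimates, so the standard errors in the summary table are the square roots of the diagonal entries. A short check with two diagonal values copied from the output above (the vector name `var_diag` is made up for the example):

```r
# Diagonal entries of vcov(Model1) for L.AREA and fGRAZE5, copied from the output
var_diag <- c(L.AREA = 2.259901487, fGRAZE5 = 22.83182455)

# Square roots give the standard errors reported by summary(Model1)
se <- sqrt(var_diag)
round(se, 5)
```

For a fitted model object the same calculation is `sqrt(diag(vcov(fit)))`.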
You can also check for heteroscedasticity, non-normality, and influential observations with the diagnostic plots that plot produces for a fitted lm object. Six plots (selectable with which) are available: (1) residuals against fitted values, (2) a Normal Q-Q plot, (3) a Scale-Location plot of sqrt(|residuals|) against fitted values, (4) Cook’s distances against row labels, (5) residuals against leverage, and (6) Cook’s distances against leverage/(1-leverage). By default, plots 1, 2, 3, and 5 are drawn.
par(mfrow = c(3, 2), mar = c(2, 2, 2, 1))
plot(Model1, which=c(1:6))
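The quantities behind plots 4 to 6 can also be extracted as numbers. The sketch below does this for a small model fitted to a twelve-row subset copied from the data (sites 1, 5, 10, ..., 55). The names `d`, `fit`, `lev`, and `cook` are made up for the example, and the 2p/n leverage cutoff is a common rule of thumb rather than something prescribed by this exercise.

```r
# Twelve rows copied from the printed data (sites 1, 5, 10, ..., 55)
d <- data.frame(
  ABUND = c(5.3, 13.8, 3.0, 8.0, 19.4, 19.5, 13.0, 16.6, 12.7, 28.3, 31.0, 29.5),
  AREA  = c(0.1, 1, 1, 2, 4, 6, 9, 13, 22, 40, 57, 973),
  ALT   = c(160, 140, 130, 120, 90, 170, 90, 190, 90, 120, 165, 190)
)
d$L.AREA <- log10(d$AREA)
fit <- lm(ABUND ~ L.AREA + ALT, data = d)

lev  <- hatvalues(fit)        # leverage of each observation
cook <- cooks.distance(fit)   # influence of each observation on the fit

p <- length(coef(fit))        # number of estimated coefficients
n <- nrow(d)

# Rule-of-thumb flag (an assumption): leverage above twice the average p/n
which(lev > 2 * p / n)

# Observations ordered from most to least influential
sort(cook, decreasing = TRUE)
```

Leverages always sum to the number of coefficients, so their average is p/n, which is where the cutoff comes from.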
How to deal with non-significant independent variables: stepwise selection with the step function, which adds or drops terms according to AIC.
step(Model1, direction="backward")
## Start: AIC=211.6
## ABUND ~ L.AREA + L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - L.DIST 1 0.55 1715.0 209.62
## - YR.ISOL 1 1.81 1716.2 209.66
## - L.LDIST 1 5.19 1719.6 209.77
## - ALT 1 7.47 1721.9 209.85
## <none> 1714.4 211.60
## - fGRAZE 4 413.50 2127.9 215.70
## - L.AREA 1 770.01 2484.4 230.38
##
## Step: AIC=209.62
## ABUND ~ L.AREA + L.LDIST + YR.ISOL + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - YR.ISOL 1 1.73 1716.7 207.68
## - ALT 1 7.07 1722.0 207.85
## - L.LDIST 1 8.57 1723.5 207.90
## <none> 1715.0 209.62
## - fGRAZE 4 413.28 2128.2 213.71
## - L.AREA 1 769.64 2484.6 228.38
##
## Step: AIC=207.68
## ABUND ~ L.AREA + L.LDIST + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - L.LDIST 1 8.32 1725.0 205.95
## - ALT 1 9.71 1726.4 205.99
## <none> 1716.7 207.68
## - fGRAZE 4 848.77 2565.5 222.18
## - L.AREA 1 790.20 2506.9 226.88
##
## Step: AIC=205.95
## ABUND ~ L.AREA + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - ALT 1 5.37 1730.4 204.12
## <none> 1725.0 205.95
## - fGRAZE 4 914.23 2639.3 221.76
## - L.AREA 1 1130.78 2855.8 232.18
##
## Step: AIC=204.12
## ABUND ~ L.AREA + fGRAZE
##
## Df Sum of Sq RSS AIC
## <none> 1730.4 204.12
## - fGRAZE 4 1136.5 2866.9 224.40
## - L.AREA 1 1153.8 2884.2 230.73
##
## Call:
## lm(formula = ABUND ~ L.AREA + fGRAZE, data = birddata2)
##
## Coefficients:
## (Intercept) L.AREA fGRAZE2 fGRAZE3 fGRAZE4
## 15.7164 7.2472 0.3826 -0.1893 -1.5916
## fGRAZE5
## -11.8938
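The AIC column in the step output can be reproduced by hand. For a linear model, step relies on extractAIC, which computes n * log(RSS / n) + 2 * edf, where n is the number of observations and edf is the number of estimated coefficients. The helper `aic_from_rss` below is a made-up name for this sketch; the RSS values are copied from the output above.

```r
# AIC as used by step() for linear models (up to an additive constant)
aic_from_rss <- function(rss, n, edf) {
  n * log(rss / n) + 2 * edf
}

n <- 56  # number of forest patches

# Full model: intercept + 5 numeric terms + 4 fGRAZE indicators = 10 coefficients
aic_full <- aic_from_rss(1714.43, n, edf = 10)

# Selected model ABUND ~ L.AREA + fGRAZE: intercept + 1 + 4 = 6 coefficients
aic_final <- aic_from_rss(1730.4, n, edf = 6)

round(c(full = aic_full, final = aic_final), 2)
```

The selected model has a slightly larger RSS than the full model but four fewer coefficients, so its AIC is lower (about 204.12 against 211.60).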
step(Model1, direction="both")
## Start: AIC=211.6
## ABUND ~ L.AREA + L.DIST + L.LDIST + YR.ISOL + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - L.DIST 1 0.55 1715.0 209.62
## - YR.ISOL 1 1.81 1716.2 209.66
## - L.LDIST 1 5.19 1719.6 209.77
## - ALT 1 7.47 1721.9 209.85
## <none> 1714.4 211.60
## - fGRAZE 4 413.50 2127.9 215.70
## - L.AREA 1 770.01 2484.4 230.38
##
## Step: AIC=209.62
## ABUND ~ L.AREA + L.LDIST + YR.ISOL + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - YR.ISOL 1 1.73 1716.7 207.68
## - ALT 1 7.07 1722.0 207.85
## - L.LDIST 1 8.57 1723.5 207.90
## <none> 1715.0 209.62
## + L.DIST 1 0.55 1714.4 211.60
## - fGRAZE 4 413.28 2128.2 213.71
## - L.AREA 1 769.64 2484.6 228.38
##
## Step: AIC=207.68
## ABUND ~ L.AREA + L.LDIST + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - L.LDIST 1 8.32 1725.0 205.95
## - ALT 1 9.71 1726.4 205.99
## <none> 1716.7 207.68
## + YR.ISOL 1 1.73 1715.0 209.62
## + L.DIST 1 0.47 1716.2 209.66
## - fGRAZE 4 848.77 2565.5 222.18
## - L.AREA 1 790.20 2506.9 226.88
##
## Step: AIC=205.95
## ABUND ~ L.AREA + ALT + fGRAZE
##
## Df Sum of Sq RSS AIC
## - ALT 1 5.37 1730.4 204.12
## <none> 1725.0 205.95
## + L.LDIST 1 8.32 1716.7 207.68
## + L.DIST 1 3.67 1721.3 207.83
## + YR.ISOL 1 1.48 1723.5 207.90
## - fGRAZE 4 914.23 2639.3 221.76
## - L.AREA 1 1130.78 2855.8 232.18
##
## Step: AIC=204.12
## ABUND ~ L.AREA + fGRAZE
##
## Df Sum of Sq RSS AIC
## <none> 1730.4 204.12
## + ALT 1 5.37 1725.0 205.95
## + L.LDIST 1 3.98 1726.4 205.99
## + YR.ISOL 1 3.35 1727.0 206.01
## + L.DIST 1 1.43 1729.0 206.08
## - fGRAZE 4 1136.54 2866.9 224.40
## - L.AREA 1 1153.85 2884.2 230.73
##
## Call:
## lm(formula = ABUND ~ L.AREA + fGRAZE, data = birddata2)
##
## Coefficients:
## (Intercept) L.AREA fGRAZE2 fGRAZE3 fGRAZE4
## 15.7164 7.2472 0.3826 -0.1893 -1.5916
## fGRAZE5
## -11.8938
Forward selection works in the opposite direction: it starts from the intercept-only (null) model and adds terms one at a time, up to the terms in Model1.
null<-lm(ABUND ~1, data=birddata2)
summary(null)
##
## Call:
## lm(formula = ABUND ~ 1, data = birddata2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -18.014 -7.114 1.536 8.786 20.086
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 19.514 1.434 13.6 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.73 on 55 degrees of freedom
step(null, scope=list(lower=null,upper=Model1), direction="forward")
## Start: AIC=266.82
## ABUND ~ 1
##
## Df Sum of Sq RSS AIC
## + L.AREA 1 3471.0 2866.9 224.40
## + fGRAZE 4 3453.7 2884.2 230.73
## + YR.ISOL 1 1605.8 4732.1 252.46
## + ALT 1 943.5 5394.4 259.80
## <none> 6337.9 266.82
## + L.DIST 1 101.8 6236.1 267.92
## + L.LDIST 1 88.4 6249.5 268.04
##
## Step: AIC=224.4
## ABUND ~ L.AREA
##
## Df Sum of Sq RSS AIC
## + fGRAZE 4 1136.54 1730.4 204.12
## + YR.ISOL 1 607.35 2259.6 213.06
## + ALT 1 227.68 2639.3 221.76
## + L.LDIST 1 201.93 2665.0 222.31
## <none> 2866.9 224.40
## + L.DIST 1 65.48 2801.5 225.10
##
## Step: AIC=204.12
## ABUND ~ L.AREA + fGRAZE
##
## Df Sum of Sq RSS AIC
## <none> 1730.4 204.12
## + ALT 1 5.3723 1725.0 205.95
## + L.LDIST 1 3.9829 1726.4 205.99
## + YR.ISOL 1 3.3474 1727.0 206.01
## + L.DIST 1 1.4276 1729.0 206.08
##
## Call:
## lm(formula = ABUND ~ L.AREA + fGRAZE, data = birddata2)
##
## Coefficients:
## (Intercept) L.AREA fGRAZE2 fGRAZE3 fGRAZE4
## 15.7164 7.2472 0.3826 -0.1893 -1.5916
## fGRAZE5
## -11.8938
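All three search directions arrive at the same model, ABUND ~ L.AREA + fGRAZE. Because step returns the selected lm object, the result can be assigned and inspected like any other model. The sketch below shows the pattern on a twelve-row subset copied from the data (sites 1, 5, 10, ..., 55) with numeric predictors only, so the terms it keeps need not match the full-data result. The names `d`, `full`, and `selected` are made up for the example.

```r
# Twelve rows copied from the printed data (sites 1, 5, 10, ..., 55)
d <- data.frame(
  ABUND   = c(5.3, 13.8, 3.0, 8.0, 19.4, 19.5, 13.0, 16.6, 12.7, 28.3, 31.0, 29.5),
  AREA    = c(0.1, 1, 1, 2, 4, 6, 9, 13, 22, 40, 57, 973),
  YR.ISOL = c(1968, 1918, 1900, 1900, 1953, 1890, 1910, 1968, 1918, 1962, 1963, 1970),
  ALT     = c(160, 140, 130, 120, 90, 170, 90, 190, 90, 120, 165, 190)
)
d$L.AREA <- log10(d$AREA)

full <- lm(ABUND ~ L.AREA + YR.ISOL + ALT, data = d)

# trace = 0 suppresses the step-by-step printout; the selected model is returned
selected <- step(full, direction = "backward", trace = 0)

formula(selected)   # terms that survived the selection
summary(selected)   # usual model summary for the selected model
```

With the full data, the same pattern would be `final <- step(Model1, direction = "backward", trace = 0)` followed by `summary(final)`.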