Bat data models with Differences Divided

Set up stuff

rm(list=ls())
pacman::p_load(plyr,dplyr,ggplot2,lme4,car,multcomp,tidyr,reshape2,AICcmodavg)
setwd("C:/Users/Katy/Desktop/thesisr")

Data organization

# Read in data, add column calculating Calls Per Night
data.allparks <- read.csv("./MixedModel_allparks.csv")
data.allparks$Year <- as.factor(data.allparks$Year)
data.allparks$CallsPN <- round(data.allparks$Calls/data.allparks$Nights, 3)
str(data.allparks)

## 'data.frame':    2460 obs. of  7 variables:
##  $ Park   : Factor w/ 8 levels "APIS","GRPO",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Site   : Factor w/ 185 levels "APIS001A","APIS002A",..: 1 1 1 1 1 1 2 2 2 2 ...
##  $ Year   : Factor w/ 2 levels "2016","2017": 2 2 2 2 2 2 2 2 2 2 ...
##  $ Nights : int  12 12 12 12 12 12 9 9 9 9 ...
##  $ Species: Factor w/ 9 levels "EPFU","LABO",..: 1 2 3 4 5 6 1 2 3 4 ...
##  $ Calls  : int  130 92 95 1496 2311 1 8 2 7 47 ...
##  $ CallsPN: num  10.83 7.67 7.92 124.67 192.58 ...

head(data.allparks)

##   Park     Site Year Nights Species Calls CallsPN
## 1 APIS APIS001A 2017     12    EPFU   130  10.833
## 2 APIS APIS001A 2017     12    LABO    92   7.667
## 3 APIS APIS001A 2017     12    LACI    95   7.917
## 4 APIS APIS001A 2017     12    LANO  1496 124.667
## 5 APIS APIS001A 2017     12    MYLU  2311 192.583
## 6 APIS APIS001A 2017     12    MYSE     1   0.083

# Temporarily subset the data by year. Rename columns to include year information.
# and remove the actual "Year" column. Then merge all data back to one wide
# data frame, with rows for each combination of Park, Site, and Species. Add a new
# column showing difference between Calls Per Night in 2017 vs 2016. 
data.allparks.2016 <- data.allparks %>% 
                      subset(Year == "2016") %>%
                      rename(Nights.2016 = Nights, Calls.2016 = Calls, 
                             CallsPN.2016 = CallsPN) %>%
                      subset(select = -c(Year))

data.allparks.2017 <- data.allparks %>% 
                      subset(Year == "2017") %>%
                      rename(Nights.2017 = Nights, Calls.2017 = Calls, 
                           CallsPN.2017 = CallsPN) %>%
                      subset(select = -c(Year))

data.allparks.wide <- merge(data.allparks.2016, data.allparks.2017, 
                            by = c("Park","Site","Species")) %>%
                      mutate(Diff = CallsPN.2017 - CallsPN.2016)

Dividing Difference Data by 10 or 100

What if we try dividing all the Diff values by 10 or 100 to reduce the spread, and rerun the models?

# Edit the data frame, adding 2 new columns
data.allparks.wide <- data.allparks.wide %>%
                      mutate(Diff.Ten = Diff/10, Diff.Hun = Diff/100)
str(data.allparks.wide)

## 'data.frame':    1230 obs. of  12 variables:
##  $ Park        : Factor w/ 8 levels "APIS","GRPO",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ Site        : Factor w/ 185 levels "APIS001A","APIS002A",..: 1 1 1 1 1 1 2 2 2 2 ...
##  $ Species     : Factor w/ 9 levels "EPFU","LABO",..: 1 2 3 4 5 6 1 2 3 4 ...
##  $ Nights.2016 : int  7 7 7 7 7 7 10 10 10 10 ...
##  $ Calls.2016  : int  17 91 35 144 2380 16 4 1 4 33 ...
##  $ CallsPN.2016: num  2.43 13 5 20.57 340 ...
##  $ Nights.2017 : int  12 12 12 12 12 12 9 9 9 9 ...
##  $ Calls.2017  : int  130 92 95 1496 2311 1 8 2 7 47 ...
##  $ CallsPN.2017: num  10.83 7.67 7.92 124.67 192.58 ...
##  $ Diff        : num  8.4 -5.33 2.92 104.1 -147.42 ...
##  $ Diff.Ten    : num  0.84 -0.533 0.292 10.41 -14.742 ...
##  $ Diff.Hun    : num  0.084 -0.0533 0.0292 1.041 -1.4742 ...

head(data.allparks.wide)

##   Park     Site Species Nights.2016 Calls.2016 CallsPN.2016 Nights.2017
## 1 APIS APIS001A    EPFU           7         17        2.429          12
## 2 APIS APIS001A    LABO           7         91       13.000          12
## 3 APIS APIS001A    LACI           7         35        5.000          12
## 4 APIS APIS001A    LANO           7        144       20.571          12
## 5 APIS APIS001A    MYLU           7       2380      340.000          12
## 6 APIS APIS001A    MYSE           7         16        2.286          12
##   Calls.2017 CallsPN.2017     Diff Diff.Ten Diff.Hun
## 1        130       10.833    8.404   0.8404  0.08404
## 2         92        7.667   -5.333  -0.5333 -0.05333
## 3         95        7.917    2.917   0.2917  0.02917
## 4       1496      124.667  104.096  10.4096  1.04096
## 5       2311      192.583 -147.417 -14.7417 -1.47417
## 6          1        0.083   -2.203  -0.2203 -0.02203

Plot 1 - histogram of counts

gg.diff.ten1 <- ggplot(data.allparks.wide, aes(data.allparks.wide$Diff.Ten)) + 
                geom_histogram(binwidth = 2, colour="black", fill="gray") + 
                theme_bw()
gg.diff.hun1 <- ggplot(data.allparks.wide, aes(data.allparks.wide$Diff.Hun)) + 
                geom_histogram(binwidth = 1, colour="black", fill="gray") + 
                theme_bw()
gg.diff.ten1

gg.diff.hun1

Plot 2 - histogram of densities, density estimate, and probability density function

gg.diff.ten2 <- ggplot(data.allparks.wide, aes(x=Diff.Ten)) + 
                geom_histogram(aes(y=..density..), binwidth = 2, 
                               colour="black", fill="gray") + 
                geom_density(alpha=0.4, color= "coral2", size = .7, fill= "coral2") + 
                theme_bw() 
gg.diff.hun2 <- ggplot(data.allparks.wide, aes(x=Diff.Hun)) + 
                geom_histogram(aes(y=..density..), binwidth = 1, 
                               colour="black", fill="gray") + 
                geom_density(alpha=0.4, color= "coral2", size = .7, fill= "coral2") + 
                theme_bw() 

mean(data.allparks.wide$Diff.Ten)

## [1] -0.678478

sd(data.allparks.wide$Diff.Ten)

## [1] 6.077969

mean(data.allparks.wide$Diff.Hun)

## [1] -0.0678478

sd(data.allparks.wide$Diff.Hun)

## [1] 0.6077969

gg.diff.ten3 <- gg.diff.ten2 + 
                stat_function(fun = dnorm, colour="blue", size=1.2,
                args = list(mean = -0.678478, sd = 6.077969))
gg.diff.hun3 <- gg.diff.hun2 + 
                stat_function(fun = dnorm, colour="blue", size=1.2,
                args = list(mean = -0.0678478, sd = 0.6077969))
gg.diff.ten3

gg.diff.hun3

Plot 3 - boxplots of Calls/Night Differences by park and species

# Plot Calls Per Night Differences (y axis) by park (x axis) for each species. 

# Differences / 10
diff.box.ten <- ggplot(subset(data.allparks.wide), 
            aes(x=Park, y=Diff.Ten, color=Park)) + 
            geom_boxplot() + 
            scale_color_brewer(palette = "Dark2") +
            theme_bw() +
            facet_grid(cols = vars(Species), scales = "free_x") + 
            theme(axis.text.x.bottom = element_text(angle=90)) +
            theme(legend.position = "bottom")
diff.box.ten

# Differences / 100
diff.box.hun <- ggplot(subset(data.allparks.wide), 
            aes(x=Park, y=Diff.Hun, color=Park)) + 
            geom_boxplot() + 
            scale_color_brewer(palette = "Dark2") +
            theme_bw() +
            facet_grid(cols = vars(Species), scales = "free_x") + 
            theme(axis.text.x.bottom = element_text(angle=90)) +
            theme(legend.position = "bottom")
diff.box.hun

Test models with “Park” and “Species” as fixed effects vs. a null model

First do it with Differences / 10

# Use linear mixed models with "Site" as a random effect in all models. 
# Compare a null model, a model with only "Park" as a fixed effect, a model with only "Species" as a fixed effect, and models with "Park" and "Species" as fixed effects (with or without the interaction).  
# Set REML = FALSE so that log-likelihood is used, so that we can compare the models with AIC. 
mod.null.ten <- lmer(data.allparks.wide$Diff.Ten ~ 1 + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.park.ten <- lmer(data.allparks.wide$Diff.Ten ~ Park + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.species.ten <- lmer(data.allparks.wide$Diff.Ten ~ Species + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.parkspp.ten <- lmer(data.allparks.wide$Diff.Ten ~ Park + Species + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.parksppint.ten <- lmer(data.allparks.wide$Diff.Ten ~ Park*Species + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)

## fixed-effect model matrix is rank deficient so dropping 19 columns / coefficients

# The rank-deficient warning is not an issue. Not all park-species combinations exist.

# Use small sample AICc (using AICcmodavg pkg) to compare these models.
# (Note: aictab function automatically selected AICc instead of AIC, even when I did not include the argument "second.ord = TRUE".)
mod.list.ten <- list(mod.null.ten, mod.park.ten, mod.species.ten, mod.parkspp.ten, mod.parksppint.ten)
mod.names.ten <- c("mod.null.ten", "mod.park.ten", "mod.species.ten", "mod.parkspp.ten", "mod.parksppint.ten")
aictab(cand.set = mod.list.ten, modnames = mod.names.ten, sort = TRUE, second.ord = TRUE)

## 
## Model selection based on AICc:
## 
##                     K    AICc Delta_AICc AICcWt Cum.Wt       LL
## mod.parkspp.ten    18 7856.35       0.00   0.84   0.84 -3909.89
## mod.species.ten    11 7859.75       3.39   0.15   1.00 -3918.76
## mod.parksppint.ten 55 7870.98      14.63   0.00   1.00 -3877.87
## mod.park.ten       10 7929.58      73.23   0.00   1.00 -3954.70
## mod.null.ten        3 7932.79      76.43   0.00   1.00 -3963.38

Then do it with Differences / 100

mod.null.hun <- lmer(data.allparks.wide$Diff.Hun ~ 1 + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.park.hun <- lmer(data.allparks.wide$Diff.Hun ~ Park + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.species.hun <- lmer(data.allparks.wide$Diff.Hun ~ Species + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.parkspp.hun <- lmer(data.allparks.wide$Diff.Hun ~ Park + Species + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)
mod.parksppint.hun <- lmer(data.allparks.wide$Diff.Hun ~ Park*Species + (1|Site), 
                 data= data.allparks.wide, REML = FALSE)

## fixed-effect model matrix is rank deficient so dropping 19 columns / coefficients

# The rank-deficient warning is not an issue. Not all park-species combinations exist.

# Use small sample AICc (using AICcmodavg pkg) to compare these models.
# (Note: aictab function automatically selected AICc instead of AIC, even when I did not include the argument "second.ord = TRUE".)
mod.list.hun <- list(mod.null.hun, mod.park.hun, mod.species.hun, mod.parkspp.hun, mod.parksppint.hun)
mod.names.hun <- c("mod.null.hun", "mod.park.hun", "mod.species.hun", "mod.parkspp.hun", "mod.parksppint.hun")
aictab(cand.set = mod.list.hun, modnames = mod.names.hun, sort = TRUE, second.ord = TRUE)

## 
## Model selection based on AICc:
## 
##                     K    AICc Delta_AICc AICcWt Cum.Wt       LL
## mod.parkspp.hun    18 2191.99       0.00   0.84   0.84 -1077.71
## mod.species.hun    11 2195.39       3.39   0.15   1.00 -1086.58
## mod.parksppint.hun 55 2206.62      14.63   0.00   1.00 -1045.69
## mod.park.hun       10 2265.22      73.23   0.00   1.00 -1122.52
## mod.null.hun        3 2268.43      76.43   0.00   1.00 -1131.20

Look at differences among parks

First do it with Differences / 10

summary(mod.park.ten)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: data.allparks.wide$Diff.Ten ~ Park + (1 | Site)
##    Data: data.allparks.wide
## 
##      AIC      BIC   logLik deviance df.resid 
##   7929.4   7980.5  -3954.7   7909.4     1220 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -10.7864  -0.0844   0.1029   0.2235   7.3730 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Site     (Intercept)  0.3388  0.5821  
##  Residual             35.9959  5.9997  
## Number of obs: 1230, groups:  Site, 185
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  -0.9760     0.5139  -1.899
## ParkGRPO     -1.0238     0.7981  -1.283
## ParkINDU      1.5087     0.6565   2.298
## ParkISRO     -0.4656     0.7268  -0.641
## ParkMISS      0.3922     0.7018   0.559
## ParkSACN     -0.1325     0.7541  -0.176
## ParkSLBE      0.7571     0.6712   1.128
## ParkVOYA     -0.1778     0.7850  -0.227
## 
## Correlation of Fixed Effects:
##          (Intr) PrGRPO PrINDU PrISRO PrMISS PrSACN PrSLBE
## ParkGRPO -0.644                                          
## ParkINDU -0.783  0.504                                   
## ParkISRO -0.707  0.455  0.554                            
## ParkMISS -0.732  0.472  0.573  0.518                     
## ParkSACN -0.682  0.439  0.534  0.482  0.499              
## ParkSLBE -0.766  0.493  0.599  0.541  0.561  0.522       
## ParkVOYA -0.655  0.422  0.512  0.463  0.479  0.446  0.501

confint(mod.park.ten)

## Computing profile confidence intervals ...

##                  2.5 %     97.5 %
## .sig01       0.0000000 1.33638885
## .sigma       5.7507426 6.26455775
## (Intercept) -1.9877219 0.03565804
## ParkGRPO    -2.5949660 0.54731421
## ParkINDU     0.2141496 2.80328604
## ParkISRO    -1.8963658 0.96512555
## ParkMISS    -0.9902512 1.77462342
## ParkSACN    -1.6181441 1.35306828
## ParkSLBE    -0.5642512 2.07847389
## ParkVOYA    -1.7232278 1.36753614

car::Anova(mod.park.ten)

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: data.allparks.wide$Diff.Ten
##       Chisq Df Pr(>Chisq)  
## Park 18.436  7    0.01015 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

hyptest.mod.park.ten <- glht(mod.park.ten, linfct= mcp(Park = "Tukey"))
summary(hyptest.mod.park.ten)

## Warning in RET$pfunction("adjusted", ...): Completion with error > abseps

## Warning in RET$pfunction("adjusted", ...): Completion with error > abseps

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: lmer(formula = data.allparks.wide$Diff.Ten ~ Park + (1 | Site), 
##     data = data.allparks.wide, REML = FALSE)
## 
## Linear Hypotheses:
##                  Estimate Std. Error z value Pr(>|z|)  
## GRPO - APIS == 0 -1.02383    0.79807  -1.283   0.9039  
## INDU - APIS == 0  1.50872    0.65647   2.298   0.2908  
## ISRO - APIS == 0 -0.46562    0.72676  -0.641   0.9983  
## MISS - APIS == 0  0.39219    0.70176   0.559   0.9993  
## SACN - APIS == 0 -0.13254    0.75405  -0.176   1.0000  
## SLBE - APIS == 0  0.75711    0.67120   1.128   0.9499  
## VOYA - APIS == 0 -0.17785    0.78499  -0.227   1.0000  
## INDU - GRPO == 0  2.53254    0.73464   3.447   0.0129 *
## ISRO - GRPO == 0  0.55821    0.79807   0.699   0.9970  
## MISS - GRPO == 0  1.41601    0.77538   1.826   0.5982  
## SACN - GRPO == 0  0.89129    0.82300   1.083   0.9597  
## SLBE - GRPO == 0  1.78094    0.74783   2.381   0.2471  
## VOYA - GRPO == 0  0.84598    0.85144   0.994   0.9750  
## ISRO - INDU == 0 -1.97434    0.65647  -3.008   0.0529 .
## MISS - INDU == 0 -1.11653    0.62868  -1.776   0.6325  
## SACN - INDU == 0 -1.64126    0.68656  -2.391   0.2431  
## SLBE - INDU == 0 -0.75161    0.59437  -1.265   0.9104  
## VOYA - INDU == 0 -1.68656    0.72040  -2.341   0.2682  
## MISS - ISRO == 0  0.85781    0.70176   1.222   0.9242  
## SACN - ISRO == 0  0.33308    0.75405   0.442   0.9998  
## SLBE - ISRO == 0  1.22273    0.67120   1.822   0.6012  
## VOYA - ISRO == 0  0.28777    0.78499   0.367   1.0000  
## SACN - MISS == 0 -0.52472    0.72999  -0.719   0.9964  
## SLBE - MISS == 0  0.36493    0.64405   0.567   0.9992  
## VOYA - MISS == 0 -0.57003    0.76190  -0.748   0.9954  
## SLBE - SACN == 0  0.88965    0.70066   1.270   0.9086  
## VOYA - SACN == 0 -0.04531    0.81032  -0.056   1.0000  
## VOYA - SLBE == 0 -0.93496    0.73385  -1.274   0.9070  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

Then do it with Differences / 100

summary(mod.park.hun)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: data.allparks.wide$Diff.Hun ~ Park + (1 | Site)
##    Data: data.allparks.wide
## 
##      AIC      BIC   logLik deviance df.resid 
##   2265.0   2316.2  -1122.5   2245.0     1220 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -10.7864  -0.0844   0.1029   0.2235   7.3730 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Site     (Intercept) 0.003388 0.05821 
##  Residual             0.359959 0.59997 
## Number of obs: 1230, groups:  Site, 185
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) -0.09760    0.05139  -1.899
## ParkGRPO    -0.10238    0.07981  -1.283
## ParkINDU     0.15087    0.06565   2.298
## ParkISRO    -0.04656    0.07268  -0.641
## ParkMISS     0.03922    0.07018   0.559
## ParkSACN    -0.01325    0.07541  -0.176
## ParkSLBE     0.07571    0.06712   1.128
## ParkVOYA    -0.01778    0.07850  -0.227
## 
## Correlation of Fixed Effects:
##          (Intr) PrGRPO PrINDU PrISRO PrMISS PrSACN PrSLBE
## ParkGRPO -0.644                                          
## ParkINDU -0.783  0.504                                   
## ParkISRO -0.707  0.455  0.554                            
## ParkMISS -0.732  0.472  0.573  0.518                     
## ParkSACN -0.682  0.439  0.534  0.482  0.499              
## ParkSLBE -0.766  0.493  0.599  0.541  0.561  0.522       
## ParkVOYA -0.655  0.422  0.512  0.463  0.479  0.446  0.501

confint(mod.park.hun)

## Computing profile confidence intervals ...

## Warning in optwrap(optimizer, par = start, fn = function(x)
## dd(mkpar(npar1, : convergence code 3 from bobyqa: bobyqa -- a trust region
## step failed to reduce q

##                   2.5 %      97.5 %
## .sig01       0.00000000 0.133638888
## .sigma       0.57507426 0.626455775
## (Intercept) -0.19877219 0.003565804
## ParkGRPO    -0.25949660 0.054731421
## ParkINDU     0.02141496 0.280328604
## ParkISRO    -0.18963658 0.096512555
## ParkMISS    -0.09902512 0.177462342
## ParkSACN    -0.16181441 0.135306828
## ParkSLBE    -0.05642512 0.207847389
## ParkVOYA    -0.17232278 0.136753614

car::Anova(mod.park.hun)

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: data.allparks.wide$Diff.Hun
##       Chisq Df Pr(>Chisq)  
## Park 18.436  7    0.01015 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

hyptest.mod.park.hun <- glht(mod.park.hun, linfct= mcp(Park = "Tukey"))
summary(hyptest.mod.park.hun)

## Warning in RET$pfunction("adjusted", ...): Completion with error > abseps

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Multiple Comparisons of Means: Tukey Contrasts
## 
## 
## Fit: lmer(formula = data.allparks.wide$Diff.Hun ~ Park + (1 | Site), 
##     data = data.allparks.wide, REML = FALSE)
## 
## Linear Hypotheses:
##                   Estimate Std. Error z value Pr(>|z|)  
## GRPO - APIS == 0 -0.102383   0.079807  -1.283   0.9039  
## INDU - APIS == 0  0.150872   0.065647   2.298   0.2904  
## ISRO - APIS == 0 -0.046562   0.072676  -0.641   0.9983  
## MISS - APIS == 0  0.039219   0.070176   0.559   0.9993  
## SACN - APIS == 0 -0.013254   0.075405  -0.176   1.0000  
## SLBE - APIS == 0  0.075711   0.067120   1.128   0.9500  
## VOYA - APIS == 0 -0.017785   0.078499  -0.227   1.0000  
## INDU - GRPO == 0  0.253254   0.073464   3.447   0.0130 *
## ISRO - GRPO == 0  0.055821   0.079807   0.699   0.9970  
## MISS - GRPO == 0  0.141601   0.077538   1.826   0.5980  
## SACN - GRPO == 0  0.089129   0.082300   1.083   0.9597  
## SLBE - GRPO == 0  0.178094   0.074783   2.381   0.2476  
## VOYA - GRPO == 0  0.084598   0.085144   0.994   0.9750  
## ISRO - INDU == 0 -0.197434   0.065647  -3.008   0.0528 .
## MISS - INDU == 0 -0.111653   0.062868  -1.776   0.6330  
## SACN - INDU == 0 -0.164126   0.068656  -2.391   0.2424  
## SLBE - INDU == 0 -0.075161   0.059437  -1.265   0.9104  
## VOYA - INDU == 0 -0.168656   0.072040  -2.341   0.2676  
## MISS - ISRO == 0  0.085781   0.070176   1.222   0.9242  
## SACN - ISRO == 0  0.033308   0.075405   0.442   0.9998  
## SLBE - ISRO == 0  0.122273   0.067120   1.822   0.6012  
## VOYA - ISRO == 0  0.028777   0.078499   0.367   1.0000  
## SACN - MISS == 0 -0.052472   0.072999  -0.719   0.9964  
## SLBE - MISS == 0  0.036493   0.064405   0.567   0.9992  
## VOYA - MISS == 0 -0.057003   0.076190  -0.748   0.9954  
## SLBE - SACN == 0  0.088965   0.070066   1.270   0.9086  
## VOYA - SACN == 0 -0.004531   0.081032  -0.056   1.0000  
## VOYA - SLBE == 0 -0.093496   0.073385  -1.274   0.9071  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

Conclusions: Everything came out the same as before using both the Differences/10 and the Differences/100. Same order of AICc scores for the five models. Same result of Tukey test comparing parks to each other, only one pair of parks was signficant (INDU with GRPO).

Test a new version of the “Species” vs. null models, now with Park/Site as nested random effects

First for Differences / 10

mod.null.ten2 <- lmer(data.allparks.wide$Diff.Ten ~ 1 + (1|Park/Site), 
                   data= data.allparks.wide, REML = FALSE)
mod.species.ten2 <- lmer(data.allparks.wide$Diff.Ten ~ Species + (1|Park/Site), 
                   data= data.allparks.wide, REML = FALSE)

mod.list.ten2 <- list(mod.null.ten2, mod.species.ten2)
mod.names.ten2 <- c("mod.null.ten2", "mod.species.ten2")
aictab(cand.set = mod.list.ten2, modnames = mod.names.ten2, sort = TRUE, second.ord = TRUE)

## 
## Model selection based on AICc:
## 
##                   K    AICc Delta_AICc AICcWt Cum.Wt       LL
## mod.species.ten2 12 7858.60       0.00      1      1 -3917.17
## mod.null.ten2     4 7931.35      72.75      0      1 -3961.66

Then for Differences / 10

mod.null.hun2 <- lmer(data.allparks.wide$Diff.Hun ~ 1 + (1|Park/Site), 
                   data= data.allparks.wide, REML = FALSE)
mod.species.hun2 <- lmer(data.allparks.wide$Diff.Hun ~ Species + (1|Park/Site), 
                   data= data.allparks.wide, REML = FALSE)

mod.list.hun2 <- list(mod.null.hun2, mod.species.hun2)
mod.names.hun2 <- c("mod.null.hun2", "mod.species.hun2")
aictab(cand.set = mod.list.hun2, modnames = mod.names.hun2, sort = TRUE, second.ord = TRUE)

## 
## Model selection based on AICc:
## 
##                   K    AICc Delta_AICc AICcWt Cum.Wt       LL
## mod.species.hun2 12 2194.24       0.00      1      1 -1084.99
## mod.null.hun2     4 2266.99      72.75      0      1 -1129.48

Conclusions: In both cases, the two models are different and the AIC score is much lower when we include the “Species” term. “Species” therefore still seems to be contributing importantly to this model.

The final version of these two models will be: Diff.Ten ~ 0 + Species + (1|Park/Site) and Diff.Hun ~ 0 + Species + (1|Park/Site)

Run the final models and check residual plots

# Run the newest model.
mod.final.ten <- lmer(data.allparks.wide$Diff.Ten ~ 0 + Species + (1|Park/Site), 
                   data= data.allparks.wide, REML = FALSE)

mod.final.hun <- lmer(data.allparks.wide$Diff.Hun ~ 0 + Species + (1|Park/Site), 
                   data= data.allparks.wide, REML = FALSE)

# Plot the residuals. 
plot(fitted(mod.final.ten),residuals(mod.final.ten))

hist(residuals(mod.final.ten))

qqnorm(residuals(mod.final.ten))

plot(fitted(mod.final.hun),residuals(mod.final.hun))

hist(residuals(mod.final.hun))

qqnorm(residuals(mod.final.hun))

# None of these look any different or better.

Conclusions: The residual plots all look approximiately the same as before, no real improvement here.

Look at differences among species

# Look at the model outputs.
summary(mod.final.ten)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: data.allparks.wide$Diff.Ten ~ 0 + Species + (1 | Park/Site)
##    Data: data.allparks.wide
## 
##      AIC      BIC   logLik deviance df.resid 
##   7858.3   7919.7  -3917.2   7834.3     1218 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -11.3014  -0.1110   0.0229   0.1971   8.1057 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  Site:Park (Intercept)  1.0665  1.0327  
##  Park      (Intercept)  0.3266  0.5715  
##  Residual              33.0196  5.7463  
## Number of obs: 1230, groups:  Site:Park, 185; Park, 8
## 
## Fixed effects:
##              Estimate Std. Error t value
## SpeciesEPFU  0.829959   0.475386   1.746
## SpeciesLABO -1.230168   0.475386  -2.588
## SpeciesLACI  0.005076   0.475386   0.011
## SpeciesLANO  0.332668   0.475386   0.700
## SpeciesMYLU -4.211837   0.475386  -8.860
## SpeciesMYSE -0.507552   0.475386  -1.068
## SpeciesMYSO -1.012359   1.194660  -0.847
## SpeciesNYHU -1.380982   1.194660  -1.156
## SpeciesPESU -0.478630   0.750598  -0.638
## 
## Correlation of Fixed Effects:
##             SpEPFU SpLABO SpLACI SpLANO SpMYLU SpMYSE SpMYSO SpNYHU
## SpeciesLABO 0.210                                                  
## SpeciesLACI 0.210  0.210                                           
## SpeciesLANO 0.210  0.210  0.210                                    
## SpeciesMYLU 0.210  0.210  0.210  0.210                             
## SpeciesMYSE 0.210  0.210  0.210  0.210  0.210                      
## SpeciesMYSO 0.086  0.086  0.086  0.086  0.086  0.086               
## SpeciesNYHU 0.086  0.086  0.086  0.086  0.086  0.086  0.110        
## SpeciesPESU 0.132  0.132  0.132  0.132  0.132  0.132  0.090  0.090

confint(mod.final.ten)

## Computing profile confidence intervals ...

##                  2.5 %     97.5 %
## .sig01       0.0000000  1.6137754
## .sig02       0.0000000  1.2874074
## .sigma       5.5078637  6.0023864
## SpeciesEPFU -0.1268021  1.7691051
## SpeciesLABO -2.1869291 -0.2910220
## SpeciesLACI -0.9516848  0.9442223
## SpeciesLANO -0.6240929  1.2718142
## SpeciesMYLU -5.1685978 -3.2726906
## SpeciesMYSE -1.4643135  0.4315937
## SpeciesMYSO -3.4401746  1.3903374
## SpeciesNYHU -3.8087977  1.0217143
## SpeciesPESU -1.9915921  0.9997615

car::Anova(mod.final.ten)

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: data.allparks.wide$Diff.Ten
##          Chisq Df Pr(>Chisq)    
## Species 101.35  9  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(mod.final.hun)

## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: data.allparks.wide$Diff.Hun ~ 0 + Species + (1 | Park/Site)
##    Data: data.allparks.wide
## 
##      AIC      BIC   logLik deviance df.resid 
##   2194.0   2255.4  -1085.0   2170.0     1218 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -11.3014  -0.1110   0.0229   0.1971   8.1057 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  Site:Park (Intercept) 0.010665 0.10327 
##  Park      (Intercept) 0.003266 0.05715 
##  Residual              0.330196 0.57463 
## Number of obs: 1230, groups:  Site:Park, 185; Park, 8
## 
## Fixed effects:
##               Estimate Std. Error t value
## SpeciesEPFU  0.0829959  0.0475386   1.746
## SpeciesLABO -0.1230168  0.0475386  -2.588
## SpeciesLACI  0.0005076  0.0475386   0.011
## SpeciesLANO  0.0332668  0.0475386   0.700
## SpeciesMYLU -0.4211837  0.0475386  -8.860
## SpeciesMYSE -0.0507552  0.0475386  -1.068
## SpeciesMYSO -0.1012359  0.1194660  -0.847
## SpeciesNYHU -0.1380982  0.1194660  -1.156
## SpeciesPESU -0.0478630  0.0750598  -0.638
## 
## Correlation of Fixed Effects:
##             SpEPFU SpLABO SpLACI SpLANO SpMYLU SpMYSE SpMYSO SpNYHU
## SpeciesLABO 0.210                                                  
## SpeciesLACI 0.210  0.210                                           
## SpeciesLANO 0.210  0.210  0.210                                    
## SpeciesMYLU 0.210  0.210  0.210  0.210                             
## SpeciesMYSE 0.210  0.210  0.210  0.210  0.210                      
## SpeciesMYSO 0.086  0.086  0.086  0.086  0.086  0.086               
## SpeciesNYHU 0.086  0.086  0.086  0.086  0.086  0.086  0.110        
## SpeciesPESU 0.132  0.132  0.132  0.132  0.132  0.132  0.090  0.090

confint(mod.final.hun)

## Computing profile confidence intervals ...

##                   2.5 %      97.5 %
## .sig01       0.00000000  0.16137754
## .sig02       0.00000000  0.12874075
## .sigma       0.55078637  0.60023864
## SpeciesEPFU -0.01268021  0.17691051
## SpeciesLABO -0.21869291 -0.02910220
## SpeciesLACI -0.09516848  0.09442223
## SpeciesLANO -0.06240929  0.12718142
## SpeciesMYLU -0.51685978 -0.32726906
## SpeciesMYSE -0.14643135  0.04315937
## SpeciesMYSO -0.34401746  0.13903374
## SpeciesNYHU -0.38087977  0.10217143
## SpeciesPESU -0.19915921  0.09997615

car::Anova(mod.final.hun)

## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: data.allparks.wide$Diff.Hun
##          Chisq Df Pr(>Chisq)    
## Species 101.35  9  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# Run a hypothesis test to check each species against the null hyp of "equal to zero". 
hyptest.species.ten <- glht(mod.final.ten, linfct= (Species = c("SpeciesEPFU = 0",  "SpeciesLABO = 0", "SpeciesLACI = 0", "SpeciesLANO = 0", "SpeciesMYLU = 0", "SpeciesMYSE = 0", "SpeciesMYSO = 0", "SpeciesNYHU = 0", "SpeciesPESU = 0")))
summary(hyptest.species.ten)

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lmer(formula = data.allparks.wide$Diff.Ten ~ 0 + Species + (1 | 
##     Park/Site), data = data.allparks.wide, REML = FALSE)
## 
## Linear Hypotheses:
##                   Estimate Std. Error z value Pr(>|z|)    
## SpeciesEPFU == 0  0.829959   0.475386   1.746   0.5132    
## SpeciesLABO == 0 -1.230168   0.475386  -2.588   0.0816 .  
## SpeciesLACI == 0  0.005076   0.475386   0.011   1.0000    
## SpeciesLANO == 0  0.332668   0.475386   0.700   0.9968    
## SpeciesMYLU == 0 -4.211837   0.475386  -8.860   <1e-04 ***
## SpeciesMYSE == 0 -0.507552   0.475386  -1.068   0.9441    
## SpeciesMYSO == 0 -1.012359   1.194660  -0.847   0.9873    
## SpeciesNYHU == 0 -1.380982   1.194660  -1.156   0.9124    
## SpeciesPESU == 0 -0.478630   0.750598  -0.638   0.9984    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

hyptest.species.hun <- glht(mod.final.hun, linfct= (Species = c("SpeciesEPFU = 0",  "SpeciesLABO = 0", "SpeciesLACI = 0", "SpeciesLANO = 0", "SpeciesMYLU = 0", "SpeciesMYSE = 0", "SpeciesMYSO = 0", "SpeciesNYHU = 0", "SpeciesPESU = 0")))
summary(hyptest.species.hun)

## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: lmer(formula = data.allparks.wide$Diff.Hun ~ 0 + Species + (1 | 
##     Park/Site), data = data.allparks.wide, REML = FALSE)
## 
## Linear Hypotheses:
##                    Estimate Std. Error z value Pr(>|z|)    
## SpeciesEPFU == 0  0.0829959  0.0475386   1.746   0.5132    
## SpeciesLABO == 0 -0.1230168  0.0475386  -2.588   0.0816 .  
## SpeciesLACI == 0  0.0005076  0.0475386   0.011   1.0000    
## SpeciesLANO == 0  0.0332668  0.0475386   0.700   0.9968    
## SpeciesMYLU == 0 -0.4211837  0.0475386  -8.860   <1e-04 ***
## SpeciesMYSE == 0 -0.0507552  0.0475386  -1.068   0.9441    
## SpeciesMYSO == 0 -0.1012359  0.1194660  -0.847   0.9873    
## SpeciesNYHU == 0 -0.1380982  0.1194660  -1.156   0.9124    
## SpeciesPESU == 0 -0.0478630  0.0750598  -0.638   0.9984    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

Conclusions: Same result as previously. There is one species that is significantly different than zero (MYLU).

Overall Conclusions: Dividing “Differences” data by 10 or by 100 does not change the selected model, nor does it change the results in terms of significant species or park differences. As before, differences in calls per night between 2017 and 2016 depend on species, when we account for inherent variation among the parks and sites. For most species we did not observe a significant difference in calls per night between 2017 and 2016. However, for MYLU there was a significant difference, specifically there was a decline from 2016 to 2017.

Bat data models with Differences Divided

Katy Goodwin

February 15, 2019

Set up stuff

Data organization

Dividing Difference Data by 10 or 100

Plot 1 - histogram of counts

Plot 2 - histogram of densities, density estimate, and probability density function

Plot 3 - boxplots of Calls/Night Differences by park and species

Test models with “Park” and “Species” as fixed effects vs. a null model

First do it with Differences / 10

Then do it with Differences / 100

Look at differences among parks

First do it with Differences / 10

Then do it with Differences / 100

Test a new version of the “Species” vs. null models, now with Park/Site as nested random effects

First for Differences / 10

Then for Differences / 10

Run the final models and check residual plots

Look at differences among species