Introduction

Dyje bird surveys. Is there an effect of human disturbance on avian biodiversity?

Data

There are three different methods and a summary. First step was to create a tibble with the first three sheets, adding a fourth column for method, cleaning the names, and coding variables as factor/integer as needed.

There were also some non-bird species observations (Lutra, Ondatra, Dama dama, chiroptera…) that were removed.

#Raw file in folder raw. Made a copy of XXbrid 2019, named it "disturbance" and deleted last summary sheet


disturbance_raw <- read_excel(here ("data", "raw data", "data birds for test.xlsx"))

write.csv(disturbance_raw, here("data", "disturbance.csv"))


tidy_dist <- disturbance_raw %>% 
  clean_names(., "small_camel") %>%
  # dplyr::mutate(angling = recode(angling, 
  #                                 "yes"= "1",
  #                                 "no" = "0")
  #                                    ) %>% 
  dplyr::mutate(species = as.factor(species)) %>% 
  dplyr::mutate(angling = as.factor(angling)) %>% 
  dplyr::mutate(disturbance = as.factor(disturbance)) 
  
  
  vegan_data <- tidy_dist %>% # Data for package vegan.
  pivot_wider(names_from = species, values_from = countOfSpecies) %>% #species as variables
  dplyr::select(!c(no, omitted, "Dama dama", "Castor fiber")) %>% #remove invalid species 
  mutate_all(~replace(., is.na(.), 0))  # replace NA with 0

Does species diversity vary across sites?

We will compute three measures of species diversity for each site, using only the improved method.

  • Richness, the total number of unique species found at a site
  • Shannon-Weiner diversity, a common diversity index where species are weighted by abundance
  • Rarefied richness, or species richness estimated with a resampling procedure to control for variation in sample size.

I used the vegan package in R to calculate those indexes (see code)

# bird abundance matrix
abundance.matrix <- vegan_data[,4:66]

# store computed indices in a new data frame called 'indices'
indices <- vegan_data[,c("name","disturbance")]
indices$Richness <- rowSums(abundance.matrix>0)
indices$Shannon <- diversity(abundance.matrix) # shannon is default
indices$Rarefied <- rarefy(abundance.matrix, sample = 5)

Check normality by groups

Shapiro wilk test by goups

rich <- indices %>%
  group_by(disturbance) %>%
  shapiro_test(Richness)

shan <- indices %>%
  group_by(disturbance) %>%
  shapiro_test(Shannon)

rar <-indices %>%
  group_by(disturbance) %>%
  shapiro_test(Rarefied)

rich
## # A tibble: 2 x 4
##   disturbance variable statistic       p
##   <fct>       <chr>        <dbl>   <dbl>
## 1 no          Richness     0.954 0.238  
## 2 yes         Richness     0.947 0.00310
shan
## # A tibble: 2 x 4
##   disturbance variable statistic       p
##   <fct>       <chr>        <dbl>   <dbl>
## 1 no          Shannon      0.914 0.0210 
## 2 yes         Shannon      0.946 0.00273
rar
## # A tibble: 2 x 4
##   disturbance variable statistic            p
##   <fct>       <chr>        <dbl>        <dbl>
## 1 no          Rarefied     0.888 0.00501     
## 2 yes         Rarefied     0.802 0.0000000103

At least one of each group is not normally distributed

QQ plot by group

richPlot <- ggqqplot(indices, x = "Richness", facet.by = "disturbance")
shanPlot <- ggqqplot(indices, x = "Shannon", facet.by = "disturbance")
rarPlot <- ggqqplot(indices, x = "Rarefied", facet.by = "disturbance")

richPlot

shanPlot

rarPlot

Those look like many 0s…

Check the equality of variances

This can be done using the Levene’s test. If the variances of groups are equal, the p-value should be greater than 0.05.

richLev <- indices %>% 
  levene_test(Richness ~ disturbance)

shanLev <- indices %>% 
  levene_test(Shannon ~ disturbance)

rarLev <- indices %>% 
  levene_test(Rarefied ~ disturbance)

richLev
## # A tibble: 1 x 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1     1   103      2.73 0.102
shanLev
## # A tibble: 1 x 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1     1   103    0.0414 0.839
rarLev
## # A tibble: 1 x 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1     1   103     0.890 0.348

Richness variances are equal in the two groups.

Wilcoxon rank sum test

We will use the Wilcoxon rank sum test, which is a non-parametric alternative to the independent two samples t-test for comparing two independent groups of samples, in the situation where the data are not normally distributed (I write these things to learn them myself by repetition :D )

Summary statistics

First, some summary statistics by groups: median and interquartile range (IQR).

richStats <- indices %>%
  group_by(disturbance) %>%
  get_summary_stats(Richness, type = "median_iqr")

shanStats <- indices %>%
  group_by(disturbance) %>%
  get_summary_stats(Shannon, type = "median_iqr")

rarStats <- indices %>%
  group_by(disturbance) %>%
  get_summary_stats(Rarefied, type = "median_iqr")

richStats
## # A tibble: 2 x 5
##   disturbance variable     n median   iqr
##   <fct>       <chr>    <dbl>  <dbl> <dbl>
## 1 no          Richness    29      4     3
## 2 yes         Richness    76      5     6
shanStats
## # A tibble: 2 x 5
##   disturbance variable     n median   iqr
##   <fct>       <chr>    <dbl>  <dbl> <dbl>
## 1 no          Shannon     29   1.39 0.693
## 2 yes         Shannon     76   1.61 1.01
rarStats
## # A tibble: 2 x 5
##   disturbance variable     n median   iqr
##   <fct>       <chr>    <dbl>  <dbl> <dbl>
## 1 no          Rarefied    29   4     1.64
## 2 yes         Rarefied    76   4.47  1.76

Boxplots

bxpRich <- ggboxplot(
  indices, x = "disturbance", y = "Richness", 
  ylab = "Richness", xlab = "Disturbance", add = "jitter"
  )
bxpRich

bxpShan <- ggboxplot(
  indices, x = "disturbance", y = "Shannon", 
  ylab = "Shannon-Weiner diversity", xlab = "Disturbance", add = "jitter"
  )
bxpShan

bxpRar <- ggboxplot(
  indices, x = "disturbance", y = "Rarefied", 
  ylab = "Rarefied richness", xlab = "Disturbance", add = "jitter"
  )
bxpRar

Is there any significant difference in richness/Shannon diversity/Rarefied richness between disturbance and not disturbance sites?

wilcoxRich <- indices %>% 
  rstatix::wilcox_test(Richness ~ disturbance) %>%
  add_significance()
wilcoxRich
## # A tibble: 1 x 8
##   .y.      group1 group2    n1    n2 statistic      p p.signif
## * <chr>    <chr>  <chr>  <int> <int>     <dbl>  <dbl> <chr>   
## 1 Richness no     yes       29    76      868. 0.0933 ns
wilcoxShan <- indices %>% 
  rstatix::wilcox_test(Shannon ~ disturbance) %>%
  add_significance()
wilcoxShan
## # A tibble: 1 x 8
##   .y.     group1 group2    n1    n2 statistic      p p.signif
## * <chr>   <chr>  <chr>  <int> <int>     <dbl>  <dbl> <chr>   
## 1 Shannon no     yes       29    76      864. 0.0887 ns
wilcoxRar <- indices %>% 
  rstatix::wilcox_test(Rarefied ~ disturbance) %>%
  add_significance()
wilcoxRar
## # A tibble: 1 x 8
##   .y.      group1 group2    n1    n2 statistic     p p.signif
## * <chr>    <chr>  <chr>  <int> <int>     <dbl> <dbl> <chr>   
## 1 Rarefied no     yes       29    76      906.  0.16 ns

Effect size

library(coin)
effRich <- indices %>% wilcox_effsize(Richness ~ disturbance)

effShan <- indices %>% wilcox_effsize(Shannon ~ disturbance)

effRar <- indices %>% wilcox_effsize(Rarefied ~ disturbance)

effRich
## # A tibble: 1 x 7
##   .y.      group1 group2 effsize    n1    n2 magnitude
## * <chr>    <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Richness no     yes      0.164    29    76 small
effShan
## # A tibble: 1 x 7
##   .y.     group1 group2 effsize    n1    n2 magnitude
## * <chr>   <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Shannon no     yes      0.166    29    76 small
effRar
## # A tibble: 1 x 7
##   .y.      group1 group2 effsize    n1    n2 magnitude
## * <chr>    <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Rarefied no     yes      0.137    29    76 small

Report

The summary result paragraphs could be something like the ones below.

The effect sizes are pretty small, there are many zeroes… in general, these analyses should be treated with care.

  • The median richness in sites were disturbance was not allowed was 4 (IQR = 3), whereas the median in disturbance sites was 5 (IQR = 6). The Wilcoxon test showed that the difference was not significant (p = 0.09, effect size r = 0.16).

  • The median Shannon-Weiner diversity index in sites were disturbance was not allowed was 1.39 (IQR = 0.7), whereas the median in disturbance sites was 1.61 (IQR = 1.009). The Wilcoxon test showed that the difference was not significant (p = 0.09, effect size r = 0.17).

  • The median Rarefied richness in sites were disturbance was not allowed was 4 (IQR = 1.64), whereas the median in disturbance sites was 4.47 (IQR = 1.76). The Wilcoxon test showed that the difference was not significant (p = 0.16, effect size r = 0.14).

Publication-ready plots

wilcoxRich <- wilcoxRich %>% add_xy_position(x = "Disturbance")
bxpRich + 
  stat_pvalue_manual(wilcoxRich, tip.length = 0) +
  labs(subtitle = get_test_label(wilcoxRich, detailed = TRUE))

wilcoxShan <- wilcoxShan %>% add_xy_position(x = "Disturbance")
bxpShan + 
  stat_pvalue_manual(wilcoxShan, tip.length = 0) +
  labs(subtitle = get_test_label(wilcoxShan, detailed = TRUE))

wilcoxRar <- wilcoxRar %>% add_xy_position(x = "Disturbance")
bxpRar + 
  stat_pvalue_manual(wilcoxRar, tip.length = 0) +
  labs(subtitle = get_test_label(wilcoxRar, detailed = TRUE))