Introduction

Dyje bird surveys. Is there an effect of angling on the avian biodiversity?

Data

There are three different methods and a summary. First step was to create a tibble with the first three sheets, adding a fourth column for method, cleaning the names, and coding variables as factor/integer as needed.

There were also some non-bird species observations (Lutra, Ondatra, Dama dama, chiroptera…) that were removed.

#Raw file in folder raw. Made a copy of XXbrid 2019, named it "birds" and deleted last summary sheet

path <- here("data", "birds.xlsx")

raw  <- path %>%
  excel_sheets() %>% #collects all sheets in one tibble
  set_names() %>%
  map_df(read_excel,
         path = path,
         .id = "method") #adds a column to identify each sheet


tidy_visit <- raw %>% 
  clean_names(., "small_camel") %>%
  # dplyr::mutate(angling = recode(angling, 
  #                                 "yes"= "1",
  #                                 "no" = "0")
  #                                    ) %>% 
  dplyr::mutate(name = as.factor(name)) %>% 
  dplyr::rename(site = name) %>% 
  dplyr::mutate(species = as.factor(species)) %>% 
  dplyr::mutate(angling = as.factor(angling)) %>% 
  dplyr::mutate(method = as.factor(method)) 
glimpse(tidy_visit)
## Rows: 732
## Columns: 5
## $ method         <fct> sitting, sitting, sitting, sitting, sitting, sitting, s~
## $ site           <fct> T022, T059, T059, T059, T022, T059, T087, T087, T096, T~
## $ angling        <fct> no, no, no, no, no, no, no, no, yes, yes, yes, yes, yes~
## $ species        <fct> Alcedo atthis, Anas platyrhynchos, Cygnus olor, Lutra l~
## $ countOfSpecies <dbl> 1, 6, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 5, 1, 6, 1, 1, 1~
veganData <- tidy_visit %>% # Data for package vegan.
  pivot_wider(names_from = species, values_from = countOfSpecies) %>% #species as variables
  dplyr::select(!c(no, omitted, "Dama dama", "Lutra lutra","Castor fiber", Chiroptera, "Ondatra zibethica")) %>% #remove invalid species 
  mutate_all(~replace(., is.na(.), 0))  # replace NA with 0
  
  veganNoAccidental <- veganData %>% 
    dplyr::filter(method == c("sitting","improved")) # Only data from improved and sitting methods
  
  veganSitting <- veganData %>% 
    dplyr::filter(method == "sitting") # Only data from sitting method
  
veganImproved <- veganData %>% 
    dplyr::filter(method == "improved") # Only data from improved method

Does species diversity vary across sites?

We will compute three measures of species diversity for each site, using only the improved method.

  • Richness, the total number of unique species found at a site
  • Shannon-Weiner diversity, a common diversity index where species are weighted by abundance
  • Rarefied richness, or species richness estimated with a resampling procedure to control for variation in sample size.

I used the vegan package in R to calculate those indexes (see code)

# bird abundance matrix
abundance.matrix <- veganImproved[,4:73]

# store computed indices in a new data frame called 'indices'
indices <- veganImproved[,c("method","site","angling")]
indices$Richness <- rowSums(abundance.matrix>0)
indices$Shannon <- diversity(abundance.matrix) # shannon is default
indices$Rarefied <- rarefy(abundance.matrix, sample = 5)

Check normality by groups

Shapiro wilk test by goups

rich <- indices %>%
  group_by(angling) %>%
  shapiro_test(Richness)

shan <- indices %>%
  group_by(angling) %>%
  shapiro_test(Shannon)

rar <-indices %>%
  group_by(angling) %>%
  shapiro_test(Rarefied)

rich
## # A tibble: 2 x 4
##   angling variable statistic       p
##   <fct>   <chr>        <dbl>   <dbl>
## 1 no      Richness     0.964 0.221  
## 2 yes     Richness     0.940 0.00338
shan
## # A tibble: 2 x 4
##   angling variable statistic       p
##   <fct>   <chr>        <dbl>   <dbl>
## 1 no      Shannon      0.904 0.00246
## 2 yes     Shannon      0.947 0.00732
rar
## # A tibble: 2 x 4
##   angling variable statistic            p
##   <fct>   <chr>        <dbl>        <dbl>
## 1 no      Rarefied     0.862 0.000178    
## 2 yes     Rarefied     0.806 0.0000000812

At least one of each group is not normally distributed

QQ plot by group

richPlot <- ggqqplot(indices, x = "Richness", facet.by = "angling")
shanPlot <- ggqqplot(indices, x = "Shannon", facet.by = "angling")
rarPlot <- ggqqplot(indices, x = "Rarefied", facet.by = "angling")

richPlot

shanPlot

rarPlot

Those look like many 0s…

Check the equality of variances

This can be done using the Levene’s test. If the variances of groups are equal, the p-value should be greater than 0.05.

richLev <- indices %>% 
  levene_test(Richness ~ angling)

shanLev <- indices %>% 
  levene_test(Shannon ~ angling)

rarLev <- indices %>% 
  levene_test(Rarefied ~ angling)

richLev
## # A tibble: 1 x 4
##     df1   df2 statistic       p
##   <int> <int>     <dbl>   <dbl>
## 1     1   103      7.76 0.00635
shanLev
## # A tibble: 1 x 4
##     df1   df2 statistic     p
##   <int> <int>     <dbl> <dbl>
## 1     1   103    0.0437 0.835
rarLev
## # A tibble: 1 x 4
##     df1   df2 statistic      p
##   <int> <int>     <dbl>  <dbl>
## 1     1   103      3.99 0.0485

Richness variances are not equal in the two groups. They are for Shannon ids and Rarefied

Wilcoxon rank sum test

We will use the Wilcoxon rank sum test, which is a non-parametric alternative to the independent two samples t-test for comparing two independent groups of samples, in the situation where the data are not normally distributed (I write these things to learn them myself by repetition :D )

Summary statistics

First, some summary statistics by groups: median and interquartile range (IQR).

richStats <- indices %>%
  group_by(angling) %>%
  get_summary_stats(Richness, type = "median_iqr")

shanStats <- indices %>%
  group_by(angling) %>%
  get_summary_stats(Shannon, type = "median_iqr")

rarStats <- indices %>%
  group_by(angling) %>%
  get_summary_stats(Rarefied, type = "median_iqr")

richStats
## # A tibble: 2 x 5
##   angling variable     n median   iqr
##   <fct>   <chr>    <dbl>  <dbl> <dbl>
## 1 no      Richness    40      4  3.25
## 2 yes     Richness    65      6  6
shanStats
## # A tibble: 2 x 5
##   angling variable     n median   iqr
##   <fct>   <chr>    <dbl>  <dbl> <dbl>
## 1 no      Shannon     40   1.39 0.839
## 2 yes     Shannon     65   1.75 1.06
rarStats
## # A tibble: 2 x 5
##   angling variable     n median   iqr
##   <fct>   <chr>    <dbl>  <dbl> <dbl>
## 1 no      Rarefied    40   4     1.97
## 2 yes     Rarefied    65   4.52  1.72

Boxplots

bxpRich <- ggboxplot(
  indices, x = "angling", y = "Richness", 
  ylab = "Richness", xlab = "Angling", add = "jitter"
  )
bxpRich

bxpShan <- ggboxplot(
  indices, x = "angling", y = "Shannon", 
  ylab = "Shannon-Weiner diversity", xlab = "Angling", add = "jitter"
  )
bxpShan

bxpRar <- ggboxplot(
  indices, x = "angling", y = "Rarefied", 
  ylab = "Rarefied richness", xlab = "Angling", add = "jitter"
  )
bxpRar

Is there any significant difference in richness/Shannon diversity/Rarefied richness between angling and not angling sites?

wilcoxRich <- indices %>% 
  rstatix::wilcox_test(Richness ~ angling) %>%
  add_significance()
wilcoxRich
## # A tibble: 1 x 8
##   .y.      group1 group2    n1    n2 statistic       p p.signif
## * <chr>    <chr>  <chr>  <int> <int>     <dbl>   <dbl> <chr>   
## 1 Richness no     yes       40    65      902. 0.00846 **
wilcoxShan <- indices %>% 
  rstatix::wilcox_test(Shannon ~ angling) %>%
  add_significance()
wilcoxShan
## # A tibble: 1 x 8
##   .y.     group1 group2    n1    n2 statistic       p p.signif
## * <chr>   <chr>  <chr>  <int> <int>     <dbl>   <dbl> <chr>   
## 1 Shannon no     yes       40    65       899 0.00808 **
wilcoxRar <- indices %>% 
  rstatix::wilcox_test(Rarefied ~ angling) %>%
  add_significance()
wilcoxRar
## # A tibble: 1 x 8
##   .y.      group1 group2    n1    n2 statistic     p p.signif
## * <chr>    <chr>  <chr>  <int> <int>     <dbl> <dbl> <chr>   
## 1 Rarefied no     yes       40    65     1074. 0.134 ns

Effect size

library(coin)
effRich <- indices %>% wilcox_effsize(Richness ~ angling)

effShan <- indices %>% wilcox_effsize(Shannon ~ angling)

effRar <- indices %>% wilcox_effsize(Rarefied ~ angling)

effRich
## # A tibble: 1 x 7
##   .y.      group1 group2 effsize    n1    n2 magnitude
## * <chr>    <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Richness no     yes      0.257    40    65 small
effShan
## # A tibble: 1 x 7
##   .y.     group1 group2 effsize    n1    n2 magnitude
## * <chr>   <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Shannon no     yes      0.259    40    65 small
effRar
## # A tibble: 1 x 7
##   .y.      group1 group2 effsize    n1    n2 magnitude
## * <chr>    <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 Rarefied no     yes      0.147    40    65 small

Report

The summary result paragraphs could be something like the ones below.

The interpretation I’d make is that there is likely a previous bias on the sites - fishermen started fishing at the prettiest sites, then those sites were officially allowed - but I’d need to be more familiar with the data to be able to interpret it properly. Anyway, the effect sizes are pretty small, there are many zeroes… in general, these analyses should be treated with care.

  • The median richness in sites were angling was not allowed was 4 (IQR = 3.25), whereas the median in angling sites was 6 (IQR = 6). The Wilcoxon test showed that the difference was significant (p < 0.01, effect size r = 0.26).

  • The median Shannon-Weiner diversity index in sites were angling was not allowed was 1.39 (IQR = 0.84), whereas the median in angling sites was 1.75 (IQR = 1.07). The Wilcoxon test showed that the difference was significant (p < 0.01, effect size r = 0.26).

  • The median Rarefied richness in sites were angling was not allowed was 4 (IQR = 1.97), whereas the median in angling sites was 4.52 (IQR = 1.72). The Wilcoxon test showed that the difference was not significant (p > 0.05, effect size r = 0.15).

Publication-ready plots

wilcoxRich <- wilcoxRich %>% add_xy_position(x = "angling")
bxpRich + 
  stat_pvalue_manual(wilcoxRich, tip.length = 0) +
  labs(subtitle = get_test_label(wilcoxRich, detailed = TRUE))

wilcoxShan <- wilcoxShan %>% add_xy_position(x = "angling")
bxpShan + 
  stat_pvalue_manual(wilcoxShan, tip.length = 0) +
  labs(subtitle = get_test_label(wilcoxShan, detailed = TRUE))

wilcoxRar <- wilcoxRar %>% add_xy_position(x = "angling")
bxpRar + 
  stat_pvalue_manual(wilcoxRar, tip.length = 0) +
  labs(subtitle = get_test_label(wilcoxRar, detailed = TRUE))