Introduction

This document is to analyze the vegetation-mammal community dataset.

library(ggplot2)
library(dplyr)
library(vegan)
library(rstatix)
library(ggpubr)

setwd("C:/GitHub Projects/enec-562/Final Project")
mam <- read.csv("master.avg.csv")

mam <- mam %>% filter(rowSums(select(.,7:13) != 0) > 0) %>%
    select(-avg.rara)

#some sites got 0, must excludec

Select and check data

Select only count data:

mam1 <- mam[,23:29] #exclude the columns that contains extra information that is not mammal count

Check for skewness and kurtosis:

library(moments) #load moments package to check for kurtosis and skewness
skewness(mam1)

## avg.pele avg.sihi avg.rehu avg.orpa avg.mumu avg.mipi avg.mipe 
## 1.215448 3.480495 3.245557 4.409845 9.504290 6.276279 4.535420

kurtosis(mam1)

##   avg.pele   avg.sihi   avg.rehu   avg.orpa   avg.mumu   avg.mipi   avg.mipe 
##   4.701003  15.475685  13.212622  20.446729 112.557216  43.568847  26.274031

As expect, count data are highly skewed. NMDS is the best!

Plot the results

anh_theme <- function() {
  theme(axis.text.x = element_text(hjust = 1)) +
    theme(axis.text.y = element_text(colour = "black", size = 10, face = "bold"), 
          axis.text.x = element_text(colour = "black", face = "bold", size = 10), 
          legend.text = element_text(size = 10, face ="bold", colour ="black"), 
          legend.position = "right", axis.title.y = element_text(face = "bold", size = 7), 
          axis.title.x = element_text(face = "bold", size = 12, colour = "black", margin = margin(t=5)), 
          axis.title.y.left = element_text(face = "bold", size = 12, colour = "black",margin=margin(r=5)),
          legend.title = element_text(size = 10, colour = "black", face = "bold"), 
          panel.background = element_blank(), panel.border = element_rect(colour = "black", fill = NA, size = 1.2),
          legend.key=element_blank(),
          plot.title = element_text(color = "black", size = 18, face = "bold", hjust = 0.5))
}

#import data
setwd("C:/GitHub Projects/enec-562/Final Project")

library(readxl)
mam.scores <- read_excel("mam.scores.xlsx")

ggplot(mam.scores, aes(x = NMDS1, y = NMDS2)) + 
    geom_point(size = 2,aes( color = Restored)) + # geom_polygon(data=data.scores.dplyr,aes(x=NMDS1,y=NMDS2,group=Subwatershed),alpha=0.30); this is to add a little convex polygon to visualize the clusters better. You can try to see how it looks. 
    scale_color_manual(values = c("red","green"))

    # scale_shape_manual(values = c(18, 19,20,21,22,23,24, 25,26)) + anh_theme()

From the look of the graph and investigation of data, restored sites seem to be very homogeneous, whereas there is much variability in composition at non-restored sites.

Let’s try MRPP.

MRPP of different groups

Restored

mam_restored <- mam1 %>% mutate(Restored = mam$Restored)
# Run MRPP
restored_mrpp <- mrpp(vegdist(mam_restored[,-8], method="bray"), mam_restored$Restored, 
                    distance = "bray", permutations = 999)

# Print the results
print(restored_mrpp)

## 
## Call:
## mrpp(dat = vegdist(mam_restored[, -8], method = "bray"), grouping = mam_restored$Restored,      permutations = 999, distance = "bray") 
## 
## Dissimilarity index: bray 
## Weights for groups:  n 
## 
## Class means and counts:
## 
##       No     Yes   
## delta 0.5574 0.2474
## n     157    67    
## 
## Chance corrected within-group agreement A: 0.05813 
## Based on observed delta 0.4647 and expected delta 0.4934 
## 
## Significance of delta: 0.001 
## Permutation: free
## Number of permutations: 999

MRPP, or Multi Response Permutation Procedure, is a statistical test used to assess whether there are significant differences between groups of sampling units based on a dissimilarity matrix calculated from multiple variables. MRPP is also great to use when data violate assumptions of normality or homogeneity of variances required by traditional parametric tests.

Results show that restored and non-restored sites have significantly different composition of mammals (p = 0.001).

Site

mam_site <- mam1 %>% mutate(Site = mam$Site)
# Run MRPP
site_mrpp <- mrpp(vegdist(mam_site[,-8], method="bray"), mam_site$Site, 
                    distance = "bray", permutations = 999)

# Print the results
print(site_mrpp)

## 
## Call:
## mrpp(dat = vegdist(mam_site[, -8], method = "bray"), grouping = mam_site$Site,      permutations = 999, distance = "bray") 
## 
## Dissimilarity index: bray 
## Weights for groups:  n 
## 
## Class means and counts:
## 
##       A      B      C      CF1    CF2    CF3    D      E     
## delta 0.5693 0.4585 0.3608 0.2767 0.2144 0.2571 0.7232 0.3039
## n     31     35     33     24     18     25     31     27    
## 
## Chance corrected within-group agreement A: 0.1571 
## Based on observed delta 0.4159 and expected delta 0.4934 
## 
## Significance of delta: 0.001 
## Permutation: free
## Number of permutations: 999

Results show that traps at different sites have significantly different composition of mammals (p = 0.001).

Community diversity indices comparison

Shannon index

“The index takes into account the number of species living in a habitat (richness) and their relative abundance (evenness).”

mam_shannon = as.data.frame(diversity(mam1, index = "shannon")) %>% 
  mutate(Restored = mam$Restored)
head(mam_shannon)

##   diversity(mam1, index = "shannon") Restored
## 1                          0.0000000       No
## 2                          0.0000000       No
## 3                          0.6931472       No
## 4                          0.6362829       No
## 5                          0.6362829       No
## 6                          0.0000000       No

Test for Shannon index between restored and non-restored sites

mam_shannon = mam_shannon %>% mutate(log.shannon = log(mam_shannon$`diversity(mam1, index = "shannon")`)) %>% rename("shannon" = `diversity(mam1, index = "shannon")`)

wilcox.test(shannon ~ Restored, data=mam_shannon)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  shannon by Restored
## W = 6834, p-value = 6.303e-07
## alternative hypothesis: true location shift is not equal to 0

Simpson index

“The Shannon index stresses the richness component and rare cover types, whilst the Simpson index lays greater emphasis on the evenness component and on the dominant cover types (McGarigal and Marks, 1994, Haines-Young and Chopping, 1996, Riitters, Wickham, Vogelmann and Jones, 2000).”

Simpson gives more weight to dominant and common species.

mam_simpson = as.data.frame(diversity(mam1, index = "simpson")) %>% 
  mutate(Restored = mam$Restored)
head(mam_simpson)

##   diversity(mam1, index = "simpson") Restored
## 1                           0.000000       No
## 2                           0.000000       No
## 3                           0.500000       No
## 4                           0.444222       No
## 5                           0.444222       No
## 6                           0.000000       No

mam_simpson = mam_simpson %>% rename("simpson" = `diversity(mam1, index = "simpson")`)

wilcox.test(simpson ~ Restored, data=mam_simpson)

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  simpson by Restored
## W = 6834, p-value = 6.303e-07
## alternative hypothesis: true location shift is not equal to 0

Mammal Vegetation Community Analysis

Nguyen Tien Anh Quach

2024-04-16