This document is to analyze the vegetation-mammal community dataset.
library(ggplot2)
library(dplyr)
library(vegan)
library(rstatix)
library(ggpubr)
setwd("C:/GitHub Projects/enec-562/Final Project")
mam <- read.csv("master.avg.csv")
mam <- mam %>% filter(rowSums(select(.,7:13) != 0) > 0) %>%
select(-avg.rara)
#some sites got 0, must excludec
Select only count data:
mam1 <- mam[,23:29] #exclude the columns that contains extra information that is not mammal count
Check for skewness and kurtosis:
library(moments) #load moments package to check for kurtosis and skewness
skewness(mam1)
## avg.pele avg.sihi avg.rehu avg.orpa avg.mumu avg.mipi avg.mipe
## 1.215448 3.480495 3.245557 4.409845 9.504290 6.276279 4.535420
kurtosis(mam1)
## avg.pele avg.sihi avg.rehu avg.orpa avg.mumu avg.mipi avg.mipe
## 4.701003 15.475685 13.212622 20.446729 112.557216 43.568847 26.274031
As expect, count data are highly skewed. NMDS is the best!
anh_theme <- function() {
theme(axis.text.x = element_text(hjust = 1)) +
theme(axis.text.y = element_text(colour = "black", size = 10, face = "bold"),
axis.text.x = element_text(colour = "black", face = "bold", size = 10),
legend.text = element_text(size = 10, face ="bold", colour ="black"),
legend.position = "right", axis.title.y = element_text(face = "bold", size = 7),
axis.title.x = element_text(face = "bold", size = 12, colour = "black", margin = margin(t=5)),
axis.title.y.left = element_text(face = "bold", size = 12, colour = "black",margin=margin(r=5)),
legend.title = element_text(size = 10, colour = "black", face = "bold"),
panel.background = element_blank(), panel.border = element_rect(colour = "black", fill = NA, size = 1.2),
legend.key=element_blank(),
plot.title = element_text(color = "black", size = 18, face = "bold", hjust = 0.5))
}
#import data
setwd("C:/GitHub Projects/enec-562/Final Project")
library(readxl)
mam.scores <- read_excel("mam.scores.xlsx")
ggplot(mam.scores, aes(x = NMDS1, y = NMDS2)) +
geom_point(size = 2,aes( color = Restored)) + # geom_polygon(data=data.scores.dplyr,aes(x=NMDS1,y=NMDS2,group=Subwatershed),alpha=0.30); this is to add a little convex polygon to visualize the clusters better. You can try to see how it looks.
scale_color_manual(values = c("red","green"))
# scale_shape_manual(values = c(18, 19,20,21,22,23,24, 25,26)) + anh_theme()
From the look of the graph and investigation of data, restored sites seem to be very homogeneous, whereas there is much variability in composition at non-restored sites.
Let’s try MRPP.
mam_restored <- mam1 %>% mutate(Restored = mam$Restored)
# Run MRPP
restored_mrpp <- mrpp(vegdist(mam_restored[,-8], method="bray"), mam_restored$Restored,
distance = "bray", permutations = 999)
# Print the results
print(restored_mrpp)
##
## Call:
## mrpp(dat = vegdist(mam_restored[, -8], method = "bray"), grouping = mam_restored$Restored, permutations = 999, distance = "bray")
##
## Dissimilarity index: bray
## Weights for groups: n
##
## Class means and counts:
##
## No Yes
## delta 0.5574 0.2474
## n 157 67
##
## Chance corrected within-group agreement A: 0.05813
## Based on observed delta 0.4647 and expected delta 0.4934
##
## Significance of delta: 0.001
## Permutation: free
## Number of permutations: 999
MRPP, or Multi Response Permutation Procedure, is a statistical test used to assess whether there are significant differences between groups of sampling units based on a dissimilarity matrix calculated from multiple variables. MRPP is also great to use when data violate assumptions of normality or homogeneity of variances required by traditional parametric tests.
Results show that restored and non-restored sites have significantly different composition of mammals (p = 0.001).
mam_site <- mam1 %>% mutate(Site = mam$Site)
# Run MRPP
site_mrpp <- mrpp(vegdist(mam_site[,-8], method="bray"), mam_site$Site,
distance = "bray", permutations = 999)
# Print the results
print(site_mrpp)
##
## Call:
## mrpp(dat = vegdist(mam_site[, -8], method = "bray"), grouping = mam_site$Site, permutations = 999, distance = "bray")
##
## Dissimilarity index: bray
## Weights for groups: n
##
## Class means and counts:
##
## A B C CF1 CF2 CF3 D E
## delta 0.5693 0.4585 0.3608 0.2767 0.2144 0.2571 0.7232 0.3039
## n 31 35 33 24 18 25 31 27
##
## Chance corrected within-group agreement A: 0.1571
## Based on observed delta 0.4159 and expected delta 0.4934
##
## Significance of delta: 0.001
## Permutation: free
## Number of permutations: 999
Results show that traps at different sites have significantly different composition of mammals (p = 0.001).
“The index takes into account the number of species living in a habitat (richness) and their relative abundance (evenness).”
mam_shannon = as.data.frame(diversity(mam1, index = "shannon")) %>%
mutate(Restored = mam$Restored)
head(mam_shannon)
## diversity(mam1, index = "shannon") Restored
## 1 0.0000000 No
## 2 0.0000000 No
## 3 0.6931472 No
## 4 0.6362829 No
## 5 0.6362829 No
## 6 0.0000000 No
mam_shannon = mam_shannon %>% mutate(log.shannon = log(mam_shannon$`diversity(mam1, index = "shannon")`)) %>% rename("shannon" = `diversity(mam1, index = "shannon")`)
wilcox.test(shannon ~ Restored, data=mam_shannon)
##
## Wilcoxon rank sum test with continuity correction
##
## data: shannon by Restored
## W = 6834, p-value = 6.303e-07
## alternative hypothesis: true location shift is not equal to 0
“The Shannon index stresses the richness component and rare cover types, whilst the Simpson index lays greater emphasis on the evenness component and on the dominant cover types (McGarigal and Marks, 1994, Haines-Young and Chopping, 1996, Riitters, Wickham, Vogelmann and Jones, 2000).”
Simpson gives more weight to dominant and common species.
mam_simpson = as.data.frame(diversity(mam1, index = "simpson")) %>%
mutate(Restored = mam$Restored)
head(mam_simpson)
## diversity(mam1, index = "simpson") Restored
## 1 0.000000 No
## 2 0.000000 No
## 3 0.500000 No
## 4 0.444222 No
## 5 0.444222 No
## 6 0.000000 No
mam_simpson = mam_simpson %>% rename("simpson" = `diversity(mam1, index = "simpson")`)
wilcox.test(simpson ~ Restored, data=mam_simpson)
##
## Wilcoxon rank sum test with continuity correction
##
## data: simpson by Restored
## W = 6834, p-value = 6.303e-07
## alternative hypothesis: true location shift is not equal to 0