
GETTING STARTED WITH R AND RSTUDIO

Coding Club Workshop 1 - R Basics:
Learning how to import and explore data, and make graphs about Edinburgh’s biodiversity
Written by Gergana Daskalova 06/11/2016 University of Edinburgh, last updated 28th March 2019
Transcribed by Carlos Caceres 01/04/2020 National University of Colombia
Taken from: https://ourcodingclub.github.io/tutorials/intro-to-r/


1. Begin to write the script

Load the dplyr package, which provides the filter() function used later, and set the working directory.

# If the dplyr package isn't installed, run: install.packages("dplyr")
# Note that the package name is in quotation marks when installing a package
library(dplyr) # Note that there are no quotation marks when loading a package
# Set the working directory
setwd("C:/Users/GEOMATICS/PR/Getting_Started") # Remember: paths in R use forward slashes ("C:/folder/data"), even on Windows
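
The install-then-load step can be wrapped so that a package is only downloaded when it is missing. This is a minimal sketch, not part of the original workshop: load_or_install is a hypothetical helper name, and the demonstration uses the tools package (which ships with R) so that nothing is downloaded.

```r
# Hypothetical helper (not from the workshop): install a package only if it
# is missing, then attach it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)               # the package name is passed as a string
  }
  library(pkg, character.only = TRUE)   # character.only lets library() accept a string
}

# Demonstrated with a package that ships with R, so nothing is downloaded here
load_or_install("tools")
```

For this workshop the equivalent call would be load_or_install("dplyr").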

2. Import and check data

Import the Edinburgh biodiversity data. You can find all the files needed in the GitHub repository: https://github.com/ourcodingclub/CC-RBasics

edidiv <- read.csv("C:/Users/GEOMATICS/PR/Getting_Started/CC-RBasics-master/edidiv.csv")
View(edidiv)
#Check that your data was imported without any mistakes
head(edidiv)    # Displays the first few rows
##                      organisationName gridReference year         taxonName
## 1 Joint Nature Conservation Committee      NT265775 2000    Sterna hirundo
## 2 Joint Nature Conservation Committee      NT235775 2000    Sterna hirundo
## 3 Joint Nature Conservation Committee      NT235775 2000 Sterna paradisaea
## 4       British Trust for Ornithology          NT27 2000 Branta canadensis
## 5       British Trust for Ornithology          NT27 2000  Branta leucopsis
## 6     The Wildlife Information Centre         NT27S 2001     Turdus merula
##   taxonGroup
## 1       Bird
## 2       Bird
## 3       Bird
## 4       Bird
## 5       Bird
## 6       Bird
tail(edidiv)    # Displays the last rows
##                            organisationName gridReference year
## 25679                    The Mammal Society      NT278745 2016
## 25680                    The Mammal Society      NT277724 2016
## 25681                    The Mammal Society      NT266728 2016
## 25682                    The Mammal Society      NT270728 2016
## 25683                    The Mammal Society      NT257762 2016
## 25684 People's Trust for Endangered Species        NT2372 2016
##                   taxonName taxonGroup
## 25679  Sciurus carolinensis     Mammal
## 25680   Capreolus capreolus     Mammal
## 25681  Sciurus carolinensis     Mammal
## 25682 Oryctolagus cuniculus     Mammal
## 25683         Vulpes vulpes     Mammal
## 25684   Erinaceus europaeus     Mammal
str(edidiv)     # Tells you whether the variables are continuous, integers, categorical or characters
## 'data.frame':    25684 obs. of  5 variables:
##  $ organisationName: Factor w/ 28 levels "BATS & The Millennium Link",..: 14 14 14 8 8 28 28 28 28 28 ...
##  $ gridReference   : Factor w/ 1938 levels "NT200701","NT200712",..: 1314 569 569 1412 1412 1671 1671 1671 1671 1671 ...
##  $ year            : int  2000 2000 2000 2000 2000 2001 2001 2001 2001 2001 ...
##  $ taxonName       : Factor w/ 1275 levels "Acarospora fuscata",..: 1126 1126 1127 192 193 1202 365 977 472 947 ...
##  $ taxonGroup      : Factor w/ 11 levels "Beetle","Bird",..: 2 2 2 2 2 2 2 2 2 2 ...
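# In R 4.0.0 and later, read.csv() imports text columns as character by default, so taxonGroup
# may show as a character variable. The output shown here is consistent with an older R version,
# where it is already a factor (categorical variable). Either way, as.factor() below makes sure it is a factor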
head(edidiv$taxonGroup)     # Displays the first few rows of taxonGroup column only
## [1] Bird Bird Bird Bird Bird Bird
## 11 Levels: Beetle Bird Butterfly Dragonfly Flowering.Plants ... Mollusc
class(edidiv$taxonGroup)    # Tells you what type of variable we're dealing with
## [1] "factor"
edidiv$taxonGroup <- as.factor(edidiv$taxonGroup) # This function turns whatever values you put inside into a factor

dim(edidiv)                 # Displays number of rows and columns
## [1] 25684     5
summary(edidiv)             # Gives you a summary of the data
##                                              organisationName gridReference  
##  Biological Records Centre                           :6744    NT2673 : 2741  
##  RSPB                                                :5809    NT2773 : 2031  
##  Butterfly Conservation                              :3000    NT2873 : 1247  
##  Scottish Wildlife Trust                             :2070    NT2570 : 1001  
##  Conchological Society of Great Britain &amp; Ireland:1998    NT27   :  888  
##  The Wildlife Information Centre                     :1860    NT2871 :  767  
##  (Other)                                             :4203    (Other):17009  
##       year                      taxonName                taxonGroup  
##  Min.   :2000   Maniola jurtina      : 1710   Butterfly       :9670  
##  1st Qu.:2006   Aphantopus hyperantus: 1468   Bird            :7366  
##  Median :2009   Turdus merula        : 1112   Flowering.Plants:2625  
##  Mean   :2009   Lycaena phlaeas      :  972   Mollusc         :2226  
##  3rd Qu.:2011   Aglais urticae       :  959   Hymenopteran    :1391  
##  Max.   :2016   Aglais io            :  720   Mammal          : 960  
##                 (Other)              :18743   (Other)         :1446
summary(edidiv$taxonGroup)  # Gives you a summary of that particular variable (column)
##           Beetle             Bird        Butterfly        Dragonfly 
##              426             7366             9670              421 
## Flowering.Plants           Fungus     Hymenopteran           Lichen 
##             2625              334             1391              140 
##        Liverwort           Mammal          Mollusc 
##              125              960             2226
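
The factor conversion used above can be tried on a few invented values. This is a small sketch with toy data, not the Edinburgh records; it also shows the stringsAsFactors argument of read.csv(), which controls whether text columns arrive as factors.

```r
# Invented values, only to illustrate character vs factor
groups <- c("Bird", "Bird", "Mammal", "Beetle")
class(groups)                 # "character"

groups_f <- as.factor(groups)
class(groups_f)               # "factor"
levels(groups_f)              # the distinct categories, sorted alphabetically
summary(groups_f)             # number of records per category

# Reading text columns directly as factors (toy CSV supplied as a string)
csv_text <- "taxonName,taxonGroup\nTurdus merula,Bird\nVulpes vulpes,Mammal"
toy <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(toy$taxonGroup)         # "factor"
```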

The edidiv object has occurrence records of various species collected in Edinburgh from 2000 to 2016. To explore Edinburgh’s biodiversity, we will create a graph showing how many species were recorded in each taxonomic group. We will filter the data for each taxonomic group and then count the unique species within it.


3. Calculate species richness

# Remember to install dplyr with install.packages("dplyr") and then load it using library(dplyr)
Beetle <- filter(edidiv, taxonGroup == "Beetle")
Bird <- filter(edidiv, taxonGroup == "Bird")
Butterfly <- filter(edidiv, taxonGroup == "Butterfly")
Dragonfly <- filter(edidiv, taxonGroup == "Dragonfly")
Flowering.Plants <- filter(edidiv, taxonGroup == "Flowering.Plants")
Fungus <- filter(edidiv, taxonGroup == "Fungus")
Hymenopteran <- filter(edidiv, taxonGroup == "Hymenopteran")
Lichen <- filter(edidiv, taxonGroup == "Lichen")
Liverwort <- filter(edidiv, taxonGroup == "Liverwort")
Mammal <- filter(edidiv, taxonGroup == "Mammal")
Mollusc <- filter(edidiv, taxonGroup == "Mollusc")

# Calculate the number of different species in each group
# Use unique(), which identifies the different species, and length(), which counts them
a <- length(unique(Beetle$taxonName))
b <- length(unique(Bird$taxonName))
c <- length(unique(Butterfly$taxonName))
d <- length(unique(Dragonfly$taxonName))
e <- length(unique(Flowering.Plants$taxonName))
f <- length(unique(Fungus$taxonName))
g <- length(unique(Hymenopteran$taxonName))
h <- length(unique(Lichen$taxonName))
i <- length(unique(Liverwort$taxonName))
j <- length(unique(Mammal$taxonName))
k <- length(unique(Mollusc$taxonName))
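
The same count of unique species per group can also be obtained in one step with base R's tapply(). This is a sketch of an alternative, not the workshop's method; it uses an invented toy data frame whose column names mirror edidiv.

```r
# Toy stand-in for edidiv (invented records, not the real data)
toy <- data.frame(
  taxonName  = c("Turdus merula", "Turdus merula", "Sterna hirundo",
                 "Vulpes vulpes", "Aglais io", "Aglais io"),
  taxonGroup = c("Bird", "Bird", "Bird", "Mammal", "Butterfly", "Butterfly")
)

# For each taxonGroup, count the distinct taxonName values
richness_by_group <- tapply(toy$taxonName, toy$taxonGroup,
                            function(x) length(unique(x)))
richness_by_group
```

The result is a named vector with one entry per group, so it could be passed straight to barplot(). Applied to the real data, the call would use edidiv$taxonName and edidiv$taxonGroup.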

4. Create a vector and plot it

biodiv <- c(a,b,c,d,e,f,g,h,i,j,k)    # Combine all those objects into one vector
names(biodiv) <- c("Beetle",          # Add labels
                   "Bird", 
                   "Butterfly", 
                   "Dragonfly", 
                   "Fl.Plants", 
                   "Fungus", 
                   "Hymenopteran", 
                   "Lichen", 
                   "Liverwort", 
                   "Mammal", 
                   "Mollusc")

barplot(biodiv,main="Species Richness") #Visualise species richness with the barplot() function

help(barplot)     # For help with the barplot() function
## starting httpd help server ... done
help(par)         # For help with plotting in general
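
A named vector can also be created in a single call, and sorting it before plotting orders the bars by height. This is a small sketch with invented counts, not the workshop's values.

```r
# Invented counts; names are given directly inside c()
toy_richness <- c(Bird = 3, Mammal = 1, Beetle = 2)

names(toy_richness)        # the labels barplot() will use
toy_richness["Bird"]       # elements can be looked up by name

# sort() keeps the names attached, so the bar labels stay correct
sorted <- sort(toy_richness, decreasing = TRUE)
barplot(sorted, main = "Species Richness (toy data)")
```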

# Save the plot
png("barplot.png", width=1600, height=600)  # Open a PNG file; width and height set the image size in pixels
barplot(biodiv, xlab="Taxa", ylab="Number of species", ylim=c(0,600), cex.names= 1.5, cex.axis=1.5, cex.lab=1.5)
# The cex arguments scale the font size: values greater than one enlarge the text, values less than one shrink it.
dev.off() # Close the graphics device so the file is written to disk
## png 
##   2
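
The open, plot, close pattern above can be tested without touching the working directory. This is a sketch with invented counts: it writes to R's temporary folder, the file name toy_barplot.png is hypothetical, and it assumes the R installation has PNG support.

```r
# Write to the session's temporary folder (hypothetical file name)
out_file <- file.path(tempdir(), "toy_barplot.png")
counts <- c(Bird = 3, Mammal = 1, Beetle = 2)   # invented counts

png(out_file, width = 800, height = 400, res = 100)  # res sets the resolution in pixels per inch
barplot(counts, xlab = "Taxa", ylab = "Number of species", cex.names = 1.2)
dev.off()                                            # nothing is saved until the device is closed

file.exists(out_file)   # confirms the image was written
```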

5. Create a data frame and plot it

Data frames are tables of values: they have a two-dimensional structure with rows and columns, where each column can have a different data type.
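
As a small illustration of columns holding different data types, the sketch below builds a toy data frame. All values are invented, and the surveyed column is a hypothetical addition that does not appear in the workshop data.

```r
# Each column has its own type: factor, integer and logical
toy_df <- data.frame(
  taxa     = factor(c("Bird", "Mammal")),
  richness = c(2L, 1L),
  surveyed = c(TRUE, FALSE)   # hypothetical column, for illustration only
)

str(toy_df)             # structure: one line per column with its type
dim(toy_df)             # number of rows and columns
sapply(toy_df, class)   # the class of each column
```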

taxa <- c("Beetle",
          "Bird",
          "Butterfly",
          "Dragonfly",
          "Flowering.Plants",
          "Fungus",
          "Hymenopteran",
          "Lichen",
          "Liverwort",
          "Mammal",
          "Mollusc")

taxa_f <- factor(taxa) # Turn the character vector into a factor
richness <- c(a,b,c,d,e,f,g,h,i,j,k) # Combine the species counts into a vector called richness
biodata <- data.frame(taxa_f, richness) # Create the data frame from the two vectors
write.csv(biodata, file="biodata.csv")  # Save the data frame as a CSV file in the working directory
View(biodata)
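
By default write.csv() also writes the row names as an extra first column. The sketch below uses invented values and a hypothetical file name in the temporary folder to show how row.names = FALSE leaves that column out, so the file reads back with the same columns it was saved with.

```r
# Toy data frame (invented values) with the same column names as biodata
toy <- data.frame(taxa_f = factor(c("Bird", "Mammal")), richness = c(2L, 1L))

out_csv <- file.path(tempdir(), "toy_biodata.csv")   # hypothetical file name
write.csv(toy, file = out_csv, row.names = FALSE)    # skip the row-name column

back <- read.csv(out_csv)   # read the file back in
names(back)                 # "taxa_f" "richness"
```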

png("barplot2.png", width=1600, height=600)
barplot(biodata$richness, names.arg=c("Beetle",
                                      "Bird",
                                      "Butterfly",
                                      "Dragonfly",
                                      "Flowering.Plants",
                                      "Fungus",
                                      "Hymenopteran",
                                      "Lichen",
                                      "Liverwort",
                                      "Mammal",
                                      "Mollusc"),
        xlab="Taxa", ylab="Number of species", ylim=c(0,600))
dev.off()
## png 
##   2

Check out our Data Visualisation tutorial: https://ourcodingclub.github.io/tutorials/datavis/index.html