Introduction to R and Diversity Indices

Introduction to R

Before we begin, make sure R is installed on your computer and that you have the data we collected at MOSS. The data must be stored locally, meaning in a folder on the computer you are using rather than hosted online.

R is a programming language for statistical computing. It is probably the most popular programming language right now (2019) in biology, wildlife science, and statistics.

It is an object-based programming language: we create data ‘objects’ and then apply functions to them, which produces new objects. Many people find this more intuitive than other classical languages, which makes R a great introductory programming language for students. Here is how we can create objects (code comes first and its output follows on lines beginning with ##; anything after a # is a comment, which R does not act on):

this_is_an_object<-1

this_is_an_object

## [1] 1

1. I made an object called ‘this_is_an_object’ and assigned it the value of 1. If I enter the name of the object, I get all the information contained in that object. We can make more complicated objects that contain multiple values, for example:

this_is_a_longer_object<-300:302

this_is_a_longer_object

## [1] 300 301 302
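The idea that functions turn objects into new objects can be sketched with a few base R functions. The object names below (values, total, average, doubled) are made up for this illustration.

```r
# A small numeric object holding the same values used above
values <- 300:302

# Applying a function returns a new value; the original object is unchanged
total <- sum(values)      # adds the three numbers together
average <- mean(values)   # arithmetic mean of the three numbers
doubled <- values * 2     # arithmetic is applied to every element

total
average
doubled

# length() reports how many elements an object contains
length(values)
```

Because the results are stored under new names, values itself still holds 300, 301 and 302 afterwards.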

2. Objects can also contain words, which R calls strings.

this_is_a_longer_object_with_words<-"Hello World!"

this_is_a_longer_object_with_words

## [1] "Hello World!"

this_is_a_longer_object_with_multiple_words<- c("Hello World!", "Goodbye World!", "Hello Satan!")

this_is_a_longer_object_with_multiple_words

## [1] "Hello World!"   "Goodbye World!" "Hello Satan!"

3. We can also refer to individual parts of the object, such as “Goodbye World!”, without having to call the rest of the data in the object.

this_is_a_longer_object_with_multiple_words[2]

## [1] "Goodbye World!"

this_is_a_longer_object_with_multiple_words[1]

## [1] "Hello World!"

this_is_a_longer_object_with_multiple_words[3]

## [1] "Hello Satan!"
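Square brackets can also pick out several elements at once. The short sketch below uses a shorter, made-up object name (phrases) for the same three strings and shows a few common indexing patterns in base R.

```r
# The same three strings as above, under a shorter name
phrases <- c("Hello World!", "Goodbye World!", "Hello Satan!")

# Several positions at once: pass a vector of positions
first_and_third <- phrases[c(1, 3)]

# A range of positions
last_two <- phrases[2:3]

# A negative position drops that element and keeps the rest
all_but_first <- phrases[-1]

# A logical test keeps only the elements where the test is TRUE
goodbyes <- phrases[phrases == "Goodbye World!"]

first_and_third
last_two
all_but_first
goodbyes
```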

4. Data objects usually contain more than a single list of phrases or numbers; they hold both, organized in rows and columns. Let’s make one from the objects we have already created, but rename them first so the names are not so long.

numbers<-this_is_a_longer_object

phrases<-this_is_a_longer_object_with_multiple_words

test_data<-data.frame(numbers,phrases)

test_data

##   numbers        phrases
## 1     300   Hello World!
## 2     301 Goodbye World!
## 3     302   Hello Satan!

5. We can call individual columns of the dataframe by name with the $ sign or by column number inside square brackets. Rows are called inside square brackets too, by number or by name (if they have one). We can also add new columns to the dataframe:

#Columns
test_data$phrases

## [1] Hello World!   Goodbye World! Hello Satan!  
## Levels: Goodbye World! Hello Satan! Hello World!

test_data[,2]

## [1] Hello World!   Goodbye World! Hello Satan!  
## Levels: Goodbye World! Hello Satan! Hello World!

#Rows
test_data[3,]

##   numbers      phrases
## 3     302 Hello Satan!

#Calling a specific cell
test_data[3,1]

## [1] 302

#Adding new column

test_data$New<-numbers
test_data$New

## [1] 300 301 302
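Here is a short sketch of the other ways to call rows and columns mentioned above. The row names ("first", "second", "third") are invented for the example. Note that the "Levels:" lines in the output above appear because versions of R before 4.0.0 converted strings to factors by default; from R 4.0.0 onward data.frame() keeps them as plain strings, and setting stringsAsFactors = FALSE gives that behaviour in either version.

```r
numbers <- 300:302
phrases <- c("Hello World!", "Goodbye World!", "Hello Satan!")

# stringsAsFactors = FALSE keeps the phrases as plain text in any R version
test_data <- data.frame(numbers, phrases, stringsAsFactors = FALSE)

# Columns by name inside square brackets
test_data[, "phrases"]
test_data[["numbers"]]

# Give the rows names (hypothetical names), then call a row by name
rownames(test_data) <- c("first", "second", "third")
test_data["third", ]

# A single cell by row name and column name
cell <- test_data["third", "numbers"]
cell

# Rows that meet a condition
big_rows <- test_data[test_data$numbers > 300, ]
big_rows

# str() summarizes the type of each column
str(test_data)
```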

6. We now know how data is organized in dataframes in R. Now, let’s load some real, more complicated ecological data. First, we need to load some additional code (a package) into our environment (the environment is everything available in your current R session). Enter the following at your command line; if vegan is not yet installed, remove the # in front of install.packages first:

#install.packages("vegan")
library(vegan)

## Loading required package: permute

## Loading required package: lattice

## This is vegan 2.5-6

data(BCI)

head(BCI[,1:5])

##   Abarema.macradenia Vachellia.melanoceras Acalypha.diversifolia
## 1                  0                     0                     0
## 2                  0                     0                     0
## 3                  0                     0                     0
## 4                  0                     0                     0
## 5                  0                     0                     0
## 6                  0                     0                     0
##   Acalypha.macrostachya Adelia.triloba
## 1                     0              0
## 2                     0              0
## 3                     0              0
## 4                     0              3
## 5                     0              1
## 6                     0              0

# This data is derived from a previous study. 
# We are only looking at the first five columns of the data and 
# only the first six rows. What are the rows? What are the columns?

nrow(BCI)

## [1] 50

ncol(BCI)

## [1] 225

7. We can use functions in the vegan package to calculate diversity indices such as Shannon’s, Simpson’s, or inverse Simpson. Let’s look at the results for the first 6 transects and then summarize across the whole dataset.

# vegan must already be loaded with library(vegan)
# Calculate one diversity value per row of BCI
Shannon_results<-diversity(BCI,"shannon")
Simpson_results<-diversity(BCI,"simpson")
Inv_results<-diversity(BCI,"invsimpson")

head(Shannon_results)

##        1        2        3        4        5        6 
## 4.018412 3.848471 3.814060 3.976563 3.969940 3.776575

summary(Shannon_results)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   2.642   3.744   3.850   3.821   3.969   4.077

# What does this tell you about the data?
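To see what these indices measure, here is a minimal sketch that calculates them by hand for a single site. The counts are made-up numbers (four species with ten individuals each), not data from BCI. The formulas are the standard ones: Shannon is -sum(p * log(p)), Simpson is 1 - sum(p^2), and inverse Simpson is 1 / sum(p^2), where p is each species' share of all individuals.

```r
# Hypothetical counts of four species at one site (made-up numbers)
counts <- c(10, 10, 10, 10)

# Drop species with zero individuals so log(0) never appears
counts <- counts[counts > 0]

# Proportion of all individuals belonging to each species
p <- counts / sum(counts)

shannon <- -sum(p * log(p))    # natural log
simpson <- 1 - sum(p^2)
inv_simpson <- 1 / sum(p^2)

shannon
simpson
inv_simpson

# Optional check against vegan, only if the package is installed
if (requireNamespace("vegan", quietly = TRUE)) {
  print(all.equal(shannon, vegan::diversity(counts, "shannon")))
}
```

With four equally common species, Shannon equals log(4) and inverse Simpson equals 4, the number of species. Try making one species much more common and see how each index changes.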

8. Let’s plot some of the data:

# Smoothed density estimate of the Shannon values
d <- density(Shannon_results)
plot(d)

# Histogram of the same values
hist(Shannon_results)

# What does this tell you about the transects?
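Plots are easier to read with a title and axis labels. The sketch below uses only the six Shannon values printed by head() earlier, typed in by hand so it runs without vegan; the object name and the label text are made up for the example.

```r
# The six Shannon values shown by head(Shannon_results) above
shannon_first_six <- c(4.018412, 3.848471, 3.814060,
                       3.976563, 3.969940, 3.776575)

# main sets the title, xlab labels the x axis, col fills the bars
hist(shannon_first_six,
     main = "Shannon diversity, first six rows of BCI",
     xlab = "Shannon index",
     col = "grey")

# Add a dashed vertical line at the mean of the plotted values
mean_shannon <- mean(shannon_first_six)
abline(v = mean_shannon, lty = 2)
```

The same arguments work with hist(Shannon_results) once the full set of results is available.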

9. Now, let’s try to do the same thing with your transect-level data that we did with the example data. This may require us to reorganize our data first.
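As a rough sketch of what that reorganization might look like: diversity() expects one row per transect and one column per species, as in BCI. Field data are often recorded the other way, with one row per observation. The data frame below is entirely hypothetical (the column names transect, species and count, and all the values, are invented), so your own data will need different names.

```r
# Hypothetical field data in "long" format: one row per transect-species record
field_data <- data.frame(
  transect = c("T1", "T1", "T2", "T2", "T2"),
  species  = c("pine", "fir", "pine", "fir", "cedar"),
  count    = c(5, 3, 2, 2, 4)
)

# xtabs() sums count for every transect-species combination;
# combinations that were never recorded become 0
site_by_species <- xtabs(count ~ transect + species, data = field_data)

# Convert the table into an ordinary dataframe shaped like BCI
site_by_species <- as.data.frame.matrix(site_by_species)
site_by_species

# If vegan is installed, the reshaped data can go straight into diversity()
if (requireNamespace("vegan", quietly = TRUE)) {
  print(vegan::diversity(site_by_species, "shannon"))
}
```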
