This tutorial covers the basic skill of reconfiguring dataframes so you can use them when R functions require a specific dataframe layout. One of the most common reconfiguration needs you will come across in ecology is changing a dataframe from “long” format to “wide” format. This usually comes up when an analysis needs the data as a matrix.
To reconfigure our dataframes we will be using packages that are part of the tidyverse. Start by installing the tidyverse with install.packages("tidyverse"), then load the tidyr package as shown in the code below.
# Load packages
library("tidyr")

Be sure to set your working directory! HINT -> setwd()
For the purposes of this example, we will take community data in long format and convert it to matrix-formatted community data. To start, the data should consist of plot, species, and cover (abundance) columns. Our data is called nutnet.spp.long. Below is an example of how the data should look in long form:

| Plot | Species | Cover |
|---|---|---|
| 1 | Spartina_patens | 28 |
| 1 | Ammophila_breviligulata | 24 |
| 1 | Setaria_parviflora | 10 |
| 1 | Cyperus_esculentes | 20 |
| 1 | Panicum_amarum | 7 |
| 1 | Conyza_canadensis | 1 |
| 1 | Solidago_sempervirens | 3 |
| 1 | Gnaphalium_purpureum | 1 |
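
If you want to follow along without the original file, the Plot 1 rows above can be typed in by hand. This is only a sketch: the name example.long is a stand-in I made up for nutnet.spp.long, and the duplicate check is there because spread() stops with an error when the same plot and species combination shows up more than once.

```r
# Stand-in for nutnet.spp.long: the Plot 1 rows from the table above
example.long <- data.frame(
  Plot = rep(1, 8),
  Species = c("Spartina_patens", "Ammophila_breviligulata",
              "Setaria_parviflora", "Cyperus_esculentes",
              "Panicum_amarum", "Conyza_canadensis",
              "Solidago_sempervirens", "Gnaphalium_purpureum"),
  Cover = c(28, 24, 10, 20, 7, 1, 3, 1)
)

# Long format: one row per plot-species combination, three columns
str(example.long)

# Returns 0 when no plot-species combination is repeated
n.duplicates <- anyDuplicated(example.long[, c("Plot", "Species")])
n.duplicates
```
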
First determine the name of your new dataframe. Here, we name our new dataframe nutnet.spp.mat because we are making a matrix. Below we will use the tidyr function called spread. We will use nutnet.spp.long as the input dataframe and will identify our “key” (Species) and “value” (Cover) columns – let tidyr do the rest!
# Converting long data to wide data - MATRIX STYLE
nutnet.spp.mat <-          # Name of new df
  spread(nutnet.spp.long,  # Converts input df from long to wide
         Species,          # This is our Key column
         Cover)            # This is our Value column

| Plot | Ammophila_breviligulata | Andropogon_virginicus | Chamaesyce_maculata | Conyza_canadensis |
|---|---|---|---|---|
| 1 | 24 | NA | NA | 1 |
| 2 | 10 | 3 | NA | 10 |
| 3 | 17 | 8 | NA | 3 |
| 4 | 27 | 11 | NA | 10 |
| 5 | 20 | 5 | 1 | 5 |
| 6 | 15 | NA | NA | 1 |
| 7 | 15 | 15 | NA | 2 |
| 8 | 25 | 16 | NA | 1 |
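
A side note for newer tidyr installs: since tidyr 1.0.0, spread() is marked as superseded by pivot_wider(). It still works, but the same reshaping can be sketched with the newer function. The small example.long dataframe below is a hypothetical two-plot subset built from values in the tables above, not the real nutnet.spp.long.

```r
library(tidyr)

# Hypothetical two-plot subset, values taken from the tables above
example.long <- data.frame(
  Plot = c(1, 1, 2, 2, 2),
  Species = c("Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus",
              "Conyza_canadensis"),
  Cover = c(24, 1, 10, 3, 10)
)

# Same reshaping as spread(example.long, Species, Cover)
example.wide <- pivot_wider(example.long,
                            names_from = Species,  # the "key" column
                            values_from = Cover)   # the "value" column

# Plot 1 has no Andropogon_virginicus row, so that cell comes back as NA
example.wide
```

One difference to be aware of: pivot_wider() keeps species columns in the order they first appear in the data, while spread() sorts them, so the column order may not match between the two.
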
You will probably notice that if you did not include 0’s in your long form data then R has filled in all of those columns with NA’s. This may or may not be something that personally bothers you…BUT it could affect subsequent analysis that you may want to use this new matrix for. So, our next step will be to tell R that we want to fill in all the NA’s with 0’s – using tidyr!
It is important to note that you can coerce R to replace NA’s with anything you’d like, but here we want 0’s!
For this step in the data organization process I will be using piping, which relies on an operator (%>%) in the tidyverse that aims to make code writing easier. All piping does is take the output of one statement and make it the input of the very next statement. Here, I am going to use piping to take our input dataframe (nutnet.spp.mat), replace NA’s with 0’s, and then direct it to a function that writes our new matrix to a .csv in our working directory.
In the full code chunk at the end of the tutorial I will use piping through the whole process of long -> wide w/ NA’s -> wide w/ 0’s -> writing .csv.
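
If piping is new to you, a tiny sketch may help before we use it on the real data. The cover vectors here are made-up values for illustration only. The first part shows that a piped call and a nested call give the same answer, and the second part shows what the dot (.) stands for inside a piped call.

```r
library(tidyr)  # tidyr re-exports the %>% pipe from magrittr

cover <- c(24, 10, 17)  # hypothetical cover values

# Nested call: read from the inside out
nested <- round(mean(cover), 1)

# Piped call: read from top to bottom
piped <- cover %>%
  mean() %>%
  round(1)

identical(nested, piped)

# The dot stands for whatever was piped in, so this is the same as
# replace(cover.na, is.na(cover.na), 0)
cover.na <- c(24, NA, 17)
filled <- cover.na %>%
  replace(is.na(.), 0)
filled
```
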
# Replacing NAs with 0s
nutnet.spp.mat %>%
  replace(is.na(.), 0) %>%                 # Specifies the number 0 as replacement of NA
  write.csv("nutnet_spp_matrix_2017.csv")  # Writes .csv file to directory

| Plot | Ammophila_breviligulata | Andropogon_virginicus | Chamaesyce_maculata | Conyza_canadensis |
|---|---|---|---|---|
| 1 | 24 | 0 | 0 | 1 |
| 2 | 10 | 3 | 0 | 10 |
| 3 | 17 | 8 | 0 | 3 |
| 4 | 27 | 11 | 0 | 10 |
| 5 | 20 | 5 | 1 | 5 |
| 6 | 15 | 0 | 0 | 1 |
| 7 | 15 | 15 | 0 | 2 |
| 8 | 25 | 16 | 0 | 1 |
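
One thing worth checking at this point is that the cover columns are still numbers. The sketch below uses a hypothetical two-plot dataframe (example.long, built from values in the tables above) to compare three options: replacing NA with the number 0, letting spread() fill the gaps itself through its fill argument, and replacing NA with the text "0", which turns every affected column into character data.

```r
library(tidyr)

# Hypothetical two-plot subset, values taken from the tables above
example.long <- data.frame(
  Plot = c(1, 1, 2, 2, 2),
  Species = c("Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus",
              "Conyza_canadensis"),
  Cover = c(24, 1, 10, 3, 10)
)

# Option 1: reshape, then replace NA with the number 0
wide.replaced <- spread(example.long, Species, Cover) %>%
  replace(is.na(.), 0)

# Option 2: let spread() fill the gaps while it reshapes
wide.filled <- spread(example.long, Species, Cover, fill = 0)

# Option 3 (usually not what you want): the text "0" instead of the number
wide.text <- spread(example.long, Species, Cover) %>%
  replace(is.na(.), "0")

# Which columns are numeric in each version?
sapply(wide.replaced, is.numeric)
sapply(wide.filled, is.numeric)
sapply(wide.text, is.numeric)
```

If the matrix is headed for any kind of arithmetic, the numeric versions are the safer choice.
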
Here is the full code chunk for the whole process:
# Load packages
library("tidyr")
# This is our imported df
nutnet.spp.long <- read.csv("nutnet_2017_spp_dat.csv")
# Converting long data to wide data - MATRIX STYLE
nutnet.spp.mat <-          # Name of new df
  spread(nutnet.spp.long,  # Converts input df from long to wide
         Species,          # This is our Key column
         Cover) %>%        # This is our Value column
  replace(is.na(.), 0)     # Specifies the number 0 as replacement of NA
# write.csv() returns NULL, so it is piped separately to keep nutnet.spp.mat intact
nutnet.spp.mat %>%
  write.csv("nutnet_spp_matrix_2017.csv")  # Writes .csv file to directory
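
If the analysis you are heading toward wants a true R matrix rather than a dataframe, one possible last step is sketched below. I am assuming here that the function expects a numeric matrix with plots as row names; check its documentation, since that is not true of every function. The example.wide dataframe is a hypothetical stand-in for nutnet.spp.mat.

```r
# Hypothetical wide dataframe, standing in for nutnet.spp.mat
example.wide <- data.frame(
  Plot = c(1, 2),
  Ammophila_breviligulata = c(24, 10),
  Andropogon_virginicus = c(0, 3),
  Conyza_canadensis = c(1, 10)
)

# Drop the Plot column and keep the plot IDs as row names instead
example.matrix <- as.matrix(example.wide[, -1])
rownames(example.matrix) <- example.wide$Plot

# Check that we ended up with a numeric plots-by-species matrix
is.numeric(example.matrix)
dim(example.matrix)
```
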