Introduction

This tutorial covers a basic skill: reconfiguring dataframes so you can use them with R functions that require data in a specific layout. One of the most common reconfigurations you will come across in ecology is changing a dataframe from “long” format to “wide” format, usually because an analysis expects the data as a matrix.

We need tidyr!

To reconfigure our dataframes we will be using packages that are part of the tidyverse. Start by installing the tidyverse with install.packages("tidyverse"), then load the tidyr package as shown in the code below.

# Load packages
library("tidyr")

Be sure to set your working directory! HINT -> setwd()

Bring in “long” form data

For the purposes of this example, we will take community data in long format and convert it to matrix-formatted community data. To start, the data should consist of plot, species, and cover (abundance) columns. Our data is called nutnet.spp.long. Below is an example of how the data should look in long form:

Ex. Long dataframe
Plot Species Cover
1 Spartina_patens 28
1 Ammophila_breviligulata 24
1 Setaria_parviflora 10
1 Cyperus_esculentes 20
1 Panicum_amarum 7
1 Conyza_canadensis 1
1 Solidago_sempervirens 3
1 Gnaphalium_purpureum 1
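
If you do not have a long dataframe handy, here is a minimal sketch of one built by hand. The name toy.long is made up for illustration, and the six rows reuse a few cover values from the example tables in this tutorial rather than a full dataset. The last line checks for duplicated plot-species rows, because spread() needs each plot-species combination to appear only once.

```r
# Toy long-format data: one row per plot-by-species observation.
# toy.long is a hypothetical name used only for this sketch.
toy.long <- data.frame(
  Plot    = c(1, 1, 1, 2, 2, 2),
  Species = c("Spartina_patens", "Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus", "Conyza_canadensis"),
  Cover   = c(28, 24, 1, 10, 3, 10)
)

# Inspect the column names and types
str(toy.long)

# 0 means no plot-species combination is repeated
n.dup <- anyDuplicated(toy.long[, c("Plot", "Species")])
n.dup
```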

Converting to Matrix (“wide” form data)

First, decide on a name for your new dataframe. Here, we name it nutnet.spp.mat because we are making a matrix. Below we use the tidyr function spread(). We give it nutnet.spp.long as the input dataframe and identify our “key” column (Species, whose values become the new column names) and our “value” column (Cover, whose values fill the cells). Let tidyr do the rest!

#Converting long data to wide data - MATRIX STYLE
nutnet.spp.mat <-           # Name of new df
  spread(nutnet.spp.long,   # Converts input df from long to wide
         Species,           # This is our Key column
         Cover)             # This is our Value column

This is what the resulting dataframe should look like:

Ex. Wide dataframe
Plot Ammophila_breviligulata Andropogon_virginicus Chamaesyce_maculata Conyza_canadensis
1 24 NA NA 1
2 10 3 NA 10
3 17 8 NA 3
4 27 11 NA 10
5 20 5 1 5
6 15 NA NA 1
7 15 15 NA 2
8 25 16 NA 1
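
A side note for newer versions of tidyr: since tidyr 1.0.0, spread() is superseded by pivot_wider(). spread() still works, but here is a rough sketch of the same long-to-wide step with pivot_wider(). The toy.long data is a made-up six-row example reusing values from the tables in this tutorial, not the real nutnet.spp.long.

```r
library(tidyr)  # pivot_wider() needs tidyr 1.0.0 or later

# Hypothetical toy data, for illustration only
toy.long <- data.frame(
  Plot    = c(1, 1, 1, 2, 2, 2),
  Species = c("Spartina_patens", "Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus", "Conyza_canadensis"),
  Cover   = c(28, 24, 1, 10, 3, 10)
)

toy.wide <- pivot_wider(toy.long,
                        names_from  = Species,  # Plays the role of the Key column
                        values_from = Cover)    # Plays the role of the Value column

# Plot-species combinations missing from the long data show up as NA
toy.wide
```

Note that pivot_wider() returns a tibble rather than a plain data.frame, so it prints a little differently.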

You will probably notice that if you did not include 0’s in your long form data, R has filled in every missing plot-species combination with NA. This may or may not be something that personally bothers you…BUT it could affect subsequent analyses that use this new matrix, since many functions either stop with an error or drop values when they meet an NA. So, our next step will be to tell R that we want to fill in all the NA’s with 0’s.

It is important to note that you can tell R to replace NA’s with any value you’d like, but here we want 0’s!
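
One shortcut worth knowing: spread() has a fill argument that sets the value used for missing combinations at the moment the data is widened. The sketch below uses a made-up toy.long dataframe (values borrowed from the tables in this tutorial) and fill = 0. Any other value would work the same way.

```r
library(tidyr)

# Hypothetical toy data, for illustration only
toy.long <- data.frame(
  Plot    = c(1, 1, 1, 2, 2, 2),
  Species = c("Spartina_patens", "Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus", "Conyza_canadensis"),
  Cover   = c(28, 24, 1, 10, 3, 10)
)

toy.wide0 <- spread(toy.long,
                    Species,    # Key column
                    Cover,      # Value column
                    fill = 0)   # Value used where a plot-species pair is missing

toy.wide0
```

This only fills cells that were absent from the long data. The replace() approach in the next section is still useful when you already have a wide dataframe containing NA’s.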

Replacing NA’s with 0’s using tidyr

For this step in the data organization process I will be using piping. The pipe is an operator (%>%) in the tidyverse that aims to make code writing easier. All piping does is take the output of one statement and make it the input of the very next statement. Here, I am going to use piping to take our input dataframe (nutnet.spp.mat), replace NA’s with 0’s using the base R function replace(), and then pass the result to a function that writes our new matrix to a .csv in our working directory.
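
If the pipe is new to you, this small sketch on a plain numeric vector may help. The vectors are made up for illustration. It shows that x %>% f() is the same as f(x), and that a dot (.) inside the call stands for whatever was piped in.

```r
library(tidyr)  # tidyr re-exports the %>% pipe from magrittr

x <- c(4, 9, 16)

# Piped and nested versions do the same thing
piped  <- x %>% sqrt() %>% sum()
nested <- sum(sqrt(x))
identical(piped, nested)

# The piped value becomes the first argument of replace(),
# and the dot inside is.na(.) refers to that same value
with.na <- c(1, NA, 3)
filled  <- with.na %>% replace(is.na(.), 0)
filled
```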

In the full code chunk at the end of the tutorial I will use piping through the whole process of long -> wide w/ NA’s -> wide w/ 0’s -> writing .csv.

# Replacing NAs with 0s
nutnet.spp.mat %>% 
  replace(is.na(.), 0) %>%   # Replaces NA with numeric 0 (a quoted "0" would turn columns into text)
  write.csv("nutnet_spp_matrix_2017.csv") # Writes .csv file to directory

This is what the resulting dataframe should look like:

Ex. Wide dataframe w/ 0’s
Plot Ammophila_breviligulata Andropogon_virginicus Chamaesyce_maculata Conyza_canadensis
1 24 0 0 1
2 10 3 0 10
3 17 8 0 3
4 27 11 0 10
5 20 5 1 5
6 15 0 0 1
7 15 15 0 2
8 25 16 0 1
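
The wide dataframe still carries Plot as an ordinary column. Some analysis functions expect a true numeric matrix with plots as row names instead. Here is a minimal sketch of that last conversion, again on made-up toy data rather than the real nutnet files. Whether you need this step depends on the function you plan to use.

```r
library(tidyr)

# Hypothetical toy data, for illustration only
toy.long <- data.frame(
  Plot    = c(1, 1, 1, 2, 2, 2),
  Species = c("Spartina_patens", "Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus", "Conyza_canadensis"),
  Cover   = c(28, 24, 1, 10, 3, 10)
)
toy.wide0 <- spread(toy.long, Species, Cover, fill = 0)

# Keep only the species columns and turn them into a matrix
species.cols <- setdiff(names(toy.wide0), "Plot")
toy.mat <- as.matrix(toy.wide0[, species.cols])

# Use the plot IDs as row names
rownames(toy.mat) <- toy.wide0$Plot

toy.mat
is.numeric(toy.mat)  # Should be TRUE if every cover value is a number
```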

Now try it with your data!

# Load packages
library("tidyr")
# This is our imported df
nutnet.spp.long <- read.csv("nutnet_2017_spp_dat.csv")
# Converting long data to wide data - MATRIX STYLE
nutnet.spp.mat <-             # Name of new df
  spread(nutnet.spp.long,     # Converts input df from long to wide
         Species,             # This is our Key column
         Cover) %>%           # This is our Value column
  replace(is.na(.), 0)        # Replaces NA with numeric 0
# Writes .csv file to directory (kept out of the assignment above,
# because write.csv() returns NULL)
nutnet.spp.mat %>%
  write.csv("nutnet_spp_matrix_2017.csv")
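
Once the file is written, it is worth reading it back in to confirm it looks the way you expect. The sketch below does this on made-up toy data and writes to a temporary file so nothing in your working directory is touched. The row.names = FALSE argument is my own choice here, to keep R from adding an extra column of row numbers to the file.

```r
library(tidyr)

# Hypothetical toy data, for illustration only
toy.long <- data.frame(
  Plot    = c(1, 1, 1, 2, 2, 2),
  Species = c("Spartina_patens", "Ammophila_breviligulata", "Conyza_canadensis",
              "Ammophila_breviligulata", "Andropogon_virginicus", "Conyza_canadensis"),
  Cover   = c(28, 24, 1, 10, 3, 10)
)
toy.wide0 <- spread(toy.long, Species, Cover, fill = 0)

# Write to a throwaway temporary file instead of the working directory
out.file <- tempfile(fileext = ".csv")
write.csv(toy.wide0, out.file, row.names = FALSE)

# Read the file back and check it
check <- read.csv(out.file)
anyNA(check)                    # FALSE means no NA's made it into the file
all(sapply(check, is.numeric))  # TRUE means every column came back as a number
```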