Introduction

This tutorial covers a basic skill: reconfiguring dataframes so that you can use them with R functions that require a specific dataframe format. One of the most common reconfigurations you will come across in ecology is changing a dataframe from “wide” format to “long” format. This is most often needed when your dataset is in matrix format.

We need tidyr - and dplyr for good measure!

To reconfigure our dataframes we will be using packages that are part of tidyverse. You should start by installing tidyverse using install.packages("tidyverse") and then loading the tidyr and dplyr packages as seen in the code below.

# Load packages
library("tidyr")
library("dplyr")

Be sure to set your working directory! HINT -> setwd()
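
As a minimal sketch of the hint above: the folder used here is only a placeholder (tempdir()) so that the code runs anywhere, and the path in the comment is hypothetical. Swap in the folder that actually holds your data.

```r
# Placeholder folder: tempdir() is used only so this sketch runs on any machine.
# Replace it with your own data folder, e.g. "C:/Users/you/Documents/nutnet" (hypothetical path).
project_dir <- tempdir()

old_wd <- setwd(project_dir)  # setwd() returns the previous working directory invisibly
getwd()                       # confirm where R will now look for files
```

Keeping the old directory in old_wd lets you switch back later with setwd(old_wd).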

Bring in your “wide” form data

You can think of the following data as a dummy dataset!

For the purposes of this example, we will take community data in “wide” (or matrix) format and convert it to “long” format. To start, your data should consist of a plot column and one column per species (if you have multiple years or replicates in your data, they should be included as columns as well). In this example, each row represents a plot, with the abundance of each species (% cover) in its own column. The example dataset is called nutnet.spp.mat. Below is an example of how the data should look while it's in wide form:

Ex. Wide dataframe
Plot  Ammophila_breviligulata  Andropogon_virginicus  Chamaesyce_maculata  Conyza_canadensis
1     24                       0                      0                    1
2     10                       3                      0                    10
3     17                       8                      0                    3
4     27                       11                     0                    10
5     20                       5                      1                    5
6     15                       0                      0                    1
7     15                       15                     0                    2
8     25                       16                     0                    1
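
If you do not have a file to hand, the example table can be typed in directly. This is only a sketch for following along: the values are copied from the table above, and the object is named nutnet.spp.mat to match the conversion code further down.

```r
# Rebuild the example wide table by hand (values copied from the table above)
nutnet.spp.mat <- data.frame(
  Plot = 1:8,
  Ammophila_breviligulata = c(24, 10, 17, 27, 20, 15, 15, 25),
  Andropogon_virginicus   = c(0, 3, 8, 11, 5, 0, 15, 16),
  Chamaesyce_maculata     = c(0, 0, 0, 0, 1, 0, 0, 0),
  Conyza_canadensis       = c(1, 10, 3, 10, 5, 1, 2, 1)
)

str(nutnet.spp.mat)  # one row per plot, one column per species
```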

Converting to “long” form data

First, create a name for your new dataframe. Here, we name our new dataframe nutnet.spp.long. Below we will use the tidyr function called gather(). We will use nutnet.spp.mat as the input dataframe and will identify our “key” (Species) and “value” (Cover) columns. It is important to indicate that you do NOT want to include your “Plot” variable in the dataframe conversion. We do this by finishing the code off with -c(); in this last piece we tell R which columns should be left out of the conversion. Once you have supplied all this information, press Ctrl+Enter and let tidyr do the rest!

#Converting matrix data to long data
nutnet.spp.long <-    # Name of our new df
  gather(nutnet.spp.mat, # Indicate function and old data
         Species,        # This is your "Key"
         Cover,          # This is your "Value"
         -c(Plot))       # This tells R not to include your "Plot" column as a "Species"  

This is what the resulting dataframe should look like:

Ex. Long dataframe
Plot  Species                  Cover
1     Ammophila_breviligulata  24
2     Ammophila_breviligulata  10
3     Ammophila_breviligulata  17
4     Ammophila_breviligulata  27
5     Ammophila_breviligulata  20
6     Ammophila_breviligulata  15
7     Ammophila_breviligulata  15
8     Ammophila_breviligulata  25

You should find that your new dataframe now has Plot, Species, and Cover as variables, with the plot numbers repeating for each species in your dataset!
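
If your data also carry a year or replicate column, that column needs to be kept out of the conversion too. The sketch below shows one way this might look; the dataframe wide.by.year, its Year column, and all of its values are made up purely for illustration.

```r
library("tidyr")

# Hypothetical wide data with an extra Year column (values invented for illustration)
wide.by.year <- data.frame(
  Plot = c(1, 2, 1, 2),
  Year = c(2016, 2016, 2017, 2017),
  Ammophila_breviligulata = c(24, 10, 20, 12),
  Conyza_canadensis       = c(1, 10, 0, 4)
)

long.by.year <- gather(wide.by.year,
                       Species,         # This is your "Key"
                       Cover,           # This is your "Value"
                       -c(Plot, Year))  # Keep both ID columns out of the conversion

nrow(long.by.year)   # 4 Plot x Year rows times 2 species
names(long.by.year)  # Plot, Year, Species, Cover
```

A quick check like nrow() is a handy habit: the long dataframe should have as many rows as the wide dataframe had rows multiplied by the number of species columns.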

Now try with your data!

# This is our imported df
nutnet.spp.mat <- read.csv("nutnet_spp_matrix_2017.csv")
# Converting wide (matrix) data to long data
nutnet.spp.long <-       # Name of our new df
  gather(nutnet.spp.mat, # Indicate function and old data
         Species,        # This is your "Key"
         Cover,          # This is your "Value"
         -c(Plot))       # This tells R not to include your "Plot" column as a "Species"
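# Save the long dataframe; row.names = FALSE keeps R from writing an extra column of row numbers
write.csv(nutnet.spp.long, "nutnet_spp_long.csv", row.names = FALSE)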
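
A side note on function choice: gather() still works, but recent versions of tidyr document it as superseded by pivot_longer(). The sketch below shows what the same conversion might look like with pivot_longer(), assuming tidyr 1.0.0 or later is installed. The small dataframe wide.example is a stand-in built from the first two plots and two species of the example table.

```r
library("tidyr")
library("dplyr")

# Small stand-in for the wide data (first two plots, two species from the example table)
wide.example <- data.frame(
  Plot = c(1, 2),
  Ammophila_breviligulata = c(24, 10),
  Conyza_canadensis       = c(1, 10)
)

long.example <- wide.example %>%
  pivot_longer(cols = -Plot,              # every column except Plot holds cover values
               names_to = "Species",      # same role as the gather() "Key"
               values_to = "Cover") %>%   # same role as the gather() "Value"
  arrange(Species, Plot)                  # re-sort so rows are grouped by species, as with gather()

long.example
```

Without the arrange() step, pivot_longer() returns the rows plot by plot rather than species by species; the contents are the same either way.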