tell us
separate() is part of the tidyr package, which is included in the tidyverse. separate() splits the contents of one column into several new columns. It is useful when two or more variables are clumped together within a single column.
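Before turning to real data, here is a minimal sketch of the idea on a made-up table. The tibble `toy` and its columns `id` and `site_year` are hypothetical names invented for illustration; only `separate()` and its `into` and `sep` arguments come from tidyr.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: one column holds two variables joined by "_"
toy <- tibble(
  id = 1:3,
  site_year = c("Torgersen_2007", "Biscoe_2008", "Dream_2009")
)

# Split site_year into two new columns at the underscore
toy_split <- separate(toy, site_year, into = c("site", "year"), sep = "_")

print(toy_split)
```

Note that both new columns are character columns by default, so `year` holds text rather than numbers unless you convert it.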
show us
The first step is to install and load your packages. Here we load the tidyverse package, and the palmerpenguins package for our data.
install and load packages
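The library() calls in this section assume both packages are already installed. If they are not, a one-time install might look like the sketch below. The variable names `needed_pkgs` and `missing_pkgs` and the choice of CRAN mirror are assumptions made for illustration.

```r
# Packages this tutorial relies on
needed_pkgs <- c("tidyverse", "palmerpenguins")

# Find which of them are not yet installed on this machine
missing_pkgs <- setdiff(needed_pkgs, rownames(installed.packages()))

# Install only the missing ones (assumed mirror: the RStudio-sponsored CRAN cloud)
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```

Installing is only needed once per machine, whereas library() has to be called in every new R session.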
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.6
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(tidyr)
library(palmerpenguins)
get some data
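Here we get the data from the palmerpenguins package. The datasets are available as soon as the package is loaded; calling data(package = 'palmerpenguins') lists the datasets it provides.
We then use the print function to preview the first ten rows of penguins_raw.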
data(package = 'palmerpenguins')
print(penguins_raw)
## # A tibble: 344 x 17
## studyName `Sample Number` Species Region Island Stage `Individual ID`
## <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
## 1 PAL0708 1 Adelie Pengu~ Anvers Torge~ Adult,~ N1A1
## 2 PAL0708 2 Adelie Pengu~ Anvers Torge~ Adult,~ N1A2
## 3 PAL0708 3 Adelie Pengu~ Anvers Torge~ Adult,~ N2A1
## 4 PAL0708 4 Adelie Pengu~ Anvers Torge~ Adult,~ N2A2
## 5 PAL0708 5 Adelie Pengu~ Anvers Torge~ Adult,~ N3A1
## 6 PAL0708 6 Adelie Pengu~ Anvers Torge~ Adult,~ N3A2
## 7 PAL0708 7 Adelie Pengu~ Anvers Torge~ Adult,~ N4A1
## 8 PAL0708 8 Adelie Pengu~ Anvers Torge~ Adult,~ N4A2
## 9 PAL0708 9 Adelie Pengu~ Anvers Torge~ Adult,~ N5A1
## 10 PAL0708 10 Adelie Pengu~ Anvers Torge~ Adult,~ N5A2
## # ... with 334 more rows, and 10 more variables: Clutch Completion <chr>,
## # Date Egg <date>, Culmen Length (mm) <dbl>, Culmen Depth (mm) <dbl>,
## # Flipper Length (mm) <dbl>, Body Mass (g) <dbl>, Sex <chr>,
## # Delta 15 N (o/oo) <dbl>, Delta 13 C (o/oo) <dbl>, Comments <chr>
As you can see, the “Stage” column holds two pieces of information: the stage of life the penguin is in, and its egg stage.
We want to split this information into two columns.
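Because the printed tibble truncates the column, it can help to look at the values of Stage directly before splitting. The sketch below counts each distinct value; the variable name `stage_values` is an assumption made for illustration.

```r
library(dplyr)
library(palmerpenguins)

# Count how often each distinct value of Stage occurs,
# which shows the full text and the delimiter to split on
stage_values <- penguins_raw %>%
  count(Stage)

print(stage_values)
```

Checking the raw values first makes it easier to choose the right separator for the next step.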
use the function
This is where we can use the separate() function!
To use this function, first name the column you want to separate. Then use “into = c()” to give the names of the new columns. Finally, use “sep” to give the separator at which each value will be split.
Here we split on a comma, “,”. Everything to the left of the comma is put under “Stage”, and everything to the right is put under “Egg stage”.
penguins_raw %>%
  separate(Stage, into = c("Stage", "Egg stage"), sep = ",")
## # A tibble: 344 x 18
## studyName `Sample Number` Species Region Island Stage `Egg stage`
## <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
## 1 PAL0708 1 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 2 PAL0708 2 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 3 PAL0708 3 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 4 PAL0708 4 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 5 PAL0708 5 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 6 PAL0708 6 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 7 PAL0708 7 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 8 PAL0708 8 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 9 PAL0708 9 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## 10 PAL0708 10 Adelie Penguin (~ Anvers Torger~ Adult " 1 Egg Sta~
## # ... with 334 more rows, and 11 more variables: Individual ID <chr>,
## # Clutch Completion <chr>, Date Egg <date>, Culmen Length (mm) <dbl>,
## # Culmen Depth (mm) <dbl>, Flipper Length (mm) <dbl>, Body Mass (g) <dbl>,
## # Sex <chr>, Delta 15 N (o/oo) <dbl>, Delta 13 C (o/oo) <dbl>, Comments <chr>
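In the output above, each value in the “Egg stage” column starts with a space (note the `" 1 Egg Sta~`), because the split happens at the comma only. One possible way to avoid this, sketched below, is to include the space in the separator. The variable name `penguins_split` is an assumption made for illustration.

```r
library(tidyr)
library(dplyr)
library(palmerpenguins)

# Split on comma-plus-space so the new column has no leading whitespace
penguins_split <- penguins_raw %>%
  separate(Stage, into = c("Stage", "Egg stage"), sep = ", ")

# Look at just the two new columns
penguins_split %>%
  select(Stage, `Egg stage`) %>%
  head()
```

An alternative would be to keep sep = "," and trim the result afterwards, for example with stringr's str_trim(); putting the space in the separator simply does it in one step.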
more resources
We found out how to use the separate function by googling “tidyr separate function blog”.
Some useful links included:
https://blog.rstudio.com/2014/07/22/introducing-tidyr/
https://tidyr.tidyverse.org/reference/separate.html
Both links show examples of how to use the function, which was a great help in writing our own tutorial.