The goal of this tutorial is to split one column into several columns. This can be done using the separate function from tidyr.
library(tidyr)
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
# In this tutorial we will use the dataset of daily minimum temperatures in Melbourne
# https://datamarket.com/data/set/2324/daily-minimum-temperatures-in-melbourne-australia-1981-1990
Temperatures <- read.csv("daily-minimum-temperatures-in-me.csv", stringsAsFactors = FALSE)
head(Temperatures)
## Date Daily.minimum.temperatures.in.Melbourne..Australia..1981.1990
## 1 1981-01-01 20.7
## 2 1981-01-02 17.9
## 3 1981-01-03 18.8
## 4 1981-01-04 14.6
## 5 1981-01-05 15.8
## 6 1981-01-06 15.8
# We can see that the date has the format year-month-day
# We can therefore separate the date into several columns
# By default, separate splits on any run of non-alphanumeric characters, so the "-" is picked up without setting sep
# We tell separate to split the column into three variables called Year, Month and Day
# The original column is replaced by the 3 new columns, which are kept as character (note the leading zeros below)
Temperatures_2 <- separate(Temperatures, Date, c("Year", "Month", "Day"))
head(Temperatures_2)
## Year Month Day
## 1 1981 01 01
## 2 1981 01 02
## 3 1981 01 03
## 4 1981 01 04
## 5 1981 01 05
## 6 1981 01 06
## Daily.minimum.temperatures.in.Melbourne..Australia..1981.1990
## 1 20.7
## 2 17.9
## 3 18.8
## 4 14.6
## 5 15.8
## 6 15.8
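The call above relies on the defaults of separate. As a minimal sketch, the same split can be written with the delimiter stated explicitly and with the new columns converted to numbers. The small data frame `toy` and its `Temp` column are made up for illustration so the example runs without the CSV file; `sep` and `convert` are arguments of tidyr's separate.

```r
library(tidyr)

# Hypothetical toy data standing in for the Melbourne dataset
toy <- data.frame(
  Date = c("1981-01-01", "1981-01-02", "1981-01-03"),
  Temp = c(20.7, 17.9, 18.8),
  stringsAsFactors = FALSE
)

# sep = "-" states the delimiter instead of relying on the default pattern
# convert = TRUE turns the new character columns into integers where possible
toy_split <- separate(toy, Date, into = c("Year", "Month", "Day"),
                      sep = "-", convert = TRUE)

# Inspect the column types of the result
str(toy_split)
```

With convert = TRUE the leading zeros disappear because Month and Day are stored as integers rather than text. Keeping the default (convert = FALSE) is the safer choice when the zero-padded strings are needed later, for example to paste the date back together.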
In this tutorial we have learnt how to split one column into several using the function separate. This is useful when a character variable holds several values joined by a consistent delimiter, such as the parts of a date.