One of the most interesting aspects of R programming is changing the shape of data to suit the analysis at hand. Melting and casting are two operations that reshape data efficiently, and the functions that perform them are melt() and cast().
The melt function takes data in wide format and stacks a set of columns into a single column. To use it we specify a data frame, the id variables (which are kept as they are) and the measured variables (the columns of data to be stacked). By default, every column that is not listed as an id variable is treated as a measured variable.
We will use a data set built into R to see how the melt and cast functions work.
library(MASS)
library(reshape2)
library(reshape)
## Warning: package 'reshape' was built under R version 3.4.4
##
## Attaching package: 'reshape'
## The following objects are masked from 'package:reshape2':
##
## colsplit, melt, recast
print(head(ships,n=10))
## type year period service incidents
## 1 A 60 60 127 0
## 2 A 60 75 63 0
## 3 A 65 60 1095 3
## 4 A 65 75 1095 4
## 5 A 70 60 1512 6
## 6 A 70 75 3353 18
## 7 A 75 60 0 0
## 8 A 75 75 2244 11
## 9 B 60 60 44882 39
## 10 B 60 75 17176 29
# This prints the first 10 rows of the built-in ships data (from the MASS package)
Now let's keep type and year constant (as id variables) and melt (stack) the other three variables, namely period, service and incidents.
shipdata <- head(ships, n = 10)
molten.ships <- melt(shipdata, id = c("type","year"))
print(molten.ships)
## type year variable value
## 1 A 60 period 60
## 2 A 60 period 75
## 3 A 65 period 60
## 4 A 65 period 75
## 5 A 70 period 60
## 6 A 70 period 75
## 7 A 75 period 60
## 8 A 75 period 75
## 9 B 60 period 60
## 10 B 60 period 75
## 11 A 60 service 127
## 12 A 60 service 63
## 13 A 65 service 1095
## 14 A 65 service 1095
## 15 A 70 service 1512
## 16 A 70 service 3353
## 17 A 75 service 0
## 18 A 75 service 2244
## 19 B 60 service 44882
## 20 B 60 service 17176
## 21 A 60 incidents 0
## 22 A 60 incidents 0
## 23 A 65 incidents 3
## 24 A 65 incidents 4
## 25 A 70 incidents 6
## 26 A 70 incidents 18
## 27 A 75 incidents 0
## 28 A 75 incidents 11
## 29 B 60 incidents 39
## 30 B 60 incidents 29
As a result, the type and year columns are kept as they are. The names of the period, service and incidents columns are stacked under the column named variable, and their values are stacked under the column named value. The result of the melt function is shown above.
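The measured variables can also be listed explicitly rather than left to the default. The following is a minimal sketch, assuming the reshape package is installed; it retypes the first four rows of the ships output by hand so that it runs on its own, and the name molten.subset is only illustrative. It uses the measure.vars argument of melt() to stack service and incidents only.

```r
library(reshape)  # provides the melt() used in this article

# First four rows of the ships data, typed in from the output printed earlier
shipdata <- data.frame(
  type      = factor(c("A", "A", "A", "A")),
  year      = c(60, 60, 65, 65),
  period    = c(60, 75, 60, 75),
  service   = c(127, 63, 1095, 1095),
  incidents = c(0, 0, 3, 4)
)

# Stack only service and incidents; period is neither an id variable
# nor a measured variable here, so it is left out of the molten data
molten.subset <- melt(shipdata,
                      id.vars      = c("type", "year"),
                      measure.vars = c("service", "incidents"))
print(molten.subset)
```

With four input rows and two measured variables, the molten data frame has eight rows. The `id = ...` spelling used earlier works because R partially matches it to the id.vars argument.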
Aggregation is needed when the combination of variables in the cast formula does not identify individual observations. Here each type and year pair matches two rows, so the cast function has to reduce the multiple values to a single one; passing sum as the aggregation function adds up the entries in the value column. An example of the cast function is shown below.
recasted.ship <- cast(molten.ships, type + year ~ variable, sum)
print(recasted.ship)
## type year period service incidents
## 1 A 60 135 190 0
## 2 A 65 135 2190 7
## 3 A 70 135 4865 24
## 4 A 75 135 2244 11
## 5 B 60 135 62058 68
As a result, the cast function sums each variable within every combination of type and year and casts the variables back into columns, as shown above.
For example, type A, year 60 has two periods, 60 and 75. These are summed, and the result, 135, is recorded under the period column by the cast function.
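Summing the period codes is not very meaningful, and it only happens because period was melted along with the real measurements. The following is a rough sketch of two alternatives, again assuming the reshape package and a hand-typed copy of the first four rows; the names molten, wide and averaged are illustrative only. It first keeps period as an id variable so that no aggregation is needed, then drops period from the formula and aggregates with mean instead of sum.

```r
library(reshape)  # provides melt() and cast()

# First four rows of the ships data, typed in from the output printed earlier
shipdata <- data.frame(
  type      = factor(c("A", "A", "A", "A")),
  year      = c(60, 60, 65, 65),
  period    = c(60, 75, 60, 75),
  service   = c(127, 63, 1095, 1095),
  incidents = c(0, 0, 3, 4)
)

# Keep period as an id variable: type + year + period identifies each row
molten <- melt(shipdata, id.vars = c("type", "year", "period"))

# Every cell now holds exactly one value, so no aggregation function is needed
# and the cast simply rebuilds the wide layout
wide <- cast(molten, type + year + period ~ variable)
print(wide)

# Leaving period out of the formula puts two values in each cell;
# mean averages them where sum would have added them
averaged <- cast(molten, type + year ~ variable, mean)
print(averaged)
```

The first cast returns one row per original observation, while the second returns one row per type and year pair.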