Melting and Casting in R:

One of the most interesting aspects of R programming is reshaping data into the layout an analysis needs. Melting and casting are two operations that reshape data efficiently. The functions that perform them are melt() and cast().

Melt Function in R:

The melt function takes data in wide format and stacks a set of columns into a single column of data. To use it, we specify a data frame, the id variables (which are left as they are) and the measured variables (the columns of data to be stacked). By default, every column that is not specified as an id variable is treated as a measured variable.
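To make the roles of the id and measured variables concrete, here is a minimal sketch that names the measured variables explicitly instead of relying on the default. It assumes the reshape2 package is installed and uses a small hand-typed data frame (the name wide is illustrative; the four rows are taken from the ships data in the MASS package).

```r
library(reshape2)

# Four rows typed in by hand (the first four rows of MASS::ships)
wide <- data.frame(
  type      = c("A", "A", "A", "A"),
  year      = c(60, 60, 65, 65),
  period    = c(60, 75, 60, 75),
  service   = c(127, 63, 1095, 1095),
  incidents = c(0, 0, 3, 4)
)

# Stack only service and incidents. period is neither an id variable
# nor a measured variable here, so it does not appear in the result.
long <- reshape2::melt(
  wide,
  id.vars      = c("type", "year"),
  measure.vars = c("service", "incidents")
)

print(long)
```

Calling the function as reshape2::melt makes it clear which package's melt is used when both reshape and reshape2 are loaded.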

We will use the ships dataset, which comes with the MASS package, to understand how the melt and cast functions work.

library(MASS)
library(reshape2)
library(reshape)
## Warning: package 'reshape' was built under R version 3.4.4
## 
## Attaching package: 'reshape'
## The following objects are masked from 'package:reshape2':
## 
##     colsplit, melt, recast
print(head(ships,n=10))
##    type year period service incidents
## 1     A   60     60     127         0
## 2     A   60     75      63         0
## 3     A   65     60    1095         3
## 4     A   65     75    1095         4
## 5     A   70     60    1512         6
## 6     A   70     75    3353        18
## 7     A   75     60       0         0
## 8     A   75     75    2244        11
## 9     B   60     60   44882        39
## 10    B   60     75   17176        29
# This prints the first 10 rows of the ships data

Now let's keep type and year constant (as id variables) and melt (stack) the other three variables, namely period, service and incidents.

shipdata <- head(ships, n = 10)
molten.ships <- melt(shipdata, id.vars = c("type", "year"))
print(molten.ships)
##    type year  variable value
## 1     A   60    period    60
## 2     A   60    period    75
## 3     A   65    period    60
## 4     A   65    period    75
## 5     A   70    period    60
## 6     A   70    period    75
## 7     A   75    period    60
## 8     A   75    period    75
## 9     B   60    period    60
## 10    B   60    period    75
## 11    A   60   service   127
## 12    A   60   service    63
## 13    A   65   service  1095
## 14    A   65   service  1095
## 15    A   70   service  1512
## 16    A   70   service  3353
## 17    A   75   service     0
## 18    A   75   service  2244
## 19    B   60   service 44882
## 20    B   60   service 17176
## 21    A   60 incidents     0
## 22    A   60 incidents     0
## 23    A   65 incidents     3
## 24    A   65 incidents     4
## 25    A   70 incidents     6
## 26    A   70 incidents    18
## 27    A   75 incidents     0
## 28    A   75 incidents    11
## 29    B   60 incidents    39
## 30    B   60 incidents    29

As a result, the type and year columns are kept constant. The names of the period, service and incidents columns are stacked under the column named variable, and their values are stacked under the column named value, as the output above shows.
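The stacking described above can also be written out by hand in base R, which may help show what melt is doing. This is only an illustrative sketch, not how melt is implemented; the names wide, measured, pieces and manual are made up for the example, and the two rows are copied from the ships output above.

```r
# Two rows typed in by hand (the first two rows of the ships output)
wide <- data.frame(
  type      = c("A", "A"),
  year      = c(60, 60),
  period    = c(60, 75),
  service   = c(127, 63),
  incidents = c(0, 0)
)

measured <- c("period", "service", "incidents")

# For each measured column, keep the id columns and record the
# column's name under "variable" and its contents under "value"
pieces <- lapply(measured, function(v) {
  data.frame(
    type     = wide$type,
    year     = wide$year,
    variable = v,
    value    = wide[[v]]
  )
})

# Stack the pieces on top of each other
manual <- do.call(rbind, pieces)
print(manual)
```

The stacked result has one row per original row and measured column, so 2 rows and 3 measured columns give 6 rows, just as 10 rows give 30 in the melt output.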

Cast Function in R:

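The cast function takes molten data and spreads it back into columns according to a formula: the variables on the left of the ~ identify the rows, and the variable on the right supplies the new column names. Aggregation occurs when the combination of variables in the formula does not identify individual observations. In that case the cast function reduces the multiple values to a single one using the aggregation function it is given. In the example below that function is sum, so the values in the value column are added up.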

recasted.ship <- cast(molten.ships, type + year ~ variable, sum)
print(recasted.ship)
##   type year period service incidents
## 1    A   60    135     190         0
## 2    A   65    135    2190         7
## 3    A   70    135    4865        24
## 4    A   75    135    2244        11
## 5    B   60    135   62058        68

As a result, the cast function sums the values of each variable for every combination of type and year, and the variables are cast back into columns, as the output above shows.

For example, type A in year 60 has two periods, 60 and 75. The cast function sums these and records the result, 135, under the column named period.
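The same aggregation can be sketched with dcast(), the reshape2 counterpart of cast() for data frames. The example below assumes reshape2 is installed and builds a small molten data frame by hand (the names molten, summed and counted are illustrative; the values come from the first four rows of the ships data). It also casts with length instead of sum, to show how the choice of aggregation function changes the result.

```r
library(reshape2)

# Molten data typed in by hand: two measured variables, four rows each
molten <- data.frame(
  type     = rep("A", 8),
  year     = rep(c(60, 60, 65, 65), times = 2),
  variable = factor(rep(c("service", "incidents"), each = 4),
                    levels = c("service", "incidents")),
  value    = c(127, 63, 1095, 1095, 0, 0, 3, 4)
)

# Each type + year pair matches two rows per variable, so an
# aggregation function is needed to reduce them to one value
summed <- reshape2::dcast(molten, type + year ~ variable,
                          fun.aggregate = sum, value.var = "value")
print(summed)

# With length, each cell holds the number of values that were combined
counted <- reshape2::dcast(molten, type + year ~ variable,
                           fun.aggregate = length, value.var = "value")
print(counted)
```

Here summed holds 190 and 2190 for service, matching the first two rows of the cast output above, while counted holds 2 in every cell. To my understanding, length is also what dcast falls back to (with a message) when no aggregation function is given, so passing the function explicitly avoids surprises.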