This lesson introduces two functions from the tidyr package for pivoting dataframes between wide and long formats. The tidyr package is part of the tidyverse, and it has functions for reshaping dataframes.

The shape of a dataframe refers to the number of rows and columns. Many plotting functions and dashboard applications work best with long dataframes that have few columns and many rows. In contrast, many algorithms and human readable tables work best with wide dataframes that have few rows and many columns.

Preliminaries

If you haven’t already installed the tidyr package on the machine that you’re using, then you can do so by running the next code chunk. If you have already installed it, then you do not need to run it again.

# install.packages('tidyr')

Load the dplyr, magrittr, and lubridate packages.

library(dplyr)
library(magrittr)
library(lubridate)
library(tidyr)

Make sure that this file and the jan17Items.csv file are in the same folder and that the working directory is set to that folder.

Read in the jan17Items data as j17i.

j17i <- read.csv('jan17Items.csv')

Aggregate the Data at the Day Level

Aggregate the data into a single observation for each day. Include a computation for the average cost, the maximum price, and the number of transactions for each day.

daily <- j17i %>%
  mutate(
    Time = ymd_hms(Time)
    , date = round_date(Time, 'day')
  ) %>%
  group_by(date) %>%
  summarise(avgCost = mean(Cost, na.rm = T)
            , maxPrice = max(Price, na.rm = T)
            , transactionQuantity = n_distinct(TransactionNumber)) %>%
  ungroup()

Pivot Longer

Pivoting to a longer dataframe means that we’re going to collapse the names for some columns into a single column, and the values from those columns into another single column. This will require duplicating the column(s) that identify the observation.

dailyLong <- daily %>%
  pivot_longer(cols = c(avgCost, maxPrice, transactionQuantity))

# Alternative approaches that also work. See the <tidy-select> link in the documentation
dailyLong <- daily %>%
  pivot_longer(cols = avgCost:transactionQuantity)
dailyLong <- daily %>%
  pivot_longer(cols = c('avgCost', 'maxPrice', 'transactionQuantity'))

Notice that the shape of this dataframe is 90 rows long and 3 columns wide rather than 30 columns long and four columns wide.

Let’s specify what we want the names to be for the names_to and values_to columns.

dailyLong <- daily %>%
  pivot_longer(cols = avgCost:transactionQuantity
               , names_to = 'metrics'
               , values_to = 'vals')

Pivot Wider

Let’s now pivot the long dataframe back to a wide version of the dataframe using the pivot_wider function. Let’s read the help documentation by running ?pivot_wider in the Console. You can see that it’s a parallel version of the pivot_longer function.

dailyWide <- dailyLong %>%
  pivot_wider(names_from = metrics, values_from = vals)

A visual inspection confirms that the dailyWide dataframe appears to be the same as the daily dataframe, showing that we successfully pivoted back to the wide format.

Concluding Comments

It’s nice that the pivot_wider and pivot_longer functions are parallel in their usage. These functions have many other optional parameters that you can explore, but many of those options can be performed after the dataframe has been reshaped using the mutate function.

Having a little bit of history using the tidyverse, I really like how easy it has become to use the pivot_wider and pivot_longer functions. Because they’re so easy to use and because pivoting the shape of the dataframe is done so frequently, I consider them essential to know how to use if you’re going to do any kind of data wrangling in R.

LS0tDQp0aXRsZTogIlBpdm90aW5nIERhdGFmcmFtZXMgQmV0d2VlbiBXaWRlIGFuZCBMb25nIFNoYXBlcyINCm91dHB1dDogaHRtbF9ub3RlYm9vaw0KLS0tDQpUaGlzIGxlc3NvbiBpbnRyb2R1Y2VzIHR3byBmdW5jdGlvbnMgZnJvbSB0aGUgdGlkeXIgcGFja2FnZSBmb3IgcGl2b3RpbmcgZGF0YWZyYW1lcyBiZXR3ZWVuIHdpZGUgYW5kIGxvbmcgZm9ybWF0cy4gVGhlIHRpZHlyIHBhY2thZ2UgaXMgcGFydCBvZiB0aGUgdGlkeXZlcnNlLCBhbmQgaXQgaGFzIGZ1bmN0aW9ucyBmb3IgcmVzaGFwaW5nIGRhdGFmcmFtZXMuIA0KDQpUaGUgc2hhcGUgb2YgYSBkYXRhZnJhbWUgcmVmZXJzIHRvIHRoZSBudW1iZXIgb2Ygcm93cyBhbmQgY29sdW1ucy4gTWFueSBwbG90dGluZyBmdW5jdGlvbnMgYW5kIGRhc2hib2FyZCBhcHBsaWNhdGlvbnMgd29yayBiZXN0IHdpdGggbG9uZyBkYXRhZnJhbWVzIHRoYXQgaGF2ZSBmZXcgY29sdW1ucyBhbmQgbWFueSByb3dzLiBJbiBjb250cmFzdCwgbWFueSBhbGdvcml0aG1zIGFuZCBodW1hbiByZWFkYWJsZSB0YWJsZXMgd29yayBiZXN0IHdpdGggd2lkZSBkYXRhZnJhbWVzIHRoYXQgaGF2ZSBmZXcgcm93cyBhbmQgbWFueSBjb2x1bW5zLg0KDQojIyBQcmVsaW1pbmFyaWVzDQpJZiB5b3UgaGF2ZW4ndCBhbHJlYWR5IGluc3RhbGxlZCB0aGUgdGlkeXIgcGFja2FnZSBvbiB0aGUgbWFjaGluZSB0aGF0IHlvdSdyZSB1c2luZywgdGhlbiB5b3UgY2FuIGRvIHNvIGJ5IHJ1bm5pbmcgdGhlIG5leHQgY29kZSBjaHVuay4gSWYgeW91IGhhdmUgYWxyZWFkeSBpbnN0YWxsZWQgaXQsIHRoZW4geW91IGRvIG5vdCBuZWVkIHRvIHJ1biBpdCBhZ2Fpbi4NCmBgYHtyfQ0KIyBpbnN0YWxsLnBhY2thZ2VzKCd0aWR5cicpDQpgYGANCg0KTG9hZCB0aGUgZHBseXIsIG1hZ3JpdHRyLCBhbmQgbHVicmlkYXRlIHBhY2thZ2VzLg0KYGBge3J9DQpsaWJyYXJ5KGRwbHlyKQ0KbGlicmFyeShtYWdyaXR0cikNCmxpYnJhcnkobHVicmlkYXRlKQ0KbGlicmFyeSh0aWR5cikNCmBgYA0KTWFrZSBzdXJlIHRoYXQgdGhpcyBmaWxlIGFuZCB0aGUgamFuMTdJdGVtcy5jc3YgZmlsZSBhcmUgaW4gdGhlIHNhbWUgZm9sZGVyIGFuZCB0aGF0IHRoZSB3b3JraW5nIGRpcmVjdG9yeSBpcyBzZXQgdG8gdGhhdCBmb2xkZXIuDQoNClJlYWQgaW4gdGhlIGphbjE3SXRlbXMgZGF0YSBhcyBqMTdpLg0KYGBge3J9DQpqMTdpIDwtIHJlYWQuY3N2KCdqYW4xN0l0ZW1zLmNzdicpDQpgYGANCg0KIyMgQWdncmVnYXRlIHRoZSBEYXRhIGF0IHRoZSBEYXkgTGV2ZWwNCkFnZ3JlZ2F0ZSB0aGUgZGF0YSBpbnRvIGEgc2luZ2xlIG9ic2VydmF0aW9uIGZvciBlYWNoIGRheS4gSW5jbHVkZSBhIGNvbXB1dGF0aW9uIGZvciB0aGUgYXZlcmFnZSBjb3N0LCB0aGUgbWF4aW11bSBwcmljZSwgYW5kIHRoZSBudW1iZXIgb2YgdHJhbnNhY3Rpb25zIGZvciBlYWNoIGRheS4NCg0KYGBge3J9DQpkYWlseSA8LSBqMTdpICU+JQ0KICBtdXRhdGUoDQogICAgVGltZSA9IHltZF9obXMoVGltZSkNCiAgICAsIGRhdGUgPSByb3VuZF9kYXRlKFRpbWUsICdkYXknKQ0KICApICU+JQ0KICBncm91cF9ieShkYXRlKSAlPiUNCiAgc3VtbWFyaXNlKGF2Z0Nvc3QgPSBtZWFuKENvc3QsIG5hLnJtID0gVCkNCiAgICAgICAgICAgICwgbWF4UHJpY2UgPSBtYXgoUHJpY2UsIG5hLnJtID0gVCkNCiAgICAgICAgICAgICwgdHJhbnNhY3Rpb25RdWFudGl0eSA9IG5fZGlzdGluY3QoVHJhbnNhY3Rpb25OdW1iZXIpKSAlPiUNCiAgdW5ncm91cCgpDQpgYGANCg0KIyMgUGl2b3QgTG9uZ2VyDQpQaXZvdGluZyB0byBhIGxvbmdlciBkYXRhZnJhbWUgbWVhbnMgdGhhdCB3ZSdyZSBnb2luZyB0byBjb2xsYXBzZSB0aGUgKipuYW1lcyoqIGZvciBzb21lIGNvbHVtbnMgaW50byBhIHNpbmdsZSBjb2x1bW4sIGFuZCB0aGUgKip2YWx1ZXMqKiBmcm9tIHRob3NlIGNvbHVtbnMgaW50byBhbm90aGVyIHNpbmdsZSBjb2x1bW4uIFRoaXMgd2lsbCByZXF1aXJlIGR1cGxpY2F0aW5nIHRoZSBjb2x1bW4ocykgdGhhdCBpZGVudGlmeSB0aGUgb2JzZXJ2YXRpb24uDQoNCmBgYHtyfQ0KZGFpbHlMb25nIDwtIGRhaWx5ICU+JQ0KICBwaXZvdF9sb25nZXIoY29scyA9IGMoYXZnQ29zdCwgbWF4UHJpY2UsIHRyYW5zYWN0aW9uUXVhbnRpdHkpKQ0KDQojIEFsdGVybmF0aXZlIGFwcHJvYWNoZXMgdGhhdCBhbHNvIHdvcmsuIFNlZSB0aGUgPHRpZHktc2VsZWN0PiBsaW5rIGluIHRoZSBkb2N1bWVudGF0aW9uDQpkYWlseUxvbmcgPC0gZGFpbHkgJT4lDQogIHBpdm90X2xvbmdlcihjb2xzID0gYXZnQ29zdDp0cmFuc2FjdGlvblF1YW50aXR5KQ0KZGFpbHlMb25nIDwtIGRhaWx5ICU+JQ0KICBwaXZvdF9sb25nZXIoY29scyA9IGMoJ2F2Z0Nvc3QnLCAnbWF4UHJpY2UnLCAndHJhbnNhY3Rpb25RdWFudGl0eScpKQ0KYGBgDQpOb3RpY2UgdGhhdCB0aGUgc2hhcGUgb2YgdGhpcyBkYXRhZnJhbWUgaXMgOTAgcm93cyBsb25nIGFuZCAzIGNvbHVtbnMgd2lkZSByYXRoZXIgdGhhbiAzMCBjb2x1bW5zIGxvbmcgYW5kIGZvdXIgY29sdW1ucyB3aWRlLg0KDQpMZXQncyBzcGVjaWZ5IHdoYXQgd2Ugd2FudCB0aGUgbmFtZXMgdG8gYmUgZm9yIHRoZSBuYW1lc190byBhbmQgdmFsdWVzX3RvIGNvbHVtbnMuDQoNCmBgYHtyfQ0KZGFpbHlMb25nIDwtIGRhaWx5ICU+JQ0KICBwaXZvdF9sb25nZXIoY29scyA9IGF2Z0Nvc3Q6dHJhbnNhY3Rpb25RdWFudGl0eQ0KICAgICAgICAgICAgICAgLCBuYW1lc190byA9ICdtZXRyaWNzJw0KICAgICAgICAgICAgICAgLCB2YWx1ZXNfdG8gPSAndmFscycpDQpgYGANCg0KIyMgUGl2b3QgV2lkZXINCkxldCdzIG5vdyBwaXZvdCB0aGUgbG9uZyBkYXRhZnJhbWUgYmFjayB0byBhIHdpZGUgdmVyc2lvbiBvZiB0aGUgZGF0YWZyYW1lIHVzaW5nIHRoZSBwaXZvdF93aWRlciBmdW5jdGlvbi4gTGV0J3MgcmVhZCB0aGUgaGVscCBkb2N1bWVudGF0aW9uIGJ5IHJ1bm5pbmcgP3Bpdm90X3dpZGVyIGluIHRoZSBDb25zb2xlLiBZb3UgY2FuIHNlZSB0aGF0IGl0J3MgYSBwYXJhbGxlbCB2ZXJzaW9uIG9mIHRoZSBwaXZvdF9sb25nZXIgZnVuY3Rpb24uDQoNCmBgYHtyfQ0KZGFpbHlXaWRlIDwtIGRhaWx5TG9uZyAlPiUNCiAgcGl2b3Rfd2lkZXIobmFtZXNfZnJvbSA9IG1ldHJpY3MsIHZhbHVlc19mcm9tID0gdmFscykNCmBgYA0KQSB2aXN1YWwgaW5zcGVjdGlvbiBjb25maXJtcyB0aGF0IHRoZSBkYWlseVdpZGUgZGF0YWZyYW1lIGFwcGVhcnMgdG8gYmUgdGhlIHNhbWUgYXMgdGhlIGRhaWx5IGRhdGFmcmFtZSwgc2hvd2luZyB0aGF0IHdlIHN1Y2Nlc3NmdWxseSBwaXZvdGVkIGJhY2sgdG8gdGhlIHdpZGUgZm9ybWF0Lg0KDQoNCiMjIENvbmNsdWRpbmcgQ29tbWVudHMNCkl0J3MgbmljZSB0aGF0IHRoZSBwaXZvdF93aWRlciBhbmQgcGl2b3RfbG9uZ2VyIGZ1bmN0aW9ucyBhcmUgcGFyYWxsZWwgaW4gdGhlaXIgdXNhZ2UuIFRoZXNlIGZ1bmN0aW9ucyBoYXZlIG1hbnkgb3RoZXIgb3B0aW9uYWwgcGFyYW1ldGVycyB0aGF0IHlvdSBjYW4gZXhwbG9yZSwgYnV0IG1hbnkgb2YgdGhvc2Ugb3B0aW9ucyBjYW4gYmUgcGVyZm9ybWVkIGFmdGVyIHRoZSBkYXRhZnJhbWUgaGFzIGJlZW4gcmVzaGFwZWQgdXNpbmcgdGhlIG11dGF0ZSBmdW5jdGlvbi4NCg0KSGF2aW5nIGEgbGl0dGxlIGJpdCBvZiBoaXN0b3J5IHVzaW5nIHRoZSB0aWR5dmVyc2UsIEkgcmVhbGx5IGxpa2UgaG93IGVhc3kgaXQgaGFzIGJlY29tZSB0byB1c2UgdGhlIHBpdm90X3dpZGVyIGFuZCBwaXZvdF9sb25nZXIgZnVuY3Rpb25zLiBCZWNhdXNlIHRoZXkncmUgc28gZWFzeSB0byB1c2UgYW5kIGJlY2F1c2UgcGl2b3RpbmcgdGhlIHNoYXBlIG9mIHRoZSBkYXRhZnJhbWUgaXMgZG9uZSBzbyBmcmVxdWVudGx5LCBJIGNvbnNpZGVyIHRoZW0gZXNzZW50aWFsIHRvIGtub3cgaG93IHRvIHVzZSBpZiB5b3UncmUgZ29pbmcgdG8gZG8gYW55IGtpbmQgb2YgZGF0YSB3cmFuZ2xpbmcgaW4gUi4=