In this lesson you’ll learn about two useful operators. The pipe
operator, %>%, allows you to chain functions together. The %in%
operator allows you to test whether a value is in a vector of values.
Preliminaries
Install the dplyr and magrittr packages if you haven’t already done
so. If you have already done so, then erase or comment out the following
code chunk.
install.packages('dplyr')
install.packages('magrittr')
Load the dplyr and magrittr packages.
library(dplyr)
library(magrittr)
Make sure that this file and the jan17Items.csv file are in the same
folder and that the working directory is set to that folder.
Read in the jan17Items data as j17i.
j17i <- read.csv('jan17Items.csv')
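If read.csv() stops with a "cannot open file" error, the working directory is the usual culprit. Here is a minimal sketch of a check that uses only base R functions. The variable name csvPath is introduced here purely for illustration.

```r
# Hypothetical helper variable holding the file name used in this lesson.
csvPath <- 'jan17Items.csv'

# Show the folder that R currently treats as the working directory.
print(getwd())

if (file.exists(csvPath)) {
  # The file is visible from the working directory, so read it and
  # inspect the column names and types.
  j17i <- read.csv(csvPath)
  str(j17i)
} else {
  # Otherwise, point R at the right folder with setwd() and try again.
  message('Cannot find ', csvPath, ' in ', getwd())
}
```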
Chaining Functions Together with the Pipe Operator: %>%
Assume that we want to filter the j17i dataframe to only the
observations for which the value in the Cost column is
greater than 11, and then select only the Cost and
Price columns. There are at least two ways that you could
do this.
First, you can create an intermediate dataframe.
df1 <- filter(j17i, Cost > 11)
df2 <- select(df1, Cost, Price)
Second, you can nest functions.
df3 <- select(filter(j17i, Cost > 11), Cost, Price)
The first method creates objects that clutter up your working
environment unless you remove them. The second method is confusing to
read because the functions appear in the reverse of the order in which
they run. The pipe operator makes things much easier to read by taking
the output from one function and using it as the first argument of the
next function. I often find that it’s easier to read if each function is
on a new line. Here’s how we can use the pipe operator to perform these
two steps:
df4 <- j17i %>%
  filter(Cost > 11) %>%
  select(Cost, Price)
Notice that this is easy to write and read, and it takes fewer
characters than creating intermediate dataframes. It also avoids
creating additional objects that are not needed and that end up
cluttering the environment and taking up memory.
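To see that the three approaches really are interchangeable, here is a small sketch that runs each of them on a made-up dataframe and compares the results. The dataframe toy and the object names are assumptions for illustration; the values are not taken from jan17Items.csv.

```r
library(dplyr)  # loading dplyr also makes %>% available

# Made-up data for illustration only; not taken from jan17Items.csv.
toy <- data.frame(
  Cost = c(5, 12, 15, 8),
  Price = c(9, 20, 25, 14),
  LineItem = c('Glass Mug', 'Gift Cards', 'Poster', 'Glass Mug')
)

# Approach 1: intermediate objects.
step1 <- filter(toy, Cost > 11)
viaIntermediate <- select(step1, Cost, Price)

# Approach 2: nested calls.
viaNesting <- select(filter(toy, Cost > 11), Cost, Price)

# Approach 3: piped. The left-hand side becomes the first argument of
# the function on the right-hand side.
viaPipe <- toy %>%
  filter(Cost > 11) %>%
  select(Cost, Price)

# Compare the results of the three approaches.
print(identical(viaIntermediate, viaNesting))
print(identical(viaNesting, viaPipe))
```

Only the piped version keeps the steps in the order in which they run without leaving step1 behind in the environment.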
Evaluating if an Element is in a Vector or Dataframe
Assume that we want to filter observations down to those for which
the line item purchased was one of several items. You could use the
filter function and manually type out a lot of OR conditions, or you can
use the %in% operator like this:
df5 <- j17i %>%
  filter(LineItem %in% c('Glass Mug', 'Gift Cards'))
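For comparison, here is a sketch of the same test written both ways on a made-up vector, using only base R. The vector lineItems is an assumption for illustration. The one difference to be aware of is missing values: a comparison with == returns NA for a missing value, while %in% returns FALSE.

```r
# Made-up vector for illustration only.
lineItems <- c('Glass Mug', 'Poster', 'Gift Cards', NA)

# Chained OR conditions: one comparison per value.
viaOr <- lineItems == 'Glass Mug' | lineItems == 'Gift Cards'

# The same test with %in%.
viaIn <- lineItems %in% c('Glass Mug', 'Gift Cards')

print(viaOr)  # TRUE FALSE TRUE NA
print(viaIn)  # TRUE FALSE TRUE FALSE
```

Inside filter() the two versions keep the same rows, because filter() drops rows for which the condition is NA.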
The %in% operator is especially handy if the list of items is long or
dynamic. For instance, let’s assume that we want to look only at
observations for cardholders who purchased at least one high-cost item.
Here’s how we could do that.
highCostItems <- j17i %>%
  filter(Cost > 7)

custsThatPurchaseHighCostItems <- j17i %>%
  filter(CardholderName %in% highCostItems$CardholderName)
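Because the list of cardholder names is built from highCostItems rather
than typed by hand, redefining a high-cost item only requires changing
one number. The code above treats items that cost more than 7 dollars as
high-cost rather than the 11 dollars used earlier; change the 7 in the
first filter and the observations update accordingly.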
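One way to make the threshold easy to change is to wrap the two steps in a function. The sketch below does that on a made-up dataframe. The dataframe toy, the function name obsForHighCostCardholders, and its threshold argument are all assumptions for illustration, not part of the lesson's data or of dplyr.

```r
library(dplyr)

# Made-up data for illustration only; not taken from jan17Items.csv.
toy <- data.frame(
  CardholderName = c('Ann', 'Ann', 'Bob', 'Cat'),
  LineItem = c('Glass Mug', 'Poster', 'Gift Cards', 'Poster'),
  Cost = c(5, 12, 8, 3)
)

# Hypothetical helper: keep every observation for cardholders who bought
# at least one item costing more than `threshold`.
obsForHighCostCardholders <- function(df, threshold) {
  highCost <- df %>%
    filter(Cost > threshold)
  df %>%
    filter(CardholderName %in% highCost$CardholderName)
}

over7 <- obsForHighCostCardholders(toy, 7)    # Ann's two rows and Bob's row
over11 <- obsForHighCostCardholders(toy, 11)  # only Ann's two rows
print(over7)
print(over11)
```

Note that Ann's 5-dollar purchase is kept in both results: the filter selects cardholders, not individual high-cost items.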