This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Piping with the %>% operator

In R, the %>% pipe was introduced by the magrittr package. Its purpose is to let code read the way text does, from left to right (mostly), instead of from the inside out as with nested function calls.

The %>% operator takes the value on its left and passes it as the first argument to the function call on its right, so x %>% f(y) is equivalent to f(x, y).
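
As a minimal sketch of that equivalence, the same calculation can be written as a nested call and as a pipeline. The vector x is a made-up example, not part of the original notebook.

```r
library(magrittr)

x <- c(1, 4, 9)  # hypothetical input values

# Nested call: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped call: read from left to right
piped <- x %>% sqrt() %>% mean() %>% round(1)

identical(nested, piped)  # TRUE
```

Both forms compute the same value; the pipeline only changes the order in which the steps are written.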

First, four packages need to be loaded.

library(babynames) # data package
library(dplyr)     # provides data manipulation functions
library(magrittr)  # ceci n'est pas un pipe
library(ggplot2)   # for graphics

The babynames data frame is included in the babynames package. Here we use head() to look at its first 6 rows.

head(babynames)
## # A tibble: 6 × 5
##    year   sex      name     n       prop
##   <dbl> <chr>     <chr> <int>      <dbl>
## 1  1880     F      Mary  7065 0.07238359
## 2  1880     F      Anna  2604 0.02667896
## 3  1880     F      Emma  2003 0.02052149
## 4  1880     F Elizabeth  1939 0.01986579
## 5  1880     F    Minnie  1746 0.01788843
## 6  1880     F  Margaret  1578 0.01616720

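Next, we pipe the babynames data into filter(), which keeps only the rows that satisfy a condition. Inside the condition, substr(1, 3) extracts the first 3 letters of each name.

Those 3 letters are then compared with “Ste” using equals(), a magrittr alias for ==, so only the rows whose name begins with “Ste” remain. The comparison is case-sensitive.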

babynames %>%
  filter(name %>% substr(1, 3) %>% equals("Ste"))
## # A tibble: 5,663 × 5
##     year   sex     name     n         prop
##    <dbl> <chr>    <chr> <int>        <dbl>
## 1   1880     F   Stella   414 0.0042415860
## 2   1880     M  Stephen   176 0.0014864865
## 3   1880     M    Steve    52 0.0004391892
## 4   1880     M  Stewart    19 0.0001604730
## 5   1880     M Sterling    17 0.0001435811
## 6   1880     M   Steven    17 0.0001435811
## 7   1881     F   Stella   416 0.0042081411
## 8   1881     M  Stephen   147 0.0013575413
## 9   1881     M    Steve    44 0.0004063389
## 10  1881     M  Stewart    27 0.0002493443
## # ... with 5,653 more rows
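
To make the filter condition concrete, here is a small sketch that evaluates it on a plain character vector. The vector names_vec is a hypothetical sample, and the two alternatives shown (the == operator and base R's startsWith()) are equivalent ways to write the same test, not the notebook's own code.

```r
library(magrittr)

names_vec <- c("Stella", "Steve", "Mary", "stewart")  # hypothetical sample names

# The piped condition used inside filter()
piped <- names_vec %>% substr(1, 3) %>% equals("Ste")

# The same test written with the == operator
plain <- substr(names_vec, 1, 3) == "Ste"

# The same test with base R's startsWith()
base_sw <- startsWith(names_vec, "Ste")

piped                        # TRUE TRUE FALSE FALSE
identical(piped, plain)      # TRUE
identical(plain, base_sw)    # TRUE
```

Note that "stewart" is rejected because the comparison is case-sensitive.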

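Now the rows whose names begin with “Ste” are grouped by year and, within each year, by sex. Grouping does not change the rows themselves; it only records which group each row belongs to (270 groups here) so that later steps can work group by group.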

babynames %>%
  filter(name %>% substr(1, 3) %>% equals("Ste")) %>%
  group_by(year, sex)
## Source: local data frame [5,663 x 5]
## Groups: year, sex [270]
## 
##     year   sex     name     n         prop
##    <dbl> <chr>    <chr> <int>        <dbl>
## 1   1880     F   Stella   414 0.0042415860
## 2   1880     M  Stephen   176 0.0014864865
## 3   1880     M    Steve    52 0.0004391892
## 4   1880     M  Stewart    19 0.0001604730
## 5   1880     M Sterling    17 0.0001435811
## 6   1880     M   Steven    17 0.0001435811
## 7   1881     F   Stella   416 0.0042081411
## 8   1881     M  Stephen   147 0.0013575413
## 9   1881     M    Steve    44 0.0004063389
## 10  1881     M  Stewart    27 0.0002493443
## # ... with 5,653 more rows
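
The following sketch illustrates what group_by() does and does not change. The data frame toy is a hypothetical three-row example; group_vars() and n_groups() are dplyr helpers for inspecting the grouping.

```r
library(dplyr)

# Hypothetical three-row data frame for illustration
toy <- data.frame(
  year = c(1880, 1880, 1881),
  sex  = c("F", "M", "F"),
  n    = c(414L, 176L, 416L),
  stringsAsFactors = FALSE
)

grouped <- toy %>% group_by(year, sex)

nrow(grouped) == nrow(toy)  # TRUE: no rows are added or removed
group_vars(grouped)         # "year" "sex"
n_groups(grouped)           # 3
```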

The last stage summarizes each group: sum(n) adds up the number of babies given a name beginning with “Ste”, for each sex within each year. The result has one row per group.

babynames %>%
  filter(name %>% substr(1, 3) %>% equals("Ste")) %>%
  group_by(year, sex) %>%
  summarize(total = sum(n))
## Source: local data frame [270 x 3]
## Groups: year [?]
## 
##     year   sex total
##    <dbl> <chr> <int>
## 1   1880     F   414
## 2   1880     M   281
## 3   1881     F   416
## 4   1881     M   241
## 5   1882     F   506
## 6   1882     M   327
## 7   1883     F   529
## 8   1883     M   253
## 9   1884     F   584
## 10  1884     M   292
## # ... with 260 more rows
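
As a check on the first two rows of this result, the sketch below rebuilds the six 1880 rows shown in the filtered output and summarizes them. The data frame name ste_1880 is made up for illustration, and the .groups argument assumes dplyr 1.0.0 or later; without it, the result stays grouped by year, as the "Groups: year" line in the output above indicates.

```r
library(dplyr)

# The six 1880 rows from the filtered output, retyped by hand
ste_1880 <- data.frame(
  year = 1880,
  sex  = c("F", "M", "M", "M", "M", "M"),
  name = c("Stella", "Stephen", "Steve", "Stewart", "Sterling", "Steven"),
  n    = c(414L, 176L, 52L, 19L, 17L, 17L),
  stringsAsFactors = FALSE
)

totals <- ste_1880 %>%
  group_by(year, sex) %>%
  summarize(total = sum(n), .groups = "drop")  # drop grouping (assumes dplyr >= 1.0.0)

totals$total            # 414 281
is_grouped_df(totals)   # FALSE
```

The male total is 176 + 52 + 19 + 17 + 17 = 281, which matches the 1880 M row above.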