R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the written content and the output of any R code chunks embedded in the document.

The BabyNames Data

Let’s take a look at the first few rows of the data frame BabyNames from the package DataComputing.

head(BabyNames, n = 10)
##         name sex count year
## 1       Mary   F  7065 1880
## 2       Anna   F  2604 1880
## 3       Emma   F  2003 1880
## 4  Elizabeth   F  1939 1880
## 5     Minnie   F  1746 1880
## 6   Margaret   F  1578 1880
## 7        Ida   F  1472 1880
## 8      Alice   F  1414 1880
## 9     Bertha   F  1320 1880
## 10     Sarah   F  1288 1880
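
The call above only works once the relevant packages are loaded. A minimal setup sketch follows. It assumes that DataComputing is already installed (the GitHub install command in the comment is an assumption, not something stated in this document) and that dplyr and ggplot2 supply the wrangling and plotting functions used in the later sections.

```r
# Assumption: DataComputing is installed. It is believed to be distributed on
# GitHub, e.g. devtools::install_github("DataComputing/DataComputing").
library(DataComputing)  # provides the BabyNames data frame
library(dplyr)          # provides %>% and filter()
library(ggplot2)        # provides ggplot(), aes(), geom_line(), labs()

# Quick check that the data frame has the columns used in this document.
names(BabyNames)
```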

You can get more information on BabyNames using the help() function, like this:

help("BabyNames")

In RStudio, a help file will show up in the Help tab. (Note that when you knit this document, the eval = FALSE code-chunk option prevents the above code from being executed, even though it appears in the knitted version.)
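
As a rough illustration of that chunk option, the sketch below shows what such a chunk might look like. The chunk header appears only as a comment, because the knitted output does not show the header used in the source file, so its exact form here is an assumption.

```r
# Assumed chunk header in the .Rmd source (not visible in the knitted output):
#   ```{r eval = FALSE}
# With eval = FALSE, knitr prints the code below in the document
# but does not run it when the document is knitted.
help("BabyNames")
```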

In RStudio you can get a “spreadsheet-look” at BabyNames if you use the View() function:

View(BabyNames)

Our aim is to use some data-wrangling and data-visualization to learn about the popularity, over time, of various names for babies in the United States.

Popularity of Mary

How popular is the name Mary (for girls) over time? To answer this, we’ll first wrangle the data just a bit: we’ll select only rows where the name is “Mary” and the sex is “F” for female:

BabyNames %>%
  filter(name == "Mary" & sex == "F") %>%
  head(n = 10)
##    name sex count year
## 1  Mary   F  7065 1880
## 2  Mary   F  6919 1881
## 3  Mary   F  8148 1882
## 4  Mary   F  8012 1883
## 5  Mary   F  9217 1884
## 6  Mary   F  9128 1885
## 7  Mary   F  9890 1886
## 8  Mary   F  9888 1887
## 9  Mary   F 11754 1888
## 10 Mary   F 11648 1889

Now we’ll make a line-graph of the counts over time:

BabyNames %>%
  filter(name == "Mary" & sex == "F") %>%
  ggplot(aes(x = year, y = count)) +
  geom_line() +
  labs(x = "Year", y = "Number Born", title = "Mary as a Girl-Name")

This is a caption, used to provide the reader with more information about the figure. You can determine the caption-text by using the fig.cap option in the code chunk.
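
A sketch of how fig.cap might be set for the plot above is given here. The chunk header is shown as a comment, and the caption string in it is a made-up placeholder rather than the one used in the source file.

```r
# Assumed chunk header in the .Rmd source (caption text is hypothetical):
#   ```{r fig.cap = "Number of girls named Mary born each year."}
# knitr attaches the fig.cap string as the caption of the figure
# that the chunk produces.
library(DataComputing)  # assumed to provide BabyNames
library(dplyr)
library(ggplot2)

mary_plot <- BabyNames %>%
  filter(name == "Mary" & sex == "F") %>%
  ggplot(aes(x = year, y = count)) +
  geom_line() +
  labs(x = "Year", y = "Number Born", title = "Mary as a Girl-Name")

mary_plot  # printing the plot object draws the figure
```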

It was important to restrict to females, because it happens that males can be named Mary, too! The plot below demonstrates this.
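
One way to draw such a plot is sketched below: drop the restriction on sex and map sex to line color, so that each sex gets its own line. This is an illustrative sketch, not necessarily the code the original document used; the variable name mary_by_sex and the plot title are made up for the example.

```r
library(DataComputing)  # assumed to provide BabyNames
library(dplyr)
library(ggplot2)

# Keep every row named Mary, regardless of sex, and draw one line per sex.
mary_by_sex <- BabyNames %>%
  filter(name == "Mary") %>%
  ggplot(aes(x = year, y = count, color = sex)) +
  geom_line() +
  labs(x = "Year", y = "Number Born", title = "Mary, by Sex")

mary_by_sex
```

If one line is dwarfed by the other, adding scale_y_log10() to the plot is one option for making the smaller counts easier to see.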

Try It!

In the code chunk below, insert the code you need to see how your name has done over the years. Remember to select your sex!

BabyNames %>%
  filter(name == "Cassidy" & sex == "F") %>%
  ggplot(aes(x = year, y = count)) +
  geom_line() +
  labs(x = "Year", y = "Number Born", title = "Cassidy as a Girl-Name")

Make a nice caption here!

Comparing a Name in Both Sexes

Let’s look at the name Leslie, which is often found in either sex.

BabyNames %>%
  filter(name == "Leslie") %>%
  ggplot(aes(x = year, y = count)) +
  geom_line(aes(color = sex)) +
  labs(x = "Year", y = "Number Born", title = "Leslie, by Sex")

It seems that once Leslie became popular as a girl’s name, it became quite rare as a name for boys!

Try It!

Think of another name that is not restricted to one sex, and study the popularity of the name as we did for Leslie.

Make a nice caption here!