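Run the block of code below only once: it overwrites babynames, so running it again would add the "Other" rows a second time. Notes:
The names listed for a given year and sex do not account for every birth, so the code adds a name "Other" to reflect this.
# Load packages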
library(ggplot2)
library(dplyr)
library(babynames)
# Compute counts/proportions for "Other" names
other <- babynames %>%
  group_by(year, sex) %>%
  summarise(sum_n = sum(n), sum_prop = sum(prop)) %>%
  mutate(
    total = sum_n / sum_prop,  # implied total births for this year/sex
    name = "Other",
    prop = 1 - sum_prop,       # share of births not covered by listed names
    n = prop * total           # implied number of "Other" births
  ) %>%
  select(year, sex, name, n, prop)
# Add "Other" names to babynames
babynames <- babynames %>%
  bind_rows(other) %>%
  arrange(year, sex, desc(n))
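The arithmetic behind the "Other" rows may be easier to follow on a tiny example. The sketch below applies the same steps to a made-up tibble (`toy`, with invented names and counts, not taken from babynames) in which the listed names cover 80% of births, so the implied total is 800 / 0.8 = 1000 and "Other" should account for roughly 200 births.

```r
library(dplyr)

# Hypothetical toy data: one year/sex group where the listed names
# cover only 80% of births (all values are made up for illustration).
toy <- tibble(
  year = c(2000, 2000),
  sex  = c("F", "F"),
  name = c("Ann", "Bea"),
  n    = c(600, 200),
  prop = c(0.6, 0.2)
)

# Same steps as the block above, applied to the toy data
toy_other <- toy %>%
  group_by(year, sex) %>%
  summarise(sum_n = sum(n), sum_prop = sum(prop), .groups = "drop") %>%
  mutate(
    total = sum_n / sum_prop,  # implied total births: 800 / 0.8
    name  = "Other",
    prop  = 1 - sum_prop,      # share not covered by the listed names
    n     = prop * total       # implied count of "Other" births
  ) %>%
  select(year, sex, name, n, prop)

toy_all <- bind_rows(toy, toy_other)

# With "Other" included, proportions should sum to 1 within each group
toy_check <- toy_all %>%
  group_by(year, sex) %>%
  summarise(total_prop = sum(prop), .groups = "drop")
```

The same `group_by()`/`summarise()` check could be run on the real babynames data after the block above as a sanity check that proportions now sum to (approximately) 1 for every year and sex.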
Play around with the babynames dataset in the RStudio viewer by running the following in the console.
View(babynames)
In particular, click on the Filter button and click in the white boxes under the variable names and
The code below shows the popularity trend for the name “Emma” amongst females. Change the values assigned to baby_name and baby_sex to explore the popularity of other names.
baby_name <- "Emma"
baby_sex <- "F"
single_name <- babynames %>%
  filter(name == baby_name & sex == baby_sex)
ggplot(data = single_name, mapping = aes(x = year, y = prop)) +
  geom_line() +
  xlim(c(1880, 2014)) +
  ylim(c(0, NA)) +
  xlab("Year") +
  ylab(paste("Prop. of ", baby_sex, " born with name ", baby_name, sep = ""))
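To try many names without editing the two assignments each time, the same steps could be wrapped in a small function. This is only a sketch: `plot_name_trend` is a hypothetical helper name, not something defined elsewhere in this document or in the babynames package, and it assumes the packages loaded above are installed.

```r
library(ggplot2)
library(dplyr)
library(babynames)

# Hypothetical helper (an illustration, not part of the original exercise):
# plots the yearly proportion of babies of a given sex with a given name.
plot_name_trend <- function(baby_name, baby_sex, data = babynames) {
  # Keep only the rows for the requested name and sex
  single_name <- data %>%
    filter(name == baby_name & sex == baby_sex)

  ggplot(data = single_name, mapping = aes(x = year, y = prop)) +
    geom_line() +
    xlim(c(1880, 2014)) +
    ylim(c(0, NA)) +
    xlab("Year") +
    ylab(paste0("Prop. of ", baby_sex, " born with name ", baby_name))
}

# Example calls
plot_name_trend("Emma", "F")
plot_name_trend("Jeffrey", "M")
```

In the console each call draws its plot immediately; inside a script or loop, wrap the call in `print()` to display it.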