Babynames

An exploration of first letters

First, let’s import Kaggle’s ‘State Names’ dataset:

library(readr)
StateNames <- read_csv("~/Desktop/StateNames.csv", 
    col_types = cols(Year = col_number(), 
        Gender = col_character()))

This report explores the first letters of baby names. Which first letters are most popular for females and for males, and are they the same for both genders? Which letters are more common now, and which were more common in the past? Is a letter's popularity tied to specific time periods and, if so, why?

library(babynames)
library(tidyverse)
library(ggthemes)

Let’s look at individual names over time, starting with the 20 years in which Rachel made up the largest share of female births:

babynames %>% 
  filter(name == "Rachel", sex == "F") %>%
  arrange(desc(prop)) %>% 
  head(20) %>% 
  knitr::kable()

year  sex  name    n      prop
1985  F    Rachel  16358  0.0088618
1984  F    Rachel  15837  0.0087842
1996  F    Rachel  16115  0.0084069
1986  F    Rachel  15452  0.0083749
1995  F    Rachel  16044  0.0083509
1987  F    Rachel  15643  0.0083481
1983  F    Rachel  14592  0.0081549
1993  F    Rachel  15971  0.0081023
1991  F    Rachel  16343  0.0080384
1988  F    Rachel  15344  0.0079809
1994  F    Rachel  15486  0.0079455
1992  F    Rachel  15837  0.0079015
1989  F    Rachel  15356  0.0077093
1990  F    Rachel  15705  0.0076462
1982  F    Rachel  13869  0.0076456
1997  F    Rachel  13787  0.0072228
1981  F    Rachel  12586  0.0070382
1980  F    Rachel  11622  0.0065279
1998  F    Rachel  12201  0.0062956
1999  F    Rachel  11624  0.0059727
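
The same data can be drawn as a line chart, here for the share of male births named George in each year:

# Keep all years so the line covers the full time range
babynames %>% 
  filter(name == "George", sex == "M") %>%
  ggplot(aes(year, prop)) + 
  geom_line() +
  theme_economist()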
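
The first-letter questions posed at the start could be approached along the following lines. This is only a minimal sketch, not the report's own analysis: it assumes the `babynames` package data used above, and the names `letter_share`, `first_letter`, and `share` are introduced here purely for illustration. Note that `share` is computed among the births recorded in the dataset, which leaves out very rare names.

```r
library(dplyr)
library(babynames)

# Share of recorded births by first letter, for each year and sex.
# `letter_share`, `first_letter`, and `share` are illustrative names.
letter_share <- babynames %>%
  mutate(first_letter = substr(name, 1, 1)) %>%   # first character of each name
  group_by(year, sex, first_letter) %>%
  summarise(n = sum(n), .groups = "drop") %>%     # births per letter
  group_by(year, sex) %>%
  mutate(share = n / sum(n)) %>%                  # letter's share within year and sex
  ungroup()

# Most common first letter for each sex in the latest year of data
letter_share %>%
  filter(year == max(year)) %>%
  group_by(sex) %>%
  slice_max(share, n = 1)
```

Because the result keeps one row per year, sex, and letter, it can be filtered to a few letters and passed to `ggplot(aes(year, share, colour = first_letter))` with `geom_line()` to compare trends across time periods.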