Babynames
An exploration of first letters
First, let’s import Kaggle’s ‘State Names’ dataset:
library(readr)
StateNames <- read_csv("~/Desktop/StateNames.csv",
                       col_types = cols(Year = col_number(),
                                        Gender = col_character()))
This report explores the first letters of baby names. Which first letters are most popular for females and for males, and are they the same for both? Which letters are more common now, and which were more common in the past? Is a letter’s popularity tied to specific time periods and, if so, why?
library(babynames)
library(tidyverse)
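library(ggthemes)
Let’s look at the names Rachel and George over time, starting with the 20 years in which Rachel made up the largest share of female births: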
babynames %>%
  filter(name %in% "Rachel" & sex == "F") %>%
  arrange(desc(prop)) %>%
  head(20) %>%
  knitr::kable()
| year | sex | name | n | prop |
|---|---|---|---|---|
| 1985 | F | Rachel | 16358 | 0.0088618 |
| 1984 | F | Rachel | 15837 | 0.0087842 |
| 1996 | F | Rachel | 16115 | 0.0084069 |
| 1986 | F | Rachel | 15452 | 0.0083749 |
| 1995 | F | Rachel | 16044 | 0.0083509 |
| 1987 | F | Rachel | 15643 | 0.0083481 |
| 1983 | F | Rachel | 14592 | 0.0081549 |
| 1993 | F | Rachel | 15971 | 0.0081023 |
| 1991 | F | Rachel | 16343 | 0.0080384 |
| 1988 | F | Rachel | 15344 | 0.0079809 |
| 1994 | F | Rachel | 15486 | 0.0079455 |
| 1992 | F | Rachel | 15837 | 0.0079015 |
| 1989 | F | Rachel | 15356 | 0.0077093 |
| 1990 | F | Rachel | 15705 | 0.0076462 |
| 1982 | F | Rachel | 13869 | 0.0076456 |
| 1997 | F | Rachel | 13787 | 0.0072228 |
| 1981 | F | Rachel | 12586 | 0.0070382 |
| 1980 | F | Rachel | 11622 | 0.0065279 |
| 1998 | F | Rachel | 12201 | 0.0062956 |
| 1999 | F | Rachel | 11624 | 0.0059727 |
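# Plot George's share of male births for the 20 years in which it was highest.
# Note: arrange() + head(20) keep only those 20 years, so the line connects
# them rather than showing the full series.
babynames %>%
  filter(name %in% "George" & sex == "M") %>%
  arrange(desc(prop)) %>%
  head(20) %>%
  ggplot(aes(year, prop)) +
  geom_line() +
  theme_economist()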
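Returning to the questions about first letters, here is a minimal sketch of how the first letter of each name could be tabulated from the `babynames` data. It is one possible approach, not necessarily the one used in the rest of this report. The names `first_letters`, `letter` and `share` are made up for illustration, and `share` is computed only over the names recorded in the dataset, so it will not match the `prop` column exactly.

```r
library(babynames)
library(dplyr)

# Share of recorded births by first letter, per year and sex.
# `letter` and `share` are illustrative column names, not part of babynames.
first_letters <- babynames %>%
  mutate(letter = toupper(substr(name, 1, 1))) %>%  # first character of each name
  group_by(year, sex, letter) %>%
  summarise(n = sum(n), .groups = "drop") %>%       # total births per letter
  group_by(year, sex) %>%
  mutate(share = n / sum(n)) %>%                    # share within each year and sex
  ungroup()

# Five most common first letters for each sex in the latest year of the data
first_letters %>%
  filter(year == max(year)) %>%
  group_by(sex) %>%
  slice_max(share, n = 5) %>%
  arrange(sex, desc(share))
```

Because `first_letters` keeps one row per year, sex and letter, the same table can be passed to `ggplot()` to compare letters across time periods or between the sexes.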