Babynames assignment

library(babynames)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  2.0.0     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.3.1
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## ── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(wordcloud2)

For this first assignment I chose to do my name, Connor.

  1. Determine its rank the year you were born.
babynames %>%
  filter(year==1995) %>%
  top_n(10,prop) %>%
  arrange(-n) %>%
  mutate(rank=row_number()) %>% 
  mutate(percent=round(prop*100, 1))
## # A tibble: 10 x 7
##     year sex   name            n   prop  rank percent
##    <dbl> <chr> <chr>       <int>  <dbl> <int>   <dbl>
##  1  1995 M     Michael     41402 0.0206     1     2.1
##  2  1995 M     Matthew     32876 0.0163     2     1.6
##  3  1995 M     Christopher 32673 0.0162     3     1.6
##  4  1995 M     Jacob       31129 0.0155     4     1.5
##  5  1995 M     Joshua      30717 0.0153     5     1.5
##  6  1995 M     Nicholas    29155 0.0145     6     1.4
##  7  1995 M     Tyler       29154 0.0145     7     1.4
##  8  1995 F     Jessica     27935 0.0145     8     1.5
##  9  1995 M     Brandon     26904 0.0134     9     1.3
## 10  1995 F     Ashley      26602 0.0138    10     1.4

Here are the top 10 names for 1995, with boys' and girls' names ranked together by count. Although my first name isn't on the list, my last name (Brandon) is, which I find very interesting.

babynames %>%
  filter(year==1995) %>%
  mutate(rank=row_number()) %>% 
  mutate(percent=round(prop*100, 1)) %>%
  filter(name=="Connor")
## # A tibble: 2 x 7
##    year sex   name       n      prop  rank percent
##   <dbl> <chr> <chr>  <int>     <dbl> <int>   <dbl>
## 1  1995 F     Connor   123 0.0000640  1339     0  
## 2  1995 M     Connor  6616 0.00329   15808     0.3

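Here is my name the year I was born. The rank column in this table is misleading: row_number() was applied without sorting first, so it records each row's position in the 1995 data (which appears to list girls' names before boys') rather than a popularity rank. The counts are reliable, though: 6,616 boys (0.3%) and 123 girls (well under 0.1%) were named Connor. That puts the name well behind the leaders (Michael was at 2.1%), but the 15,808 figure should not be read as its rank.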
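Because row_number() numbers rows in whatever order they arrive, a popularity rank needs the names ordered by count within each sex first. Below is a minimal sketch of one way to do that, assuming the same babynames data. The object names ranked_1995 and connor_1995 are introduced here only for illustration, and min_rank() is swapped in for row_number() so that names with tied counts share a rank.

```r
library(babynames)
library(dplyr)

# Rank every 1995 name within its own sex, most common name first.
ranked_1995 <- babynames %>%
  filter(year == 1995) %>%
  group_by(sex) %>%
  mutate(rank = min_rank(desc(n))) %>%  # tied counts get the same rank
  ungroup()

# Pull out Connor for each sex, with a percentage that keeps two decimals
# so the girls' value does not round to zero.
connor_1995 <- ranked_1995 %>%
  filter(name == "Connor") %>%
  mutate(percent = round(prop * 100, 2))

connor_1995
```

Grouping by sex matters because prop in this dataset is a share of births within one sex, so boys' and girls' ranks are not directly comparable on a single scale.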

  2. Create a word cloud of the names of your sex and the year you were born.
babynames %>%
  filter(year==1995, sex=="M") %>%
  top_n(100,n) %>%
  select(name,n) %>%   # wordcloud2 reads the first two columns as word and frequency
  wordcloud2(size=.4)

Here is a word cloud of the top 100 names of boys born in 1995. The size of each name reflects its popularity: the larger the name, the more boys were given it. My name is in there, but it's not very big.

  3. Graph its popularity over time.
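babynames %>%
  filter(name=="Connor") %>%
  # color by sex so boys and girls get separate lines instead of one zigzag
  ggplot(aes(x=year,y=prop,color=sex))+
  geom_line()

Here is the popularity of my name graphed over time, with one line for boys and one for girls.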

babynames %>%
  filter(name=="Connor") %>%
  filter(sex=="M") %>%
  ggplot(aes(x=year,y=prop))+
  geom_line()

Here is my name graphed over time for just boys. Its popularity took off in the 1990s, peaked just after the new millennium, and has stayed relatively high since.

babynames %>%
  filter(name=="Connor") %>%
  filter(sex=="F") %>%
  ggplot(aes(x=year,y=prop))+
  geom_line()

Here is my name graphed over time for just girls. It was hardly used as a girls' name until the 1980s, spiked (relatively speaking) in the 1990s, and has declined sharply since then.
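The girls' spike is only visible because that plot has its own y-axis. On a shared axis it would be flattened by the much larger boys' values. One way to see both trends at once is to facet by sex and let each panel scale independently. This is a rough sketch under the same assumptions as the plots in this section, and the object name connor_facets is introduced only for illustration.

```r
library(babynames)
library(dplyr)
library(ggplot2)

# One panel per sex, stacked vertically, each with its own y-axis range.
connor_facets <- babynames %>%
  filter(name == "Connor") %>%
  ggplot(aes(x = year, y = prop)) +
  geom_line() +
  facet_wrap(~ sex, ncol = 1, scales = "free_y") +
  labs(x = "Year", y = "Proportion of births within sex")

connor_facets
```

With free y scales the shapes of the two curves can be compared, but the heights cannot, so the axis labels need to be read carefully.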

  4. Create a table showing which years it was most popular.
babynames %>%
  filter(name=="Connor") %>%
  top_n(10,prop) %>%
  arrange(-prop,n) %>%
  mutate(rank=row_number()) %>% 
  mutate(percent=round(prop*100, 1))
## # A tibble: 10 x 7
##     year sex   name       n    prop  rank percent
##    <dbl> <chr> <chr>  <int>   <dbl> <int>   <dbl>
##  1  2004 M     Connor 10047 0.00476     1     0.5
##  2  2003 M     Connor  9669 0.00460     2     0.5
##  3  2005 M     Connor  9319 0.00438     3     0.4
##  4  2002 M     Connor  8410 0.00407     4     0.4
##  5  2006 M     Connor  8669 0.00396     5     0.4
##  6  2010 M     Connor  8049 0.00392     6     0.4
##  7  2001 M     Connor  8082 0.00391     7     0.4
##  8  1998 M     Connor  7905 0.00390     8     0.4
##  9  2009 M     Connor  8098 0.00382     9     0.4
## 10  2007 M     Connor  8322 0.00376    10     0.4

Here is a table of the years when my name was most popular. The 2000s were its peak decade: in 2004 just over 10,000 boys were named Connor, about 0.5% of all boys born that year (prop is measured within each sex, not across all children).

  5. Graph its popularity in comparison to another name or two (e.g., a friend, family member, etc.). To keep it simple, use other names of the same sex.
babynames %>%
  filter(name %in% c("Connor","Reece","Benjamin")) %>%
  filter(sex=="M") %>%
  ggplot(aes(x=year,y=prop, color=name))+
  geom_line()

I chose to graph my own name alongside those of my two brothers, Ben and Reece. Ben (formally Benjamin) has by far the longest history of the three and has stayed fairly popular: it dipped during the 1900s but recovered in the 1970s. My name was not popular until the 1980s, when it really took off. Reece has never been especially popular (according to this dataset), but it did see a bump in the late 1990s.