Babynames assignment
library(babynames)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
## ✔ tibble 2.0.0 ✔ dplyr 0.7.8
## ✔ tidyr 0.8.2 ✔ stringr 1.3.1
## ✔ readr 1.3.1 ✔ forcats 0.3.0
## ── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(wordcloud2)
For this first assignment, I chose to analyze my own first name, Connor.
# Top 10 names of 1995 by proportion, ordered by count
babynames %>%
  filter(year == 1995) %>%
  top_n(10, prop) %>%
  arrange(desc(n)) %>%
  mutate(rank = row_number(),
         percent = round(prop * 100, 1))
## # A tibble: 10 x 7
## year sex name n prop rank percent
## <dbl> <chr> <chr> <int> <dbl> <int> <dbl>
## 1 1995 M Michael 41402 0.0206 1 2.1
## 2 1995 M Matthew 32876 0.0163 2 1.6
## 3 1995 M Christopher 32673 0.0162 3 1.6
## 4 1995 M Jacob 31129 0.0155 4 1.5
## 5 1995 M Joshua 30717 0.0153 5 1.5
## 6 1995 M Nicholas 29155 0.0145 6 1.4
## 7 1995 M Tyler 29154 0.0145 7 1.4
## 8 1995 F Jessica 27935 0.0145 8 1.5
## 9 1995 M Brandon 26904 0.0134 9 1.3
## 10 1995 F Ashley 26602 0.0138 10 1.4
Here are the top 10 names for 1995, with boys’ and girls’ names combined. Although my first name isn’t on the list, my last name (Brandon) is, which I find very interesting.
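# Look up Connor in 1995. Note: row_number() here gives each row's position
# in the 1995 rows (girls first, then boys), not a popularity rank.
babynames %>%
  filter(year == 1995) %>%
  mutate(rank = row_number(),
         percent = round(prop * 100, 1)) %>%
  filter(name == "Connor")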
## # A tibble: 2 x 7
## year sex name n prop rank percent
## <dbl> <chr> <chr> <int> <dbl> <int> <dbl>
## 1 1995 F Connor 123 0.0000640 1339 0
## 2 1995 M Connor 6616 0.00329 15808 0.3
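Here is my name in 1995, the year I was born: 6,616 boys (0.3% of boys) and 123 girls (nearly 0% of girls) were given it. The rank column is not a popularity rank here, because row_number() only records each row’s position in the 1995 rows, where all girls’ names come before the boys’ names. With 6,616 boys, Connor was in fact among the 100 most common boys’ names that year, so it was fairly popular for boys and rare for girls.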
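Because row_number() only numbers rows in their existing order, a popularity rank needs an explicit ranking step within each sex. The sketch below shows one possible way to do that; the helper name rank_within_sex is hypothetical, and the demo rows are copied from the 1995 top-10 table above so the block runs without the babynames package.

```r
library(dplyr)

# Hypothetical helper (not part of babynames or dplyr):
# rank names by count separately for each sex, 1 = most common.
rank_within_sex <- function(data) {
  data %>%
    group_by(sex) %>%
    mutate(rank = min_rank(desc(n))) %>%
    ungroup()
}

# Demo rows copied from the 1995 top-10 table above
top1995 <- data.frame(
  sex  = c("M", "M", "M", "M", "M", "M", "M", "F", "M", "F"),
  name = c("Michael", "Matthew", "Christopher", "Jacob", "Joshua",
           "Nicholas", "Tyler", "Jessica", "Brandon", "Ashley"),
  n    = c(41402, 32876, 32673, 31129, 30717,
           29155, 29154, 27935, 26904, 26602),
  stringsAsFactors = FALSE
)

ranked <- rank_within_sex(top1995)
print(ranked)

# With the full data, the same helper could be applied like this:
# babynames %>%
#   filter(year == 1995) %>%
#   rank_within_sex() %>%
#   filter(name == "Connor")
```

In the demo rows, Jessica and Ashley rank first and second among girls even though they sit in rows 8 and 10 of the combined table, which is the difference between a row position and a within-sex rank.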
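# Word cloud of the 100 most common boys' names in 1995
# (wordcloud2 reads the first two columns as word and frequency)
babynames %>%
  filter(year == 1995, sex == "M") %>%
  top_n(100, n) %>%
  select(name, n) %>%
  wordcloud2(size = 0.4)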
Here is a word cloud of the top 100 boys’ names in 1995. The size of a name indicates its popularity: larger names are more popular, smaller names less so. My name is in there, but it’s not very big.
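# One line per sex; without color = sex, geom_line() would join the
# boys' and girls' values into a single zigzag line
babynames %>%
  filter(name == "Connor") %>%
  ggplot(aes(x = year, y = prop, color = sex)) +
  geom_line()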
Here is the popularity of my name graphed over time for boys and girls.
# Connor over time, boys only
babynames %>%
  filter(name == "Connor", sex == "M") %>%
  ggplot(aes(x = year, y = prop)) +
  geom_line()
Here is my name graphed over time for boys only. Its popularity took off in the 1990s, peaked just after the new millennium, and has stayed relatively high since.
# Connor over time, girls only
babynames %>%
  filter(name == "Connor", sex == "F") %>%
  ggplot(aes(x = year, y = prop)) +
  geom_line()
Here is my name graphed over time for girls only. It was hardly used as a girls’ name until the 1980s, spiked (relatively speaking) in the 1990s, and has declined drastically since then.
# Ten best years for Connor by proportion
babynames %>%
  filter(name == "Connor") %>%
  top_n(10, prop) %>%
  arrange(desc(prop), n) %>%
  mutate(rank = row_number(),
         percent = round(prop * 100, 1))
## # A tibble: 10 x 7
## year sex name n prop rank percent
## <dbl> <chr> <chr> <int> <dbl> <int> <dbl>
## 1 2004 M Connor 10047 0.00476 1 0.5
## 2 2003 M Connor 9669 0.00460 2 0.5
## 3 2005 M Connor 9319 0.00438 3 0.4
## 4 2002 M Connor 8410 0.00407 4 0.4
## 5 2006 M Connor 8669 0.00396 5 0.4
## 6 2010 M Connor 8049 0.00392 6 0.4
## 7 2001 M Connor 8082 0.00391 7 0.4
## 8 1998 M Connor 7905 0.00390 8 0.4
## 9 2009 M Connor 8098 0.00382 9 0.4
## 10 2007 M Connor 8322 0.00376 10 0.4
Here is a table of the years when my name was most popular. The 2000s look like the most popular time for my name, with just over 10,000 boys named Connor in 2004, about 0.5% of all boys born that year.
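In babynames, prop is a name’s share of births of the same sex in that year, so dividing n by prop gives a rough estimate of the total number of boys behind the 2004 figure. The sketch below uses the rounded values from the table, so the result is only approximate; the variable names are illustrative.

```r
# Values copied from the 2004 row of the table above (prop is rounded)
n_connor    <- 10047
prop_connor <- 0.00476

# prop = n / total, so total = n / prop
implied_boys <- n_connor / prop_connor
percent      <- round(prop_connor * 100, 1)

cat("Implied boys in the 2004 data (approx.):", round(implied_boys), "\n")
cat("Connor's share of boys (%):", percent, "\n")
```

The implied total comes out at roughly 2.1 million, which is in line with prop being measured against boys only rather than against all children.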
# Compare the three brothers' names, boys only
babynames %>%
  filter(name %in% c("Connor", "Reece", "Benjamin"), sex == "M") %>%
  ggplot(aes(x = year, y = prop, color = name)) +
  geom_line()
I chose to make a graph comparing my name with those of my two brothers, Ben and Reece. Ben (formally Benjamin) easily has the longest history of the three names and has stayed fairly popular. Its popularity dipped during the 1900s but recovered in the 1970s. My name was not popular until the 1980s, when it really took off. Reece’s name has never been very popular (according to this dataset), but it did see a bump in the late 1990s.