library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0 ✔ purrr 0.2.5
## ✔ tibble 2.0.0 ✔ dplyr 0.7.8
## ✔ tidyr 0.8.2 ✔ stringr 1.3.1
## ✔ readr 1.3.1 ✔ forcats 0.3.0
## ── Conflicts ─────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(wordcloud2)
library(babynames)
This table shows the rank of the name Autumn among girls born in 1992. As you can see, the name accounts for a small percentage of births, which means that relatively few individuals had that name.
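babynames %>% # start with the data
filter(year == 1992, sex == "F") %>% # keep girls born in 1992
mutate(rank = row_number()) %>% # rank by row position (assumes rows are already ordered by count)
mutate(percent = round(prop * 100, 1)) %>% # convert the proportion to a percentage
filter(name == "Autumn") # keep only the row for Autumn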
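The rank above depends on the rows already being ordered by count within each year and sex. A minimal sketch, assuming that ordering should not be taken for granted, computes the rank explicitly with `min_rank()`. The toy tibble and its counts are hypothetical, invented purely for illustration, and are not taken from babynames.

```r
library(dplyr)

# Hypothetical counts, invented for illustration only (not real babynames data)
toy <- tibble(
  name = c("Cara", "Anna", "Beth", "Dana"),
  n    = c(50, 200, 120, 50)
)

ranked <- toy %>%
  mutate(rank = min_rank(desc(n))) %>% # rank 1 = most common; tied counts share a rank
  arrange(rank)                        # sort so the most common name comes first

ranked %>% filter(name == "Beth")      # look up one name, as in the table above
```

The same `mutate()` line could replace `row_number()` in the chunk above if the row order of the data were ever in doubt.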
This word cloud shows the top 100 names for girls in 1992. Note that Autumn did not make the top 100, which you can cross-reference against the rank in the table above.
babynames %>%
filter(year == 1992) %>% # use only one year
filter(sex == "F") %>% # use only one sex
select(name, n) %>% # select the two relevant variables: the name and how often it occurs
top_n(100, n) %>% # use only the top names or it could get too big
wordcloud2(size = .5) # generate the word cloud at a font size of .5
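Spotting a missing name in a word cloud by eye is error-prone, so the claim about Autumn can also be checked directly. The sketch below reuses the same filters as the word cloud; the object name `top100_girls_1992` is an assumption introduced here for illustration.

```r
library(dplyr)
library(babynames)

top100_girls_1992 <- babynames %>%
  filter(year == 1992, sex == "F") %>% # same year and sex as the word cloud
  top_n(100, n)                        # same cutoff; ties at the cutoff can return more than 100 rows

# TRUE if Autumn is among the names drawn in the word cloud, FALSE otherwise
"Autumn" %in% top100_girls_1992$name
```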
This line graph shows the popularity of the name Autumn over time. The name rose in popularity from about 1975 and has remained roughly constant since the early 1990s. However, more data would be needed to determine whether its popularity will continue to rise, decline, or stay the same.
babynames %>% # start with the data
filter(name == "Autumn", sex == "F") %>% # choose the name and sex
mutate(percent = round(prop * 100, 1)) %>% # create a new variable called percent
ggplot(aes(x = year, y = percent)) + # put year on the x-axis and percent on the y-axis
geom_line(color = "purple") # make it a line graph and give the line a color
This table lists the years in which the name Autumn was most popular. Interestingly, the original birth year did not make the top 10.
babynames %>% # Start with the dataset
filter(name == "Autumn", sex == "F") %>% # only look at the name and sex you want
top_n(10, prop) %>% # keep the 10 years with the highest proportion
arrange(-prop) # sort in descending order
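Whether a particular year appears in that table can be confirmed in code rather than by scanning the rows. This is a small sketch; `birth_year` is a hypothetical variable, and setting it to 1992 assumes the birth year is the one used in the earlier chunks.

```r
library(dplyr)
library(babynames)

top_years <- babynames %>%
  filter(name == "Autumn", sex == "F") %>% # same name and sex as the table
  top_n(10, prop) %>%                      # the 10 years with the highest proportion
  arrange(desc(prop))                      # most popular year first

birth_year <- 1992 # assumption: the birth year used in the earlier chunks

# TRUE if the birth year is one of the most popular years, FALSE otherwise
birth_year %in% top_years$year
```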
This line graph compares the popularity of the name Autumn with the names of two of her peers. Emily was extremely popular through about 2000-2005, but the name started to decline after 2010. Katie and Autumn remained relatively similar in popularity, with a more definite decline for Katie after 2010.
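babynames %>% # start with the data
filter(name %in% c("Autumn", "Emily", "Katie")) %>% # keep the three names being compared
filter(year > 1992) %>% # keep only years after 1992
filter(sex == "F") %>% # use only one sex
ggplot(aes(x = year, y = n, color = name)) + # put year on the x-axis and the count (n) on y, one color per name
geom_line() # draw one line per name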
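The chart above plots raw counts (`n`), which also move with the total number of births in each year. As a hedged alternative, the sketch below plots `prop` as a percentage, so each name is shown as a share of girls born that year. The names `peer_names` and `p` are introduced here for illustration, and whether this view changes the comparison is not shown here.

```r
library(dplyr)
library(ggplot2)
library(babynames)

peer_names <- c("Autumn", "Emily", "Katie") # the three names compared above

p <- babynames %>%
  filter(name %in% peer_names, sex == "F", year > 1992) %>% # same filters as the count-based chart
  mutate(percent = prop * 100) %>%                          # share of girls born that year, unrounded
  ggplot(aes(x = year, y = percent, color = name)) +
  geom_line()

p # print the plot
```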