For my project, I decided to analyze the colors most commonly used as baby names. I wanted to know which color is the most common baby name and what trends and patterns appear within this group of names. I hypothesize that, of the boy names I chose, Gray will be the most popular and that, of the girl names I chose, Jade will be the most popular. I also expect a recent spike in color names, since unique and unusual names have become more popular over the last several years.
First, I'll load the necessary packages.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(babynames)
Observation 1
Analyzing trends of colors used as baby names for males
`geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
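The code behind this plot is not shown in the report. As a rough sketch only, the following assumes the plot used the same ten male names listed in Observation 3 and a single smoothed trend pooled across all of them. The `method` and `formula` arguments are set explicitly to match the message above, and the object names `male_color_names`, `male_colors`, and `male_trend_plot` are illustrative.

```r
library(dplyr)
library(ggplot2)
library(babynames)

# The ten male color names used in the male plots of this report.
male_color_names <- c("Gray", "Jett", "Rory", "Steel", "Sterling",
                      "Slate", "Rusty", "Jade", "Auburn", "Clay")

# Keep only male records for those names.
male_colors <- babynames |>
  filter(name %in% male_color_names, sex == "M")

# Assumption: points colored by name, plus one smoother pooled over all names.
# method = "gam" relies on the mgcv package, which ships with standard R installs.
male_trend_plot <- ggplot(male_colors, aes(year, n)) +
  geom_point(aes(color = name), alpha = 0.5) +
  geom_smooth(method = "gam", formula = y ~ s(x, bs = "cs")) +
  ggtitle("Color names for male babies over time") +
  xlab("Year") +
  ylab("Count")

male_trend_plot
```

Because the color mapping is set only inside `geom_point()`, the smoother sees all names as one group and draws a single overall trend line.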
There was a much more obvious increase in color names for females in the past few decades. A large number of girls were also given color names in the early 1900s, and this trend is now increasing again.
Summary of Observation 1 & 2
Across these two graphs of male and female baby names, I found that I was correct about a recent spike in color names around the year 2000. I was incorrect about Jade being the most popular female name: the most popular name recently is Violet, and the most popular name in the past was Ruby. For male names, the color trend has been more consistent but has also increased in the past few decades. The most popular boy name has most recently been Jade, but in the past it was Clay.
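Reading the leading name off a line chart can be checked with a small table. The sketch below, which is only one possible way to do it, finds the top male color name in each year and then counts how many years each name spent in first place. The names `top_male_by_year` and `years_on_top` are illustrative. Only the male list is used because the female name list does not appear in this report.

```r
library(dplyr)
library(babynames)

# The ten male color names used in the male plots of this report.
male_color_names <- c("Gray", "Jett", "Rory", "Steel", "Sterling",
                      "Slate", "Rusty", "Jade", "Auburn", "Clay")

# For each year, keep the single male color name with the highest count.
top_male_by_year <- babynames |>
  filter(name %in% male_color_names, sex == "M") |>
  group_by(year) |>
  slice_max(n, n = 1, with_ties = FALSE) |>
  ungroup() |>
  select(year, name, n)

# Most recent years first, to see which name leads lately.
top_male_by_year |>
  arrange(desc(year)) |>
  head(10)

# Number of years each name was the top male color name.
top_male_by_year |>
  count(name, name = "years_on_top", sort = TRUE)
```

The same pipeline would work for the female names by swapping in that list and filtering on `sex == "F"`.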
Observation 3
To get a different perspective on the trends in these baby names, I changed the graphs to be based on the proportion (prop) rather than the count (n). This shows trends in each name's share of births in a given year rather than its raw count.
Male names
babynames |>
  filter(name %in% c("Gray", "Jett", "Rory", "Steel", "Sterling", "Slate", "Rusty", "Jade", "Auburn", "Clay")) |>
  filter(sex == "M") |>
  mutate(percent = prop * 100) |>
  ggplot(aes(year, percent, color = name)) +
  geom_line() +
  ggtitle('Popularity of colors for male baby names') +
  xlab('Year') +
  ylab('Percentage')
Observation 4
babynames |>
  filter(name %in% c("Gray", "Jett", "Rory", "Steel", "Sterling", "Slate", "Rusty", "Jade", "Auburn", "Clay")) |>
  filter(sex == "M") |>
  # geom_col() stacks every year's count, so order the bars by the same total;
  # reorder() would otherwise order by the mean count per year
  ggplot(aes(reorder(name, n, FUN = sum), n, fill = name)) +
  geom_col() +
  coord_flip()
The most common color used for male names is Clay. I ordered the bars and flipped the chart so that the names run from most popular to least popular.
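Since `geom_col()` stacks one segment per year, the length of each bar is that name's total count across all years. As a cross-check on the chart, the totals can be computed directly. This is a minimal sketch, and the names `male_totals` and `total` are illustrative.

```r
library(dplyr)
library(babynames)

# The ten male color names used in the male plots of this report.
male_color_names <- c("Gray", "Jett", "Rory", "Steel", "Sterling",
                      "Slate", "Rusty", "Jade", "Auburn", "Clay")

# Sum the yearly counts for each name and sort from largest to smallest.
male_totals <- babynames |>
  filter(name %in% male_color_names, sex == "M") |>
  count(name, wt = n, name = "total", sort = TRUE)

male_totals
```

The first row of `male_totals` should correspond to the longest bar in the chart.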
Observation 5
One name, Jade, was used for both males and females. I am going to look at the difference in trends between the male and female usage of Jade.
babynames |>
  filter(name %in% c("Jade")) |>
  ggplot(aes(year, n, color = name)) +
  geom_line() +
  facet_wrap(~sex)
I found that the name Jade is far more common for females than for males. The two counts used to be more similar; however, there was a large and rapid increase in the female usage of this name. I will now analyze the earlier years of this name, before the increase, to see how they compared.
In this graph it is evident that the name Jade used to be more similar in count across the two sexes.
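The code for this early-years graph is not shown. One way it could be produced is sketched below. The cutoff year is not stated in the report, so `cutoff_year` is a placeholder value chosen purely for illustration, and `jade_early` is an illustrative name.

```r
library(dplyr)
library(ggplot2)
library(babynames)

# Hypothetical cutoff: the report does not say which years were kept.
cutoff_year <- 1985

# Keep only records for Jade before the cutoff.
jade_early <- babynames |>
  filter(name == "Jade", year < cutoff_year)

# One line per sex on shared axes, so the two counts can be compared directly.
ggplot(jade_early, aes(year, n, color = sex)) +
  geom_line() +
  xlab("Year") +
  ylab("Count")
```

Putting both sexes on one panel, instead of faceting, makes it easier to see how close the two lines are before the female count takes off.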
Conclusion
These visualizations support my hypothesis that the use of colors as baby names has increased in the past few decades. I was incorrect about which color name was the most common for males and for females. One interesting piece of information I found in these visualizations was the heavy use of colors for baby names around the early 1900s. Since the early 1900s and the early 2000s both show an increase in color names, this pattern could be related to the popular 100-year baby name theory. This theory states that baby names follow a 100-year cycle: names that were popular 100 years ago become popular again. The trends and patterns in this project are consistent with this theory. In conclusion, there has been an increase in the use of colors for baby names in the past few decades, and a possible reason for this is the 100-year baby name theory.
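One way to look at the 100-year pattern numerically is to compare the combined share of color names decade by decade. The sketch below is an illustration rather than part of the original analysis. It uses only the male name list, because the female list does not appear in this report, and the names `male_by_decade`, `share`, and `mean_percent` are illustrative.

```r
library(dplyr)
library(babynames)

# The ten male color names used in the male plots of this report.
male_color_names <- c("Gray", "Jett", "Rory", "Steel", "Sterling",
                      "Slate", "Rusty", "Jade", "Auburn", "Clay")

male_by_decade <- babynames |>
  filter(name %in% male_color_names, sex == "M") |>
  # Combined share of male births going to these names in each year.
  group_by(year) |>
  summarise(share = sum(prop), .groups = "drop") |>
  # Average that yearly share within each decade, expressed as a percentage.
  mutate(decade = floor(year / 10) * 10) |>
  group_by(decade) |>
  summarise(mean_percent = mean(share) * 100, .groups = "drop")

male_by_decade
```

Comparing the rows for the 1900s and 1910s with those for the 2000s and 2010s shows whether the two periods really stand out from the decades in between.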