The top 5 female baby names of the first decade of the 20th century each had their proportion cut by at least half by the first decade of the 21st century.
In order to test my hypothesis, I am looking at the top 5 female names in the first decade of the 20th century. I am going to take those names and see what their proportions are in the first decade of the 21st century. There has been debate about when a decade begins and ends. However, according to the Farmers' Almanac (https://www.farmersalmanac.com/new-decade-2020-or-2021-100900), a decade begins with a year ending in 1 and ends with a year ending in 0. I will use this interpretation of a decade for my study, so the two decades compared are 1901 to 1910 and 2001 to 2010.
Baby names that were popular in the early 20th century, such as Rose and Dorothy, do not seem to be popular at all in the early 21st century. There is research indicating that extremely popular baby names tend to eventually fade away (https://www.webmd.com/baby/news/20090504/trendy-baby-names-tend-to-fade-fast). The question that arises is how much these extremely popular baby names fade. There are likely other female baby names in the top 10 of the first decade of the twentieth century whose proportions were also cut by at least half by the first decade of the twenty-first century. However, I am going to look at just the top 5 female baby names to get a good idea of how much these names have dwindled a century later.
I am not including male names in my study since boys’ names that were popular in the first decade of the twentieth century have not seen a huge drop in popularity in the twenty-first century like girls’ names have. If I included the top five male names in my study, it would weaken my study. One reason why the popularity of certain boys’ names might not have changed as much as the popularity of certain girls’ names is that males often have the same names as their fathers and grandfathers.
I wrote code below to produce results that would show what the top five female names were in the first decade of the twentieth century. I used the filter function to only show totals for female names in the first decade of the twentieth century. I used the sum function for the variable n in order to calculate the number of babies who had a top five female name in this decade.
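# Load the data set and the tidyverse functions (filter, summarize, ggplot) used throughout
library(babynames)
library(tidyverse)
topgirls1<-babynames %>%
# keep female names from 1901 through 1910
filter(sex =="F" & year %in% 1901:1910) %>%
group_by(name) %>%
# total babies per name over the decade
summarize(total =sum(n)) %>%
arrange(desc(total)) %>%
head(5) %>%
ggplot(aes(reorder(name, -total), total)) + geom_col()
topgirls1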
According to the bar chart, the top five names for female babies in the U.S. from 1901 to 1910 were Mary, Helen, Margaret, Anna, and Ruth. According to the baby names data set, about 165,000 female babies were named Mary in this decade, making it the top name. Ruth, the fifth most common name, was given to about 50,000 female babies.
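The totals above are read off the bar chart, so they are approximate. A minimal sketch of how the exact counts could be checked, assuming the babynames and dplyr packages are installed; the name `top5_1900s` is introduced here only for illustration:

```r
library(babynames)
library(dplyr)

# Same filtering and totals as the bar chart, but kept as a table
# so the counts can be read directly instead of estimated from bar heights.
top5_1900s <- babynames %>%
  filter(sex == "F", year %in% 1901:1910) %>%
  group_by(name) %>%
  summarize(total = sum(n)) %>%
  arrange(desc(total)) %>%
  head(5)

print(top5_1900s)
```

Printing the table alongside the chart makes it easier to quote totals without rounding by eye.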
I wrote code below to generate a line graph showing the proportions for the top five female baby names. The proportion for each year is the share of female babies born that year who were given the name. The filter function was used to limit the line graph to the female baby names Mary, Helen, Margaret, Anna, and Ruth.
common<-babynames %>%
filter(name %in% c("Mary","Helen", "Margaret", "Anna", "Ruth") & sex =="F" & year %in% 1901:1910)
ggplot(common,aes(year, prop, colour=name)) + geom_line()
According to the line graph, Mary was by far the most popular female name in the first decade of the twentieth century, with a proportion of about 0.05. In other words, according to the baby names data set, about five percent of female babies in the first decade of the twentieth century were named Mary. The other top five names, Helen, Margaret, Anna, and Ruth, had proportions of less than 0.03, which indicates that each name was given to less than 3% of female babies in this decade.
I wrote code below to generate a bar chart to show the top five female baby names in the first decade of the twenty-first century. I used the filter function to only show totals for female names in the first decade of the twenty-first century. I used the sum function in order to calculate the totals for each top five female name in this decade.
topgirls2<-babynames %>%
# %in% keeps every row whose year falls in 2001-2010; == would recycle the range and silently drop most rows
filter(sex =="F" & year %in% 2001:2010) %>%
group_by(name) %>%
summarize(total =sum(n)) %>%
arrange(desc(total)) %>%
head(5) %>%
ggplot(aes(reorder(name, -total), total)) + geom_col()
topgirls2
According to the bar chart, none of the top five names from the first decade of the twentieth century appears among the top five names for the first decade of the twenty-first century. I was not expecting any of the names to be the same, since factors such as current celebrity names and culture can influence what names parents give their children. Over the full decade from 2001 to 2010, the chart is led by Emily, with Emma also among the top five.
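A year filter like the one above depends on how the condition is written, because `==` and `%in%` behave differently when the right-hand side is a range. A small sketch in base R, using made-up vectors (`years` and `target` are illustrative names, not part of the data set):

```r
# Made-up year column: two rows for each of three years
years  <- c(2001, 2001, 2002, 2002, 2003, 2003)
target <- 2001:2003

# == compares position by position, recycling target as 2001, 2002, 2003, 2001, ...
eq_match <- years == target

# %in% asks, for each row, whether its year appears anywhere in target
in_match <- years %in% target

print(eq_match)
print(in_match)
```

Here every row belongs to the target range, yet `==` flags only the rows that happen to line up with the recycled sequence. In a filter, that would shrink the totals and could change which names rank highest, so `%in%` is the safer choice when matching against several years.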
I wrote code below to generate a line graph showing what the proportions were in the first decade of the twenty-first century for the top five female names from the first decade of the twentieth century. The filter function was used to limit the line graph to female babies born in the first decade of the twenty-first century who were named Mary, Helen, Margaret, Anna, or Ruth.
common2<-babynames %>%
filter(name %in% c("Mary", "Helen", "Margaret", "Anna", "Ruth") & sex =="F" & year %in% 2001:2010)
ggplot(common2,aes(year, prop, colour=name)) + geom_line()
According to the line graph, all of the top five names from the first decade of the twentieth century had their proportions cut by at least half in the first decade of the twenty-first century. As you can see, the female baby name Margaret had a proportion of about 0.02 in the first decade of the twentieth century, whereas in the first decade of the twenty-first century its proportion was around 0.001. This is a big change between the first decades of each century because the share of female babies named Margaret decreased from about 2% to about 0.1%.
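The comparison between the two line graphs can also be made numerically rather than by eye. The sketch below is one possible way to do it, not part of the original analysis: it assumes the babynames, dplyr, and tidyr packages, averages each name's yearly proportion within each decade (a simple unweighted mean, which is an assumption), and then takes the ratio. The names `top_1900s`, `decade_change`, `ratio`, and `cut_by_half` are introduced here for illustration.

```r
library(babynames)
library(dplyr)
library(tidyr)

top_1900s <- c("Mary", "Helen", "Margaret", "Anna", "Ruth")

decade_change <- babynames %>%
  filter(sex == "F", name %in% top_1900s,
         year %in% c(1901:1910, 2001:2010)) %>%
  # label each row with the decade it belongs to
  mutate(decade = if_else(year <= 1910, "prop_1900s", "prop_2000s")) %>%
  group_by(name, decade) %>%
  # unweighted mean of the yearly proportions within the decade
  summarize(mean_prop = mean(prop), .groups = "drop") %>%
  # one row per name, one column per decade
  pivot_wider(names_from = decade, values_from = mean_prop) %>%
  mutate(ratio = prop_2000s / prop_1900s,
         cut_by_half = ratio <= 0.5)

print(decade_change)
```

A ratio of 0.5 or less means the name's average proportion was cut by at least half, which is the threshold stated in the hypothesis. Weighting each year by its number of births would be a reasonable refinement.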
My hypothesis was supported by my analysis of the top five female names from the first decade of the twentieth century. The line graphs show that each of these names had its proportion cut by at least half between the first decade of the twentieth century and the first decade of the twenty-first century. I think I obtained these results because factors such as current celebrity names and diverse cultures can influence what names parents give their children. In the first decade of the twenty-first century the U.S. population was a lot more ethnically diverse than in the first decade of the twentieth century, so we could probably expect lower proportions for the top names. For example, Mary, the top name in the first decade of the twentieth century, accounted for about 5% of female babies, a share that no single female name comes close to in the first decade of the twenty-first century. My research could be furthered by seeing whether these top five names remained extremely popular female baby names over the course of a generation.