The Rise and Fall of the “-ie” Ending: A Study of Trends in Female Baby Names

Author

Sophie Armstrong

Female baby names ending in “-ie”

While looking for a baby-name trend to study, I became interested in the popularity of female baby names ending in “-ie”. To keep the data concise, I limited the analysis to the top 10 names, since I planned to examine these names in more detail later. My initial theory was that the top female baby names ending in “-ie” would change drastically from decade to decade, producing a new set of names in each graph. I also believed that the average length of the names would become shorter.

Running Code

To complete the project, I first loaded the necessary libraries.

library(babynames)
library(tidyverse)
library(plotly)

I began by graphing the ten most popular female baby names ending in “-ie” over time.

# Helper: return the last n characters of each string in x
substrRight <- function(x, n){
  substr(x, nchar(x)-n+1, nchar(x))
}

# Add a column holding the last two letters of each name
babynames |> 
  mutate(last_two = substrRight(name, 2)) -> babynames_two

# Ten female names ending in "ie" with the largest total counts across all years
babynames_two |> 
  filter(sex == "F", last_two == "ie") |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) -> ie

# Plot the yearly proportion (prop) of each of those ten names
babynames |> 
  filter(sex == "F" & name %in% ie$name) |> 
  ggplot(aes(year, prop, color = name)) + geom_line() -> plot1
ggplotly(plot1)
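
The suffix test above can also be written without a helper column. The following is a minimal sketch, assuming base R's `endsWith()` is an acceptable substitute for the `substrRight()` helper; the vector `toy_names` is made up for illustration and is not drawn from the dataset.

```r
# Helper from the report: last n characters of each string
substrRight <- function(x, n) {
  substr(x, nchar(x) - n + 1, nchar(x))
}

# Illustrative names only; not taken from the babynames data
toy_names <- c("Annie", "Marie", "Mary", "Ie", "Sophia", "Stephanie")

# Two ways to flag names ending in "ie" (both are case-sensitive)
by_helper   <- substrRight(toy_names, 2) == "ie"
by_endswith <- endsWith(toy_names, "ie")

toy_names[by_endswith]
```

Both approaches flag the same names here, so the choice is mostly a matter of readability.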

To explore name popularity and length further, I created a graph of the top ten female baby names ending in “-ie” for each generation, from the Lost Generation to Generation Z. I excluded Generation Alpha from this part of the analysis because the dataset does not contain the same amount of data for it as for the other generations, and including it would have made the comparison uneven.

# Lost Generation: birth years 1883 through 1900, inclusive
babynames |> 
  filter(year >= 1883 & year <= 1900) -> lostgen

options(scipen=100000)  # avoid scientific notation on the count axis
lostgen |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill = total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: the Lost Generation') +
  xlab('Name') +
  ylab('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1883 and 1900 were Annie and Marie. The other eight names (Stephanie, Natalie, Marjorie, Leslie, Julie, Jamie, Connie, and Bonnie) were considerably less popular.
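
The filter-and-total step is repeated for every generation, so it could be wrapped in one function. The sketch below is an illustration in base R rather than the code used for the charts: the function name `totals_in_window`, its parameters, and the `toy` data frame are all hypothetical, and the year bounds are treated as inclusive.

```r
# Total count per name within an inclusive range of birth years.
# `data` needs columns year, sex, name, n (the layout babynames uses).
totals_in_window <- function(data, keep_names, from, to) {
  keep <- data$sex == "F" & data$name %in% keep_names &
    data$year >= from & data$year <= to
  window <- data[keep, ]
  totals <- aggregate(n ~ name, data = window, FUN = sum)
  names(totals)[names(totals) == "n"] <- "total"
  totals[order(-totals$total), ]
}

# Made-up rows for illustration; not real counts
toy <- data.frame(
  year = c(1883, 1890, 1900, 1901, 1890, 1900, 1890),
  sex  = c("F", "F", "F", "F", "F", "F", "M"),
  name = c("Annie", "Annie", "Annie", "Annie", "Marie", "Marie", "Leslie"),
  n    = c(10, 20, 30, 40, 5, 15, 7)
)

# Rows from 1883 and 1900 are kept; the 1901 row and the male row are dropped
lost <- totals_in_window(toy, c("Annie", "Marie", "Leslie"), 1883, 1900)
lost
```

With the real data, a call such as `totals_in_window(babynames, ie$name, 1883, 1900)` should give the totals for one generation, which can then be passed to `ggplot()` as in the charts.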

# Greatest Generation: birth years 1901 through 1927, inclusive
babynames |> 
  filter(year >= 1901 & year <= 1927) -> greatest

options(scipen=100000)
greatest |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill = total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: the Greatest Generation') +
  xlab('Name') +
  ylab ('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1901 and 1927 were Marjorie, Marie, and Annie. The other seven names (Stephanie, Natalie, Leslie, Julie, Jamie, Connie, and Bonnie) were considerably less popular.

# Silent Generation: birth years 1928 through 1945, inclusive
babynames |> 
  filter(year >= 1928 & year <= 1945) -> silent

options(scipen=100000)
silent |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill=total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: the Silent Generation') +
  xlab('Name') +
  ylab ('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1928 and 1945 were Annie, Bonnie, Marie, and Marjorie. The other six names (Stephanie, Natalie, Leslie, Julie, Jamie, and Connie) were considerably less popular.

# Baby Boomers: birth years 1946 through 1964, inclusive
babynames |> 
  filter(year >= 1946 & year <= 1964) -> babyboomers

options(scipen=100000)
babyboomers |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill=total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: the Baby Boomer Generation') +
  xlab('Name') +
  ylab ('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1946 and 1964 were Stephanie, Marie, Leslie, Julie, Connie, and Bonnie. The other four names (Natalie, Marjorie, Jamie, and Annie) were considerably less popular.

# Generation X: birth years 1965 through 1980, inclusive
babynames |> 
  filter(year >= 1965 & year <= 1980) -> genX

options(scipen=100000)
genX |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill=total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: Generation X') +
  xlab('Name') +
  ylab ('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1965 and 1980 were Jamie, Julie, Natalie, and Stephanie. The other six names (Marjorie, Marie, Leslie, Connie, Bonnie, and Annie) were considerably less popular.

# Generation Y: birth years 1981 through 1996, inclusive
babynames |> 
  filter(year >= 1981 & year <= 1996) -> genY

options(scipen=100000)
genY |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill=total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: Generation Y') +
  xlab('Name') +
  ylab ('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1981 and 1996 were Jamie, Julie, Natalie, and Stephanie. The other six names were Marjorie, Marie, Leslie, Connie, Bonnie, and Annie.

# Generation Z: birth years 1997 through 2012, inclusive
babynames |> 
  filter(year >= 1997 & year <= 2012) -> genZ

options(scipen=100000)
genZ |> 
  filter(sex=="F" & name %in% ie$name) |> 
  group_by(name) |> 
  summarize(total = sum(n)) |> 
  arrange(desc(total)) |> 
  head(10) |> 
  ggplot(aes(name, total, fill=total)) + geom_col() +
  coord_flip()+
  ggtitle('-ie: Generation Z') +
  xlab('Name') +
  ylab ('Total')

The graph above shows that the most popular female baby names ending in “-ie” between 1997 and 2012 were Stephanie and Natalie. The other eight names were Marjorie, Marie, Leslie, Julie, Jamie, Connie, Bonnie, and Annie.

After compiling my data on the most popular female baby names ending in “-ie” in each generation, I was very surprised to find that the same ten names appeared every time: Stephanie, Natalie, Marjorie, Marie, Leslie, Julie, Jamie, Connie, Bonnie, and Annie. I was expecting at least a few, if not most, of the names to change across the generations. This was not the case.
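
One point worth checking is that every generation chart filters on `name %in% ie$name`, so each chart can only contain the overall top ten names. To test whether the set of names really stays the same, the ranking could be computed separately within each generation. The following is a minimal sketch in base R under that assumption; the function `top_ie_by_generation`, the break points, and the `toy` rows are hypothetical and only illustrate the idea.

```r
# Inclusive generation boundaries as stated in the text
gen_breaks <- c(1882, 1900, 1927, 1945, 1964, 1980, 1996, 2012)
gen_labels <- c("Lost", "Greatest", "Silent", "Baby Boomers",
                "Gen X", "Gen Y", "Gen Z")

# Top k female "-ie" names within each generation, ranked independently.
# `data` needs columns year, sex, name, n (the layout babynames uses).
top_ie_by_generation <- function(data, k = 10) {
  is_ie <- endsWith(as.character(data$name), "ie")
  girls <- data[data$sex == "F" & is_ie, ]
  # cut() uses right-closed intervals: (1882, 1900] covers 1883 through 1900
  girls$generation <- cut(girls$year, breaks = gen_breaks, labels = gen_labels)
  girls <- girls[!is.na(girls$generation), ]
  totals <- aggregate(n ~ generation + name, data = girls, FUN = sum)
  # Sort by generation, then by descending total, and keep the first k of each
  totals <- totals[order(totals$generation, -totals$n), ]
  do.call(rbind, lapply(split(totals, totals$generation), head, k))
}

# Made-up rows for illustration; not real counts
toy <- data.frame(
  year = c(1890, 1890, 1890, 1950, 1950, 1950, 1950),
  sex  = c("F", "F", "F", "F", "F", "F", "M"),
  name = c("Annie", "Minnie", "Mary", "Julie", "Connie", "Annie", "Jamie"),
  n    = c(50, 40, 90, 70, 60, 10, 99),
  stringsAsFactors = FALSE
)

top2 <- top_ie_by_generation(toy, k = 2)
top2
```

If the per-generation lists produced this way differ from one another, the stability seen in the charts would be a result of the filter rather than of the data.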

To break down the average length, I calculated the average length of the names whose total surpassed 50,000 within a generation. A total above this mark means that more than 50,000 people in that generation were given the name, making it more popular than the other names included.
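
The calculation described above can be sketched as a small function. This is an illustration under stated assumptions, not the code used to produce the values plotted in this report: the function name `mean_length_over` and the `toy_totals` data are hypothetical, and "surpassed" is read as strictly greater than the threshold.

```r
# Mean number of letters among names whose total exceeds a threshold.
# `totals` needs columns name and total; the 50,000 default mirrors the text.
mean_length_over <- function(totals, threshold = 50000) {
  popular <- totals[totals$total > threshold, ]
  mean(nchar(as.character(popular$name)))
}

# Made-up totals for illustration; not real counts
toy_totals <- data.frame(
  name  = c("Annie", "Stephanie", "Marjorie", "Julie"),
  total = c(80000, 60000, 20000, 55000)
)

# Marjorie falls below the threshold, so the mean covers Annie, Stephanie, Julie
avg_len <- mean_length_over(toy_totals)
avg_len
```

Applying a function like this to each generation's totals would give one average per generation and avoid entering the values by hand.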

library(ggplot2)

# Seven average values, spaced evenly across the years covered by the data
data <- data.frame(
  year = seq(1880, 2017, length.out = 7),
  value = c(5.5, 6.3, 6.25, 6.1, 6.25, 6.75, 8.5)
)

# linewidth replaces size for lines as of ggplot2 3.4.0
ggplot(data, aes(x = year, y = value)) +
  geom_line(color = "blue", linewidth = 1) +
  geom_point(color = "red", size = 3) +
  labs(title = "Average Letter Length of Female Baby Names Ending in -ie",
       x = "Year",
       y = "Name Letter Length Average")

The graph above shows an increase over time in the average letter length of female baby names ending in “-ie”.

After collecting my data and analyzing my results, it is clear that my initial hypothesis was incorrect. The most popular female baby names ending in “-ie” remained consistent across the generations studied; what changed was the relative popularity of these ten names, not the set of names itself. The average name length also rose rather than fell. I found these results interesting and surprising. I also recognize that this project involved limited research, which may have introduced errors or inaccuracies.