This report visualizes the popularity of first names appearing in the Revised English Bible compared to the first names of newborns in America from 1910 to 2014. According to Pew Research, the percentage of Americans who consider themselves Christian has decreased by 12 percentage points, reaching an all-time low of 65%. This report hypothesizes that the proportion of biblical names will decrease over time, reflecting the changing religious landscape.
This report focuses only on names found in the Revised English Bible. No other religious texts are examined, and popular alternative spellings of biblical names are not included.
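To make the spelling restriction concrete, here is a minimal sketch of exact name matching. The names below are illustrative only and are not drawn from either dataset; the variables `bible_names`, `baby_names`, and `is_match` are hypothetical.

```r
# Illustrative names only; these are not taken from either dataset.
bible_names <- c("John", "Sarah", "Isaiah")
baby_names  <- c("John", "Jon", "Sara", "Sarah", "Isaiah", "Izaiah")

# %in% performs exact, case-sensitive matching, so variant spellings
# such as "Jon" or "Sara" are not flagged as biblical.
is_match <- baby_names %in% bible_names

data.frame(baby_names, is_match)
```

Under this kind of matching, only spellings that appear verbatim in the Bible list are counted.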
library(tidyverse)
library(ggthemes)
library(rvest)
library(dplyr)
library(scales)
This report uses two primary datasets: first names of babies born in the United States, split by state, and first names appearing in the Bible.
read.csv('StateNames.csv') -> StateNames
read_csv('https://raw.githubusercontent.com/robertrouse/theographic-bible-metadata/master/CSV/People.csv') -> BibleNames
The source of the Bible names dataset includes many extra columns for things such as surnames, birth year, and gender. However, for this report we are only interested in the column containing the names of biblical figures. A few names are repeated within the Bible, so I extracted the unique names into a vector of their own. I then added a column to the state names dataset holding a "True" or "False" value stating whether a given baby name is also a Bible name. Finally, I computed the number of babies born per year by state and joined that onto the state names table.
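#Keep one copy of each name that appears in the Bible dataset
BibleNames$name -> List
unique(List) -> UniqueBibleNames
#Flag each baby name as "True" or "False" depending on whether it is a Bible name
StateNames <- StateNames %>%
  mutate(is_biblename = case_when(Name %in% UniqueBibleNames ~ "True"))
StateNames$is_biblename[is.na(StateNames$is_biblename)] <- "False"
#Total babies per state and year
StateNames %>%
  group_by(State, Year) %>%
  summarize(Population = sum(Count)) -> b_names2
#Attach the state-year total to every name row
StateNames %>%
  left_join(b_names2, by = c("Year", "State")) -> FinalSet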
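The flag-and-total steps can also be tried on a tiny made-up table. This is only a sketch under stated assumptions: `toy_names`, `toy_bible`, and `toy_final` are hypothetical stand-ins with invented counts, and the grouped `mutate()` is an alternative to the summarize-then-join approach, not the method used in this report.

```r
library(dplyr)

# Toy stand-ins for the state names table and the Bible name list
# (hypothetical values, not real data).
toy_names <- tibble(
  State = c("AL", "AL", "AL", "GA", "GA"),
  Year  = c(1950, 1950, 1951, 1950, 1950),
  Name  = c("John", "Linda", "John", "Mary", "Gary"),
  Count = c(30, 20, 25, 40, 10)
)
toy_bible <- c("John", "Mary")

toy_final <- toy_names %>%
  # Flag each row as "True" or "False" in one step.
  mutate(is_biblename = if_else(Name %in% toy_bible, "True", "False")) %>%
  group_by(State, Year) %>%
  # A grouped mutate attaches the state-year total to every row without a join.
  mutate(Population = sum(Count)) %>%
  ungroup()

toy_final
```

Either route should leave one row per name, state, and year, with the state-year total repeated on each row.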
Annotations have been added for context. There is no proven correlation between these events and the given data.
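Because the event labels sit at fixed positions rather than coming from the data, one option for drawing them is `annotate()`, which places each mark once instead of once per data row. The sketch below uses a made-up trend line; `toy_trend`, its values, and the label coordinates are hypothetical.

```r
library(ggplot2)

# Made-up trend line, used only to give the annotations something to sit on.
toy_trend <- data.frame(
  Year       = 1910:1920,
  Percentage = seq(0.30, 0.20, length.out = 11)
)

p <- ggplot(toy_trend, aes(Year, Percentage)) +
  geom_line() +
  # A pointer line from the label position to the year being marked.
  annotate("segment", x = 1917, y = 0.32, xend = 1914, yend = 0.26) +
  # The label itself, placed just above the start of the pointer.
  annotate("text", x = 1917, y = 0.325, label = "Start of WW1", size = 4)

p
```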
#Plot
FinalSet %>%
  filter(is_biblename == "True") %>%
  group_by(Year) %>%
  #Population repeats the state-year total on every name row, so sum(Population) adds each total once per biblical-name row
  summarize(Percentage = sum(Count)/sum(Population)*100) %>%
  arrange(desc(Year)) %>%
  ggplot(aes(Year, Percentage))+geom_line()+labs(title = "The Average Proportion of Biblical Names in America 1911-2014", x = "Year", y = "Proportion") -> AvgUSProp
#Labels
#annotate() draws each segment and label once, rather than once per row of the plotted data
AvgUSProp+
  annotate("segment", x = 1917, y = 0.35, xend = 1914, yend = 0.287)+
  annotate("text", x = 1917, y = 0.354, label = "Start of WW1", size = 4)+
  annotate("segment", x = 1920, y = 0.2, xend = 1923, yend = 0.244)+
  annotate("text", x = 1920, y = 0.196, label = "Tell It From Calvary airs on NBC radio")+
  annotate("segment", x = 1931, y = 0.3, xend = 1929, yend = 0.238)+
  annotate("text", x = 1931, y = 0.304, label = "Start of the Great Depression", size = 4)+
  annotate("segment", x = 1939, y = 0.23, xend = 1939, yend = 0.264)+
  annotate("text", x = 1939, y = 0.224, label = "Start of WW2")+
  annotate("segment", x = 1965, y = 0.25, xend = 1965, yend = 0.201)+
  annotate("text", x = 1965, y = 0.256, label = "US troops deployed in Vietnam")+
  annotate("segment", x = 1986, y = 0.275, xend = 1981, yend = 0.19)+
  annotate("text", x = 1986, y = 0.281, label = "Ronald Reagan takes office")+
  annotate("segment", x = 1995, y = 0.23, xend = 1999, yend = 0.15)+
  annotate("text", x = 1995, y = 0.236, label = "Columbine Massacre")+
  annotate("segment", x = 2003, y = 0.17, xend = 2001, yend = 0.143)+
  annotate("text", x = 2003, y = 0.176, label = "9/11")+
  annotate("segment", x = 2010, y = 0.15, xend = 2005, yend = 0.128)+
  annotate("text", x = 2010, y = 0.156, label = "Hurricane Katrina")+
  theme_fivethirtyeight()+
  scale_color_fivethirtyeight() -> AnnotatedAvgUSProp
AnnotatedAvgUSProp
This graph represents the average proportion of babies born with a biblical name. As the graph shows, the proportion of biblical given names has steadily decreased over time. However, I want to drill down into one of the most religious regions of the country, the Bible Belt, to see whether the trend differs there.
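One way to read "proportion of babies born with a biblical name" is biblical births divided by all births in a given year. The sketch below shows that reading on made-up counts; the table `toy` and the column `Share` are hypothetical, and this is not necessarily the calculation behind the plotted line.

```r
library(dplyr)

# Hypothetical counts; each row is one name in one state and year.
toy <- tibble(
  Year         = c(2000, 2000, 2000, 2000, 2001, 2001),
  State        = c("AL", "AL", "GA", "GA", "AL", "AL"),
  Name         = c("John", "Linda", "Mary", "Gary", "John", "Linda"),
  Count        = c(30, 70, 20, 80, 10, 90),
  is_biblename = c("True", "False", "True", "False", "True", "False")
)

yearly_share <- toy %>%
  group_by(Year) %>%
  summarize(
    # Numerator: births with a flagged name; denominator: all births that year.
    Share = sum(Count[is_biblename == "True"]) / sum(Count),
    .groups = "drop"
  )

yearly_share
```

Dividing by the sum of all counts in the year uses each birth exactly once in the denominator, whichever state it was recorded in.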
#Filter bible belt states
FinalSet %>%
filter(State %in% c("AL", "AR", "LA", "MS", "TN", "KY", "NC", "SC", "GA")) ->BibleBeltNames
#Plot
BibleBeltNames %>%
  filter(is_biblename == "True") %>%
  group_by(State, Year) %>%
  #Population is constant within each state-year group, so take it once to get one row per group
  summarize(Percentage = sum(Count)/first(Population), .groups = "drop") %>%
  arrange(desc(Year)) %>%
  ggplot(aes(Year, Percentage, color = State))+geom_line()+theme_fivethirtyeight()+
  labs(title = "Proportion of Biblical Names in the Bible Belt 1911-2014", x = "Year", y = "Proportion") -> SmallStateProp
SmallStateProp
I was intrigued by the dip in proportion from around 1955 to 1975, so I drilled down further and added annotations to give this trend some historical context.
BibleBeltNames %>%
  #Refer to the Year column directly inside filter() rather than through BibleBeltNames$Year
  filter(is_biblename == "True", Year >= 1950, Year <= 1975) %>%
  group_by(Year) %>%
  #Regional biblical births divided by each row's state total, averaged over rows and scaled down by 10
  summarize(Percentage = mean(sum(Count)/Population)/10) %>%
  arrange(desc(Year)) %>%
  ggplot(aes(Year, Percentage))+geom_line()+theme_fivethirtyeight()+
  labs(title = "Proportion of Biblical Names in the Bible Belt 1950-1975", x = "Year", y = "Proportion") -> ZoomedAvgBBProp
#annotate() draws each segment and label once, rather than once per row of the plotted data
ZoomedAvgBBProp+
  annotate("segment", x = 1954, y = 0.161, xend = 1953, yend = 0.174)+
  annotate("text", x = 1953, y = 0.1756, label = "Brown v. Board of Education")+
  annotate("segment", x = 1955, y = 0.1585, xend = 1953, yend = 0.15)+
  annotate("text", x = 1953, y = 0.149, label = "Emmett Till murdered")+
  annotate("segment", x = 1956, y = 0.152, xend = 1958, yend = 0.163)+
  annotate("text", x = 1958, y = 0.1645, label = "Elvis' debut album")+
  annotate("segment", x = 1957, y = 0.1485, xend = 1955, yend = 0.138)+
  annotate("text", x = 1955, y = 0.1375, label = "Civil Rights Act of 1957")+
  annotate("segment", x = 1960, y = 0.1368, xend = 1961, yend = 0.154)+
  annotate("text", x = 1961, y = 0.155, label = "Birth control pill approved by FDA")+
  annotate("segment", x = 1963, y = 0.135, xend = 1965, yend = 0.145)+
  annotate("text", x = 1965, y = 0.146, label = "I Have a Dream")+
  annotate("segment", x = 1969, y = 0.1453, xend = 1969, yend = 0.155)+
  annotate("text", x = 1969, y = 0.156, label = "Apollo 11 lands") -> AnnotatedZoomedAvgBBProp
AnnotatedZoomedAvgBBProp
BibleBeltNames %>%
  #Keep only names flagged as biblical so the ranking matches the title
  filter(is_biblename == "True", Gender == "M") %>%
  group_by(Name, Gender) %>%
  summarise(total = sum(Count), .groups = "drop") %>%
  arrange(desc(total)) %>%
  head(10) %>%
  ggplot(aes(total, reorder(Name, total), fill = Gender))+geom_col()+labs(title = "Most Popular Male Biblical Names in the Bible Belt", x = "Total", y = "Name") -> MaleTotal
MaleTotal
BibleBeltNames %>%
  #Keep only names flagged as biblical so the ranking matches the title
  filter(is_biblename == "True", Gender == "F") %>%
  group_by(Name, Gender) %>%
  summarise(total = sum(Count), .groups = "drop") %>%
  arrange(desc(total)) %>%
  head(10) %>%
  ggplot(aes(total, reorder(Name, total), fill = Gender))+geom_col()+labs(title = "Most Popular Female Biblical Names in the Bible Belt", x = "Total", y = "Name") -> FemaleTotal
FemaleTotal