This report explores the relationship between typical Jewish names and U.S. baby names.

It is important to mention that having a Jewish name does not mean that you are Jewish. Conversely, not having a Jewish name does not mean that you are not Jewish. For this reason, it is important to understand that I am exploring Jewish names and not the Jewish people.

Load Required Packages

library(tidyverse)
library(rvest)
library(babynames)
library(ggthemes)
library(ggplot2)

Creating Jewish babynames datasets

Creating a dataset of typical Jewish boy names by web scraping

Here is where I got the Jewish Boy Names: Popular Jewish Baby Boy Names

Here is where I got the Jewish Girl Names: Popular Jewish Baby Girl Names

The websites listed above are not comprehensive lists of Jewish names; there are names they do not mention. It is also difficult to define a Jewish name in the first place, because you do not have to be Jewish to give your child a Jewish name. A good example is Michael, which was one of the most popular Jewish boy names in 2017. The name is prominent in Judaism because it comes from the Hebrew phrase “mi-ka’el”, which means “Who is like God?” It is also popular in other religions such as Christianity, where Michael is the name of an archangel. For reasons like this, it is very hard to identify specifically Jewish names.

boynames <- read_html("https://www.aish.com/jl/l/b/48967016.html")

boynames %>%
  html_nodes("b") %>%
  html_text() %>%
  as.data.frame() -> boynames_table

colnames(boynames_table)[1] <- "name"

boynames_table <- boynames_table %>% slice(-c(1))

# Keep only the babynames rows whose name appears on the scraped list
babynames %>%
  inner_join(boynames_table, by = "name") -> merged_boys
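
Scraped text often carries stray whitespace or repeated entries, and either one would quietly change what the join matches. The following is a minimal sketch of a cleanup step, not the method used in this report. The vector of names and the counts are made up for illustration and stand in for the scraped page and for babynames.

```r
library(dplyr)

# Hypothetical stand-in for the scraped <b> text (not real scraped output)
scraped <- data.frame(name = c(" Aaron", "Noah ", "Noah", "Levi"))

# Trim whitespace and drop repeats so each name can match at most once
clean_names <- scraped %>%
  mutate(name = trimws(name)) %>%
  distinct(name)

# Hypothetical stand-in for a few rows of babynames (made-up counts)
toy_babynames <- data.frame(
  year = c(2000, 2000, 2000),
  sex  = c("M", "M", "M"),
  name = c("Aaron", "Noah", "James"),
  n    = c(100, 200, 300)
)

# Naming the key makes the join explicit
toy_babynames %>%
  inner_join(clean_names, by = "name") -> toy_merged
```

Here toy_merged keeps the Aaron and Noah rows and drops James, which is not on the cleaned list.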

Creating a dataset of typical Jewish girl names by web scraping

This is not a comprehensive list of Jewish girl names; there are some names it does not mention.

girlnames <- read_html("https://www.aish.com/jl/l/b/48966261.html")

girlnames %>%
  html_nodes("b") %>%
  html_text() %>%
  as.data.frame() -> girlnames_table

colnames(girlnames_table)[1] <- "name"

girlnames_table <- girlnames_table %>% slice(-c(1))

# Keep only the babynames rows whose name appears on the scraped list
babynames %>%
  inner_join(girlnames_table, by = "name") -> merged_girls

Merging merged_boys and merged_girls together

merged_boys %>%
  full_join(merged_girls) -> all_jewishnames
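
Because merged_boys and merged_girls share every column, full_join() without a by argument matches on all of them. The result therefore behaves like a row-wise union, where a row that appears in both tables (for example, a name found on both lists) is kept once. A small sketch with made-up rows suggests that this agrees with stacking the tables and removing duplicates. All values below are hypothetical.

```r
library(dplyr)

# Made-up rows; "Ariel" plays the role of a name found on both lists
toy_boys <- data.frame(year = c(2000, 2000), sex = c("M", "F"),
                       name = c("Noah", "Ariel"), n = c(200, 50))
toy_girls <- data.frame(year = c(2000, 2000), sex = c("F", "F"),
                        name = c("Ariel", "Leah"), n = c(50, 80))

# Joining on every shared column keeps the duplicated Ariel row once
joined <- full_join(toy_boys, toy_girls, by = c("year", "sex", "name", "n"))

# Equivalent idea: stack both tables, then drop exact duplicate rows
stacked <- bind_rows(toy_boys, toy_girls) %>% distinct()
```

Both joined and stacked end up with three rows here, so the shared row is not counted twice in later totals.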

Proportion of Boy Babies Born with Jewish Names Each Year

# Total number of boy babies born with Jewish names each year.
# merged_boys still includes girls who were given these names, so keep only sex "M"
merged_boys %>%
  filter(sex == "M") %>% 
  group_by(year) %>% 
  summarize(year_total = sum(n)) -> merged_boys_year_total
  
  # How to find the total number of boy babies each year
babynames %>%
  filter(sex %in% "M") %>% 
  group_by(year) %>% 
  summarize(year_total = sum(n)) -> babynames_boys_year_total

  # Equation to find the percentage of boy babies born with Jewish names each year 
(merged_boys_year_total$year_total)/(babynames_boys_year_total$year_total) * 100 -> prop_jewish_boys_babynames_total

  # How to add the years back into the dataset with the percentages
as.data.frame(prop_jewish_boys_babynames_total) -> boy_total

boy_total %>%
  mutate(row = (row_number() + 1879)) -> boy_total

  # Visualization for Boys
boy_total %>% 
  ggplot(aes(row, prop_jewish_boys_babynames_total)) + 
  geom_line() +
  ggtitle("Proportion of Boy Babies Born with Jewish Names per Year") -> plot_prop_boy

plot_prop_boy +
  xlab('Year') +
  ylab("Percentage of Boy Babies Born with Jewish Names") +
  theme_linedraw()
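
The percentage above is computed by dividing two vectors position by position and then rebuilding the year as the row number plus 1879. That is only correct if both summaries cover exactly the same years, in the same order, starting in 1880. As a hedged alternative, the sketch below keeps year as a column throughout, so no offset is needed. The data frame toy is made up for illustration, and listed is a hypothetical flag marking names that appear on the scraped list.

```r
library(dplyr)

# Made-up counts; `listed` is a hypothetical flag for names on the scraped list
toy <- data.frame(
  year   = c(1880, 1880, 1881, 1881, 1881),
  sex    = "M",
  name   = c("Aaron", "John", "Aaron", "John", "Levi"),
  n      = c(10, 90, 20, 60, 20),
  listed = c(TRUE, FALSE, TRUE, FALSE, TRUE)
)

# Percentage of babies with a listed name, computed within each year
toy %>%
  filter(sex == "M") %>%
  group_by(year) %>%
  summarize(pct_listed = sum(n[listed]) / sum(n) * 100) -> pct_by_year
```

With the real data, a flag of this kind could presumably be built with something like mutate(listed = name %in% boynames_table$name) before grouping.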

Proportion of Girl Babies Born with Jewish Names Each Year

# Total number of girl babies born with Jewish names each year.
# merged_girls still includes boys who were given these names, so keep only sex "F"
merged_girls %>%
  filter(sex == "F") %>% 
  group_by(year) %>% 
  summarize(year_total = sum(n)) -> merged_girls_year_total
 
 # How to find the total number of girl babies each year
babynames %>%
  filter(sex %in% "F") %>% 
  group_by(year) %>% 
  summarize(year_total = sum(n)) -> babynames_girls_year_total

  # Equation to find the percentage of girl babies born with Jewish names each year 
(merged_girls_year_total$year_total)/(babynames_girls_year_total$year_total) * 100 -> prop_jewish_girls_babynames_total

  # How to add the years back into the dataset with the percentages
as.data.frame(prop_jewish_girls_babynames_total) -> girl_total

girl_total %>%
  mutate(row = (row_number() + 1879)) -> girl_total

  # Visualization for Girls
girl_total %>% 
  ggplot(aes(row, prop_jewish_girls_babynames_total)) + 
  geom_line() +
  ggtitle("Prop. of Girl Babies Born with Jewish Names per Year") -> plot_prop_girl

plot_prop_girl +
  xlab('Year') +
  ylab("Percentage of Girl Babies Born with Jewish Names") +
  theme_linedraw()

Proportion of Babies Born with Jewish Names Each Year

  # How to find total number of babies born with Jewish Names each year
all_jewishnames %>%
  group_by(year) %>% 
  summarize(year_total = sum(n)) -> merged_allnames_year_total

  # How to find the total number of babies each year
babynames %>%
  group_by(year) %>% 
  summarize(year_total = sum(n)) -> babynames_allnames_year_total

  # Equation to find the percentage of babies born with Jewish names each year 
(merged_allnames_year_total$year_total)/(babynames_allnames_year_total$year_total) * 100 -> prop_jewish_allnames_babynames_total

  # How to add the years back into the dataset with the percentages
as.data.frame(prop_jewish_allnames_babynames_total) -> allnames_total

allnames_total %>%
  mutate(row = (row_number() + 1879)) -> allnames_total

  # Visualization for All
allnames_total %>% 
  ggplot(aes(row, prop_jewish_allnames_babynames_total)) + 
  geom_line() +
  ggtitle("Prop. of Babies Born with Jewish Names per Year") -> plot_prop_allnames

plot_prop_allnames +
  xlab('Year') +
  ylab("Percentage of Babies Born with Jewish Names") +
  theme_linedraw()

Combining the “Boys”, “Girls”, & “All” Visualizations into one

 # Combining the datasets into one
boy_total %>%
  full_join(girl_total) -> boys_girls_total

boys_girls_total %>% 
  full_join(allnames_total) -> boys_girls_all_total

  # Creating variables for the dataset columns to shorten them
boys_girls_all_total$row -> year
prop_jewish_boys_babynames_total -> prop_jboys
prop_jewish_girls_babynames_total -> prop_jgirls
prop_jewish_allnames_babynames_total -> prop_jall

  # Creating a color palette for the legend
colors <- c("Boys" = "lightblue", "Girls" = "lightpink", "All" = "black")

  # The actual visualization
ggplot(boys_girls_all_total, aes(x=year)) +
  geom_line(aes(y=prop_jboys, color="Boys"), size = 1.0) +
  geom_line(aes(y=prop_jgirls, color="Girls"), size = 1.0) +
  geom_line(aes(y=prop_jall, color="All"), size = 1.0) +
  labs(x = "Year",
       y = "Percentage of Babies Born with Jewish Names",
       color = "Legend") +
  scale_color_manual(values = colors) -> boys_girls_all_total_plot

boys_girls_all_total_plot +
  ggtitle("Prop. of Babies Born with Jewish Names") +
  xlab("Year") +
  ylab("Percentage of Babies Born with Jewish Names") +
  theme_linedraw()
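
The combined plot above draws one geom_line() per group and relies on vectors stored outside the data frame. A common alternative, sketched here under the assumption that the percentages sit in one wide table, is to reshape to long format so that a single geom_line() maps color to a group column. The numbers in toy_wide are made up and the column names are illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up percentages in wide form: one column per group
toy_wide <- data.frame(
  year  = c(1880, 1881, 1882),
  Boys  = c(5, 6, 7),
  Girls = c(3, 3.5, 4),
  All   = c(4, 4.8, 5.5)
)

# Long form: one row per year and group
toy_wide %>%
  pivot_longer(cols = c(Boys, Girls, All),
               names_to = "group", values_to = "pct") -> toy_long

colors <- c("Boys" = "lightblue", "Girls" = "lightpink", "All" = "black")

# One geom_line() is enough once color is mapped to the group column
ggplot(toy_long, aes(x = year, y = pct, color = group)) +
  geom_line() +
  scale_color_manual(values = colors) +
  labs(x = "Year",
       y = "Percentage of Babies Born with Jewish Names",
       color = "Legend") +
  theme_linedraw() -> toy_plot
```

Keeping everything inside one data frame means the plot no longer depends on separate vectors having matching lengths and order.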

Conclusion

After going through all the visualizations and data, I found the yearly proportion of babies born with Jewish names to be the most interesting result. That proportion has fluctuated greatly over the years. Comparing boys and girls, the most noticeable difference is that before 1935 the proportion of boys with Jewish names was lower than the proportion of girls, while after 1935 the opposite was true. Jewish boy names also skyrocketed to a maximum of about 9% of baby boys in 1967, whereas Jewish girl names only reached a maximum of about 4% of baby girls in 1915. More recently, both boy and girl Jewish names appear to have declined since 2000. In the 2010s, the proportion of Jewish boy names decreased steadily, while the proportion of Jewish girl names fluctuated.
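
Peak years like the ones quoted above can be read from the yearly summary directly instead of from the plot. The sketch below shows one way to do that with slice_max() from dplyr (available from version 1.0.0). The table toy_pct is made up and does not reproduce the report's results.

```r
library(dplyr)

# Made-up yearly percentages, for illustration only
toy_pct <- data.frame(
  year = c(2001, 2002, 2003, 2004),
  pct  = c(1.0, 3.0, 2.0, 2.5)
)

# Row with the largest percentage
toy_pct %>% slice_max(pct, n = 1) -> peak
```

In this toy table the peak row is the one for 2002. Applied to a table that holds year and percentage columns for the real data, the same call would return the peak year for that group.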