The Film Industry’s Effect on Names Given to Babies

12/08/2017

Project Details

Task

Create a graph showing the ups and downs in the popularity of names of interest.

How?

Raw counts alone can't measure popularity, because the number of babies born changes from year to year.
Instead, find the percentage of babies given a certain name out of all babies born in a particular year.
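
As a minimal sketch of that calculation, the helper below divides a name's count by the year's total births. The name `name_share` is hypothetical and the counts are made up for illustration; they are not taken from BabyNames.

```r
# Hypothetical helper: share of births with a given name, as a percentage.
name_share <- function(name_count, total_births) {
  100 * name_count / total_births
}

# Made-up counts: the same raw count is a smaller share when more babies are born.
share_small_year <- name_share(5000, 2000000)  # 0.25
share_large_year <- name_share(5000, 4000000)  # 0.125
```

The same 5,000 babies make up half the share in the larger year, which is why the graphs use a share of births rather than a raw count.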

Elaborate

The name Prince became more popular after Prince's album Purple Rain.
To what extent does pop culture influence how people name their children?
The film industry has a huge influence.
New task: graph one chosen name for each decade, with a vertical line marking the year its movie was released.

Analysis

Names/Movies:

50's - Peter Pan: "Wendy"

60's - Breakfast at Tiffany's: "Tiffany"

70's - Logan's Run: "Logan"

80's - 16 Candles: "Samantha"

90's - Pulp Fiction: "Mia"

Analysis

This pairs each name with the year its movie was released, so that a vertical line can be drawn at that year (the x-intercept) on each name's graph.

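# Names of interest, in decade order
movieName <- c("Wendy", "Tiffany", "Logan", "Samantha", "Mia")

# Release year of the movie tied to each name (names kept as character for joining)
ym <- data.frame(year_movie = c(1953, 1961, 1976, 1984, 1994),
                 name = movieName,
                 stringsAsFactors = FALSE)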

Analysis

To show on the graph which movie each name came from, we have to reassign the names/labels.

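# Lookup from each name to its "Movie: Name" label
movie_names <- c(
  'Wendy' = "Peter Pan: Wendy",
  'Tiffany' = "Breakfast at Tiffany's: Tiffany",
  'Logan' = "Logan's Run: Logan",
  'Samantha' = "16 Candles: Samantha",
  'Mia' = "Pulp Fiction: Mia"
)

# as_labeller() turns the lookup into a labeller function for facet_wrap();
# the older function(variable, value) labeller form is deprecated in ggplot2.
movie_labeller <- ggplot2::as_labeller(movie_names)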

Analysis

To find the percentage for the names of interest, we create some new tables that include the variables we need.

totalBabies: the total number of babies born in each year (total).
totalName: the number of babies given each name within each year (nameTotal).
PopNames: a new data table that contains name, year, year_movie, nameTotal, and total.

Analysis

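# Assumes BabyNames (columns used: name, year, count) is already loaded.
library(dplyr)

# Total babies born in each year
totalBabies <-
  BabyNames %>%
  group_by(year) %>%
  summarise(total = sum(count))

# Babies given each name in each year
totalName <-
  BabyNames %>%
  group_by(name, year) %>%
  summarise(nameTotal = sum(count))

# Attach the yearly totals, then keep only the movie names and their release years
PopNames <-
  totalName %>%
  inner_join(totalBabies, by = "year") %>%
  inner_join(ym, by = "name")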

library(ggplot2)

PopNames %>%
  filter(name %in% movieName) %>%
  # Babies given the name per 1,000 babies born that year
  mutate(namePer1000 = (nameTotal / total) * 1000) %>%
  ggplot(aes(x = year, y = namePer1000)) +
  geom_line() +
  # Vertical line at the movie's release year
  geom_vline(aes(xintercept = year_movie)) +
  # One panel per name, labelled "Movie: Name"
  facet_wrap(~ name, labeller = movie_labeller) +
  scale_x_continuous(limits = c(1940, 2010), breaks = seq(1940, 2010, 10)) +
  ggtitle("Babies per 1,000 Given Names Popularized by Movies") +
  labs(x = "Year", y = "Babies per 1,000 Born") +
  theme(text = element_text(size = 11.5), plot.title = element_text(hjust = 0.5))

Discussion

In some cases the film industry greatly impacts the names given to babies.
Because the names were selected manually, there is room for a more in-depth study.

Future Study

Let R find the popular movies and popular names instead of doing it manually.
Scrape the Internet Movie Database (IMDb) to collect popular movies from each decade and then collect the names of important characters in each movie.
Join the result to the BabyNames data table.
Create something that analyzes each of those names from each movie in each decade and decides how many of them became "popular."
Because "popular" is not a numerical variable, decide on a numerical definition of what counts as popular.
Compare each name's popularity before and after the movie's release against that definition.
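
The last two steps could be sketched as follows. This is only one possible definition, not a settled method: the function `became_popular`, the 5-year window, and the "at least double" threshold are all assumptions chosen for illustration, and the data is made up.

```r
# Hypothetical rule: a name "became popular" if its mean share of births in the
# `window` years after the release is at least `ratio` times its mean share in
# the `window` years before the release. Both defaults are arbitrary assumptions.
became_popular <- function(shares, year_movie, window = 5, ratio = 2) {
  is_before <- shares$year >= year_movie - window & shares$year < year_movie
  is_after  <- shares$year > year_movie & shares$year <= year_movie + window
  mean(shares$share[is_after]) >= ratio * mean(shares$share[is_before])
}

# Made-up yearly shares for one name, for illustration only.
toy <- data.frame(year  = 1990:2000,
                  share = c(1, 1, 1, 1, 1, 2, 3, 3, 3, 3, 3))

toy_result <- became_popular(toy, year_movie = 1995)  # TRUE: mean share goes from 1 to 3
```

A different window or ratio would classify names differently, so the choice should be stated alongside any results.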

Questions?