Has the average age of Oscar winners changed over time?
The dataset I am working with is called “Oscars.” It comes from the Journal of Statistics Education and the data repository OpenIntro.org, and it covers the winners of the Best Actress and Best Actor categories from 1929 to 2019. One drawback of this dataset is that it contains no information about other Oscar categories. The Academy Awards, commonly known as the Oscars, are among the most prestigious awards an actor or actress can receive. There are currently 23 award categories, including Best Original Song, Best Director, Best Supporting Actress, and Best Supporting Actor.
For this analysis, I will focus on two main variables: “oscar_yr,” which represents the year the actor or actress received the award, and “age,” which indicates how old the winner was at that time. By examining these variables, I aim to determine whether there has been a noticeable trend or change in the average age of Oscar winners over the decades. This analysis could reveal whether younger or older actors have become more likely to win an Oscar, or whether there is an element of unspoken ageism within Hollywood’s award industry.
My analysis begins by checking the structure of the dataset with str() to see the class of each variable. I then use the head() function to view the first six rows and gain an initial understanding of the data. After reviewing the dataset, I concluded that no data cleaning is necessary to answer my research question. Next, I create a new variable, decade, that groups the award years into ten-year periods, then group the data by decade and calculate the average age of winners within each group. Finally, I present the results in a bar plot, which I chose because it allows easy comparison of mean ages across decades.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ readr 2.1.5
## ✔ ggplot2 3.5.1 ✔ stringr 1.5.1
## ✔ lubridate 1.9.4 ✔ tibble 3.2.1
## ✔ purrr 1.0.2 ✔ tidyr 1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
oscars <- read.csv("oscars.csv")
str(oscars)
## 'data.frame': 184 obs. of 11 variables:
## $ oscar_no : int 1 2 3 4 5 6 7 8 9 10 ...
## $ oscar_yr : int 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 ...
## $ award : chr "Best actress" "Best actress" "Best actress" "Best actress" ...
## $ name : chr "Janet Gaynor" "Mary Pickford" "Norma Shearer" "Marie Dressler" ...
## $ movie : chr "7th Heaven" "Coquette" "The Divorcee" "Min and Bill" ...
## $ age : int 22 37 28 63 32 26 31 27 26 27 ...
## $ birth_pl : chr "Pennsylvania" "Canada" "Canada" "Canada" ...
## $ birth_date: chr "1906-10-06" "1892-04-08" "1902-08-10" "1868-11-09" ...
## $ birth_mo : int 10 4 8 11 10 5 9 4 1 1 ...
## $ birth_d : int 6 8 10 9 10 12 13 5 12 12 ...
## $ birth_y : int 1906 1892 1902 1868 1900 1907 1903 1908 1910 1910 ...
head(oscars)
## oscar_no oscar_yr award name movie
## 1 1 1929 Best actress Janet Gaynor 7th Heaven
## 2 2 1930 Best actress Mary Pickford Coquette
## 3 3 1931 Best actress Norma Shearer The Divorcee
## 4 4 1932 Best actress Marie Dressler Min and Bill
## 5 5 1933 Best actress Helen Hayes The Sin of Madelon Claudet
## 6 6 1934 Best actress Katharine Hepburn Morning Glory
## age birth_pl birth_date birth_mo birth_d birth_y
## 1 22 Pennsylvania 1906-10-06 10 6 1906
## 2 37 Canada 1892-04-08 4 8 1892
## 3 28 Canada 1902-08-10 8 10 1902
## 4 63 Canada 1868-11-09 11 9 1868
## 5 32 Washington DC 1900-10-10 10 10 1900
## 6 26 Connecticut 1907-05-12 5 12 1907
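# Assign each award year to a decade; case_when() uses the first condition that matches.
# Note: the 1929 awards fall into the "1930s" group because no earlier bin is defined.
oscars_decades <- oscars |>
  mutate(decade = case_when(
    oscar_yr <= 1939 ~ "1930s",
    oscar_yr <= 1949 ~ "1940s",
    oscar_yr <= 1959 ~ "1950s",
    oscar_yr <= 1969 ~ "1960s",
    oscar_yr <= 1979 ~ "1970s",
    oscar_yr <= 1989 ~ "1980s",
    oscar_yr <= 1999 ~ "1990s",
    oscar_yr <= 2009 ~ "2000s",
    oscar_yr <= 2019 ~ "2010s",
    oscar_yr >= 2020 ~ "2020s"
  ))
# Average winner age within each decade.
oscars_decades_mean <- oscars_decades |>
  group_by(decade) |>
  summarise(mean_age = mean(age))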
colSums(is.na(oscars_decades_mean))
## decade mean_age
## 0 0
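The case_when() call above places the 1929 awards in the "1930s" group, because its first condition is oscar_yr <= 1939. As an alternative, and only as a sketch that is not the method used in this report, the decade label can be computed from the year with integer division. The helper name decade_label is hypothetical. With this approach 1929 would get its own "1920s" group, so the decade means could differ from the ones reported here.

```r
# Hypothetical helper (not part of the original analysis):
# turn an award year into a decade label such as "1930s".
decade_label <- function(year) {
  # Integer division drops the last digit: 1934 %/% 10 is 193, and 193 * 10 is 1930.
  paste0((year %/% 10) * 10, "s")
}

# Award years taken from the str() and head() output shown earlier.
decade_label(c(1929, 1930, 1934))
```

Inside the pipeline this could be used as mutate(decade = decade_label(oscar_yr)), which also avoids adding a new condition by hand for each future decade.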
plot <- ggplot(oscars_decades_mean, aes(x = decade, y = mean_age, fill = mean_age)) +
  geom_col(color = "black") +
  labs(
    title = "Average Age by Decade",
    x = "Decade",
    y = "Mean Age"
  ) +
  theme(legend.position = "none")
plot
Overall, I conclude that the average age of Oscar winners has not changed substantially across the decades, at least as measured by decade averages. The average winning age has ranged between 36.25 and 44.4 years old, suggesting that the Academy tends to recognize actors and actresses who are already well established in their careers. This stability suggests that age may not play as large a role in determining Oscar success as some might assume, although decade averages alone cannot settle that question.
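The comparison above rests on decade averages and not on a formal test. One possible follow-up, sketched here under assumptions and not something this report ran, is a simple linear regression of age on award year. The helper name age_trend is hypothetical, and the usage line assumes the oscars data frame loaded earlier with read.csv() is still in the session. A linear model is only one way to look for a trend and would miss changes that are not roughly linear.

```r
# Hypothetical helper (not part of the original analysis):
# fit age ~ oscar_yr and return the yearly slope with its p-value.
age_trend <- function(data) {
  fit <- lm(age ~ oscar_yr, data = data)
  coefs <- summary(fit)$coefficients
  c(
    slope   = coefs["oscar_yr", "Estimate"],   # change in age per year
    p_value = coefs["oscar_yr", "Pr(>|t|)"]    # test of slope = 0
  )
}

# Usage, assuming `oscars` was loaded as earlier in this report.
if (exists("oscars")) {
  print(age_trend(oscars))
}
```

A slope near zero with a large p-value would be consistent with the conclusion above, while a small p-value would suggest a trend that the decade averages smooth over.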
However, there are still several interesting directions for future research. One potential area to explore is whether certain birth months produce more Oscar winners than others, which could lead to discussions about astrology. Another is whether age trends differ between Best Actress and Best Actor winners. Expanding the analysis in these ways could provide a deeper understanding of the patterns behind one of the most prestigious awards in the film industry.
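The gender comparison suggested above could start from a grouped summary like the sketch below. It assumes the award column separates Best Actress rows from Best Actor rows; only "Best actress" values appear in the output shown in this report. The helper name mean_age_by_award is hypothetical. The small example uses the six rows from the head() output, labeled with the decade that this report's case_when() would assign.

```r
library(dplyr)

# Hypothetical helper (not part of the original analysis):
# mean winner age and number of winners per decade and award category.
mean_age_by_award <- function(data) {
  data |>
    group_by(decade, award) |>
    summarise(mean_age = mean(age), n = n(), .groups = "drop")
}

# The six rows from the head() output, all of which fall in the "1930s" bin.
first_rows <- data.frame(
  award  = "Best actress",
  age    = c(22, 37, 28, 63, 32, 26),
  decade = "1930s"
)
mean_age_by_award(first_rows)
```

Applied to the full oscars_decades data frame, the result could be plotted with one bar per award category within each decade to compare the two groups side by side.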
Journal of Statistics Education, http://jse.amstat.org/datasets/oscars.dat.txt, updated through 2019 using information from Oscars.org and Wikipedia.org.
Wikipedia Contributors. “Academy Awards.” Wikipedia, Wikimedia Foundation, 2 May 2019, en.wikipedia.org/wiki/Academy_Awards.