Introduction

FiveThirtyEight published an interesting article about Super Bowl commercials a couple of years ago, titled “According to Super Bowl Ads, Americans Love America, Animals and Sex”. The writers reviewed over 200 commercials from the 10 brands that advertised the most between 2000 and 2020 (according to superbowl-ads.com), then categorized each ad using seven Boolean variables:

  1. Was the ad trying to be funny?
  2. Did the ad show the product right away?
  3. Was the ad patriotic?
  4. Did the ad feature a celebrity?
  5. Did the ad involve danger?
  6. Did the ad include animals?
  7. Did the ad use sex to sell its product?

Their insights focused mainly on specific ads that answered yes to more than one of the above questions, and the writers were particularly tickled by combinations they considered bizarre (like ads that were simultaneously trying to be funny, included animals, and used sex to sell their products).

Load the Super Bowl Ads Data

The Super Bowl ads data is loaded below, and a brief preview is displayed.

my_url <- "https://raw.githubusercontent.com/geedoubledee/superbowl-ads/main/superbowl-ads.csv"
superbowl_ads_df <- read.csv(file=my_url, header=TRUE, stringsAsFactors=FALSE)
tbl_df <- tibble::as_tibble(superbowl_ads_df)
tbl_df
## # A tibble: 244 × 11
##     year brand     superb…¹ youtu…² funny show_…³ patri…⁴ celeb…⁵ danger animals
##    <int> <chr>     <chr>    <chr>   <chr> <chr>   <chr>   <chr>   <chr>  <chr>  
##  1  2018 Toyota    https:/… https:… False False   False   False   False  False  
##  2  2020 Bud Light https:/… https:… True  True    False   True    True   False  
##  3  2006 Bud Light https:/… https:… True  False   False   False   True   True   
##  4  2018 Hynudai   https:/… https:… False True    False   False   False  False  
##  5  2003 Bud Light https:/… https:… True  True    False   False   True   True   
##  6  2020 Toyota    https:/… https:… True  True    False   True    True   True   
##  7  2020 Coca-Cola https:/… https:… True  False   False   True    False  True   
##  8  2020 Kia       https:/… https:… False False   False   True    False  False  
##  9  2020 Hynudai   https:/… https:… True  True    False   True    False  True   
## 10  2020 Budweiser https:/… https:… False True    True    True    True   False  
## # … with 234 more rows, 1 more variable: use_sex <chr>, and abbreviated
## #   variable names ¹​superbowl_ads_dot_com_url, ²​youtube_url,
## #   ³​show_product_quickly, ⁴​patriotic, ⁵​celebrity

Load the Required Packages

The packages required for the analysis are loaded below.

library(knitr)
library(magrittr)
library(plyr)
library(dplyr)
library(ggplot2)

Subset the Data

Since I’m primarily interested in analyzing ads based on the seventh question (“Did the ad use sex to sell its product?”), I’ve eliminated the other boolean variable columns from the dataset. I’ve also categorized the brands according to the industry to which they belong.

superbowl_ads_df_new <- superbowl_ads_df
superbowl_ads_df_new$brand[superbowl_ads_df_new$brand == "Hynudai"] <- "Hyundai" #fixing a spelling error in the df
superbowl_ads_df_new %<>%
    select(year, brand, superbowl_ads_dot_com_url, youtube_url, use_sex)

alcohol <- superbowl_ads_df_new
alcohol %<>%
    filter(brand %in% c("Bud Light", "Budweiser")) %>%
    mutate(industry="alcohol")

soda <- superbowl_ads_df_new
soda %<>%
    filter(brand %in% c("Coca-Cola", "Pepsi")) %>%
    mutate(industry="soda")

vehicles <- superbowl_ads_df_new
vehicles %<>%
    filter(brand %in% c("Toyota", "Hyundai", "Kia")) %>%
    mutate(industry="vehicles")

sports <- superbowl_ads_df_new
sports %<>%
    filter(brand=="NFL") %>%
    mutate(industry="sports")
    
snacks <- superbowl_ads_df_new
snacks %<>%
    filter(brand=="Doritos") %>%
    mutate(industry="snacks")

banking <- superbowl_ads_df_new
banking %<>%
    filter(brand=="E-Trade") %>%
    mutate(industry="banking")

superbowl_ads_df_new <- rbind(alcohol, soda, vehicles, sports, snacks, banking)

tbl_df <- tibble::as_tibble(superbowl_ads_df_new)
tbl_df
## # A tibble: 244 × 6
##     year brand     superbowl_ads_dot_com_url             youtu…¹ use_sex indus…²
##    <int> <chr>     <chr>                                 <chr>   <chr>   <chr>  
##  1  2020 Bud Light https://superbowl-ads.com/2020-bud-l… https:… False   alcohol
##  2  2006 Bud Light https://superbowl-ads.com/2006-bud-l… https:… False   alcohol
##  3  2003 Bud Light https://superbowl-ads.com/2003-bud-l… https:… True    alcohol
##  4  2020 Budweiser https://superbowl-ads.com/2020-budwe… https:… False   alcohol
##  5  2010 Bud Light https://superbowl-ads.com/hd-exclusi… https:… True    alcohol
##  6  2007 Budweiser https://superbowl-ads.com/2007-budwe… https:… True    alcohol
##  7  2002 Budweiser https://superbowl-ads.com/2002-budwe… https:… False   alcohol
##  8  2005 Bud Light https://superbowl-ads.com/2005-bud-l… https:… True    alcohol
##  9  2004 Bud Light https://superbowl-ads.com/2004-bud-l… https:… True    alcohol
## 10  2007 Bud Light https://superbowl-ads.com/2007-bud-l… https:… False   alcohol
## # … with 234 more rows, and abbreviated variable names ¹​youtube_url, ²​industry
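
The brand-to-industry assignment above can also be expressed as a single lookup rather than six filtered copies. The following is a minimal sketch in base R, assuming a small made-up data frame (`toy_ads`, hypothetical and not the real ads data); the mapping itself is copied from the subsetting code above.

```r
# Hypothetical toy data standing in for superbowl_ads_df_new (not the real ads).
toy_ads <- data.frame(
  year = c(2020L, 2018L, 2006L, 2012L),
  brand = c("Bud Light", "Toyota", "Pepsi", "Doritos"),
  stringsAsFactors = FALSE
)

# Brand-to-industry mapping, taken from the subsetting code above.
industry_lookup <- c(
  "Bud Light" = "alcohol", "Budweiser" = "alcohol",
  "Coca-Cola" = "soda", "Pepsi" = "soda",
  "Toyota" = "vehicles", "Hyundai" = "vehicles", "Kia" = "vehicles",
  "NFL" = "sports", "Doritos" = "snacks", "E-Trade" = "banking"
)

# Index the lookup by brand; unname() drops the brand names from the result.
toy_ads$industry <- unname(industry_lookup[toy_ads$brand])
toy_ads
```

Two differences from the filter-and-rbind approach are worth noting: the lookup keeps the original row order instead of grouping rows by industry, and a brand missing from the lookup gets an `NA` industry rather than being dropped.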

Exploratory Data Analysis: By Brand

Looking at the data by brand, the two things I find most interesting are both beverage-related.

First, it appears that Bud Light uses sex to sell its product in a larger percentage of its Super Bowl ads than Budweiser does. Since both brands are owned by the same company, we might ask: does the company believe sex is a more useful marketing tool among Bud Light drinkers than Budweiser drinkers?

Second, it appears that Pepsi uses sex to sell its product in a larger percentage of its Super Bowl ads than Coca-Cola does. Since these products are owned by different companies competing for (at least some of) the same market of soda-drinkers, we might ask: do these companies base their advertising decisions on more than whether they think sex is a useful advertising tool, and if so, what are those other considerations?

use_sex_false <- superbowl_ads_df_new
use_sex_false %<>%
    filter(use_sex=="False") %>%
    mutate(use_sex=0)

use_sex_true <- superbowl_ads_df_new
use_sex_true %<>%
    filter(use_sex=="True") %>%
    mutate(use_sex=1)

superbowl_ads_df_new <- rbind(use_sex_false, use_sex_true)
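
The recoding of `use_sex` from "True"/"False" strings to 1/0 can also be done with one vectorized comparison. This is a minimal sketch on a made-up character vector (`toy_flags` is hypothetical), not a replacement for the code above.

```r
# Hypothetical flags in the same "True"/"False" string format as use_sex.
toy_flags <- c("False", "True", "True", "False")

# The comparison yields TRUE/FALSE; as.integer() turns that into 1/0.
use_sex_numeric <- as.integer(toy_flags == "True")
use_sex_numeric

# The mean of a 0/1 vector is the proportion of ones.
mean(use_sex_numeric)
```

Unlike the two-filter approach, which silently drops any row whose value is neither "True" nor "False", this version keeps every row and maps any non-missing value other than "True" to 0, so it assumes the column contains only those two strings.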

brand_summary <- superbowl_ads_df_new
brand_summary %<>%
    group_by(brand) %>%
    summarize(ads_using_sex=sum(use_sex),
              total_ads=n(), # n() counts the rows (ads) in each group
              average_ads_using_sex=mean(use_sex)) %>%
    arrange(desc(average_ads_using_sex))
p1 <- ggplot(brand_summary,
             aes(x = reorder(brand, -average_ads_using_sex),
                 y = average_ads_using_sex,
                 fill = brand)) +
    geom_bar(stat="identity") +
    labs(x = "Brand", y = "Proportion of Ads Using Sex", title = "A Summary of Brands Using Sex in Their Super Bowl Ads") +
    ylim(0, 1) +
    scale_fill_manual(values=c("pink", "plum", "pink1", "plum1", "pink2", "plum2", "pink3", "plum3", "pink4", "plum4"))
p1

Exploratory Data Analysis: By Industry

Looking at the data by industry is interesting, but a few facts make meaningful analysis difficult. Only 10 brands were researched, so some industries (e.g. snacks) are represented by only one brand, whereas others (e.g. vehicles) are represented by three. Also, even when an industry is represented by two brands, as alcohol is, both brands may be owned by the same company.

industry_summary <- superbowl_ads_df_new
industry_summary %<>%
    group_by(industry) %>%
    summarize(ads_using_sex=sum(use_sex),
              total_ads=n(), # n() counts the rows (ads) in each group
              average_ads_using_sex=mean(use_sex)) %>%
    arrange(desc(average_ads_using_sex))
p2 <- ggplot(industry_summary,
             aes(x = reorder(industry, -average_ads_using_sex),
                 y = average_ads_using_sex,
                 fill = industry)) +
    geom_bar(stat="identity") +
    labs(x = "Industry", y = "Proportion of Ads Using Sex", title = "A Summary of Industries Using Sex in Their Super Bowl Ads") +
    ylim(0, 1) +
    scale_fill_manual(values=c("pink", "plum", "pink1", "plum1", "pink2", "plum2"))
p2
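
To make the uneven industry representation concrete, the number of brands per industry can be counted directly. This is a minimal base R sketch; the `brand_industry` data frame is a hypothetical helper built by hand from the brand-to-industry assignments used earlier in this analysis.

```r
# Hand-built mapping of the 10 brands to their industries (helper for illustration).
brand_industry <- data.frame(
  brand = c("Bud Light", "Budweiser", "Coca-Cola", "Pepsi", "Toyota",
            "Hyundai", "Kia", "NFL", "Doritos", "E-Trade"),
  industry = c("alcohol", "alcohol", "soda", "soda", "vehicles",
               "vehicles", "vehicles", "sports", "snacks", "banking"),
  stringsAsFactors = FALSE
)

# Split the brands by industry and count how many fall in each group.
brands_per_industry <- lengths(split(brand_industry$brand, brand_industry$industry))
sort(brands_per_industry, decreasing = TRUE)
```

Half of the six industries rest on a single brand, so their bars in the plot describe one advertiser rather than an industry.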

Conclusions

It would be great to look beyond these 10 brands so that a more meaningful industry-level analysis of Super Bowl advertisers becomes possible. As the research expands, it would also be important to record the parent company of each brand.