FiveThirtyEight published an interesting article about Super Bowl commercials a couple of years ago, entitled “According to Super Bowl Ads, Americans Love America, Animals and Sex”. They researched more than 200 commercials from the 10 brands that advertised the most between 2000 and 2020 (according to superbowl-ads.com). Then they used seven boolean variables, each answering a yes/no question about the ad, to categorize all those ads. In the dataset these appear as the columns funny, show_product_quickly, patriotic, celebrity, danger, animals, and use_sex.
Their insights focused mainly on specific ads that answered yes to more than one of the above questions, and the writers were particularly tickled by combinations they considered bizarre (like ads that simultaneously tried to be funny, included animals, and used sex to sell their products).
The Super Bowl ads data is loaded below, and a brief preview is displayed.
my_url <- "https://raw.githubusercontent.com/geedoubledee/superbowl-ads/main/superbowl-ads.csv"
superbowl_ads_df <- read.csv(file=my_url, header=TRUE, stringsAsFactors=FALSE)
tbl_df <- tibble::as_tibble(superbowl_ads_df)
tbl_df
## # A tibble: 244 × 11
## year brand superb…¹ youtu…² funny show_…³ patri…⁴ celeb…⁵ danger animals
## <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 2018 Toyota https:/… https:… False False False False False False
## 2 2020 Bud Light https:/… https:… True True False True True False
## 3 2006 Bud Light https:/… https:… True False False False True True
## 4 2018 Hynudai https:/… https:… False True False False False False
## 5 2003 Bud Light https:/… https:… True True False False True True
## 6 2020 Toyota https:/… https:… True True False True True True
## 7 2020 Coca-Cola https:/… https:… True False False True False True
## 8 2020 Kia https:/… https:… False False False True False False
## 9 2020 Hynudai https:/… https:… True True False True False True
## 10 2020 Budweiser https:/… https:… False True True True True False
## # … with 234 more rows, 1 more variable: use_sex <chr>, and abbreviated
## # variable names ¹superbowl_ads_dot_com_url, ²youtube_url,
## # ³show_product_quickly, ⁴patriotic, ⁵celebrity
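The preview shows that the boolean columns are read in as character strings ("True"/"False") rather than as logical values. A minimal sketch of one way to convert them is below. It uses made-up toy rows and only two of the seven flag columns, so the data frame `toy_ads` and the vector `flag_cols` are illustrative names, not part of the original analysis.

```r
# Toy data for illustration only; not rows from the real dataset
toy_ads <- data.frame(
  brand = c("Brand A", "Brand B"),
  funny = c("True", "False"),
  use_sex = c("False", "True"),
  stringsAsFactors = FALSE
)

# Columns holding "True"/"False" strings (assumed subset of the seven flags)
flag_cols <- c("funny", "use_sex")

# Compare each flag column to "True" so it becomes a logical vector
toy_ads[flag_cols] <- lapply(toy_ads[flag_cols], function(x) x == "True")

str(toy_ads)
```

With logical columns, `sum()` counts the yes answers and `mean()` gives the share of yes answers directly.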
The packages required for the analysis are loaded below.
library(knitr)
library(magrittr)
library(plyr)
library(dplyr)
library(ggplot2)
Since I’m primarily interested in analyzing ads based on the seventh question (“Did the ad use sex to sell its product?”), I’ve eliminated the other boolean variable columns from the dataset. I’ve also categorized the brands according to the industry to which they belong.
superbowl_ads_df_new <- superbowl_ads_df
superbowl_ads_df_new$brand[superbowl_ads_df_new$brand == "Hynudai"] <- "Hyundai" #fixing a spelling error in the df
superbowl_ads_df_new %<>%
select(year, brand, superbowl_ads_dot_com_url, youtube_url, use_sex)
alcohol <- superbowl_ads_df_new
alcohol %<>%
filter(brand %in% c("Bud Light", "Budweiser")) %>%
mutate(industry="alcohol")
soda <- superbowl_ads_df_new
soda %<>%
filter(brand %in% c("Coca-Cola", "Pepsi")) %>%
mutate(industry="soda")
vehicles <- superbowl_ads_df_new
vehicles %<>%
filter(brand %in% c("Toyota", "Hyundai", "Kia")) %>%
mutate(industry="vehicles")
sports <- superbowl_ads_df_new
sports %<>%
filter(brand=="NFL") %>%
mutate(industry="sports")
snacks <- superbowl_ads_df_new
snacks %<>%
filter(brand=="Doritos") %>%
mutate(industry="snacks")
banking <- superbowl_ads_df_new
banking %<>%
filter(brand=="E-Trade") %>%
mutate(industry="banking")
superbowl_ads_df_new <- rbind(alcohol, soda, vehicles, sports, snacks, banking)
tbl_df <- tibble::as_tibble(superbowl_ads_df_new)
tbl_df
## # A tibble: 244 × 6
## year brand superbowl_ads_dot_com_url youtu…¹ use_sex indus…²
## <int> <chr> <chr> <chr> <chr> <chr>
## 1 2020 Bud Light https://superbowl-ads.com/2020-bud-l… https:… False alcohol
## 2 2006 Bud Light https://superbowl-ads.com/2006-bud-l… https:… False alcohol
## 3 2003 Bud Light https://superbowl-ads.com/2003-bud-l… https:… True alcohol
## 4 2020 Budweiser https://superbowl-ads.com/2020-budwe… https:… False alcohol
## 5 2010 Bud Light https://superbowl-ads.com/hd-exclusi… https:… True alcohol
## 6 2007 Budweiser https://superbowl-ads.com/2007-budwe… https:… True alcohol
## 7 2002 Budweiser https://superbowl-ads.com/2002-budwe… https:… False alcohol
## 8 2005 Bud Light https://superbowl-ads.com/2005-bud-l… https:… True alcohol
## 9 2004 Bud Light https://superbowl-ads.com/2004-bud-l… https:… True alcohol
## 10 2007 Bud Light https://superbowl-ads.com/2007-bud-l… https:… False alcohol
## # … with 234 more rows, and abbreviated variable names ¹youtube_url, ²industry
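The same brand-to-industry mapping could also be written in a single pass. The sketch below is an alternative, not the approach used above, and it assumes `dplyr::case_when()` is acceptable for the job. The rows in `toy_ads` are made up for illustration.

```r
library(dplyr)

# Toy rows for illustration only; the real data has 244 ads
toy_ads <- data.frame(
  year = c(2020, 2018, 2019, 2017),
  brand = c("Bud Light", "Kia", "Pepsi", "Doritos"),
  stringsAsFactors = FALSE
)

# Assign an industry to every row based on its brand
toy_ads <- toy_ads %>%
  mutate(industry = case_when(
    brand %in% c("Bud Light", "Budweiser") ~ "alcohol",
    brand %in% c("Coca-Cola", "Pepsi") ~ "soda",
    brand %in% c("Toyota", "Hyundai", "Kia") ~ "vehicles",
    brand == "NFL" ~ "sports",
    brand == "Doritos" ~ "snacks",
    brand == "E-Trade" ~ "banking",
    TRUE ~ NA_character_  # brands outside the mapping stay unlabeled
  ))

toy_ads
```

Unlike filtering and row-binding, this keeps the original row order, and a brand missing from the mapping is kept with an NA industry instead of being silently dropped.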
Looking at the data by brand, the two things I find most interesting are both beverage-related.
First, it appears that Bud Light uses sex to sell its product in a larger percentage of its Super Bowl ads than Budweiser does. Since both brands are owned by the same company, we might ask: does the company believe sex is a more useful marketing tool among Bud Light drinkers than Budweiser drinkers?
Second, it appears that Pepsi uses sex to sell its product in a larger percentage of its Super Bowl ads than Coca-Cola does. Since these products are owned by different companies competing for (at least some of) the same market of soda drinkers, we might ask: do these companies base their advertising decisions on more than whether they think sex is a useful advertising tool, and if so, what are those other considerations?
use_sex_false <- superbowl_ads_df_new
use_sex_false %<>%
filter(use_sex=="False") %>%
mutate(use_sex=0)
use_sex_true <- superbowl_ads_df_new
use_sex_true %<>%
filter(use_sex=="True") %>%
mutate(use_sex=1)
superbowl_ads_df_new <- rbind(use_sex_false, use_sex_true)
brand_summary <- superbowl_ads_df_new
brand_summary %<>%
group_by(brand) %>%
summarize(ads_using_sex=sum(use_sex),
total_ads=n(),
average_ads_using_sex=mean(use_sex)) %>%
arrange(desc(average_ads_using_sex))
p1 <- ggplot(brand_summary,
aes(x = reorder(brand, -average_ads_using_sex),
y = average_ads_using_sex,
fill = brand)) +
geom_bar(stat="identity") +
labs(x = "Brand", y = "Proportion of Ads Using Sex", title = "A Summary of Brands Using Sex in Their Super Bowl Ads") +
ylim(0, 1) +
scale_fill_manual(values=c("pink", "plum", "pink1", "plum1", "pink2", "plum2", "pink3", "plum3", "pink4", "plum4"))
p1
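To make the summary arithmetic concrete, here is a small worked sketch on made-up data (the brands and counts in `toy_ads` are invented, not real results). It recodes `use_sex` to 0/1 in one step, which is an alternative to splitting and row-binding, and then computes the same three summary columns.

```r
library(dplyr)

# Toy data (made up for illustration; not the real ad counts)
toy_ads <- data.frame(
  brand = c("Brand A", "Brand A", "Brand A", "Brand B", "Brand B"),
  use_sex = c("True", "False", "True", "False", "False"),
  stringsAsFactors = FALSE
)

toy_summary <- toy_ads %>%
  # "True" becomes 1 and anything else becomes 0
  mutate(use_sex = as.integer(use_sex == "True")) %>%
  group_by(brand) %>%
  summarize(ads_using_sex = sum(use_sex),          # count of ads using sex
            total_ads = n(),                       # all ads for the brand
            average_ads_using_sex = mean(use_sex)) %>%  # share between 0 and 1
  arrange(desc(average_ads_using_sex))

toy_summary
```

Here Brand A has 2 of 3 ads flagged and Brand B has 0 of 2, so the mean of the 0/1 column is the proportion of ads using sex for each brand.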
Looking at the data by industry is interesting, but a few facts make meaningful analysis difficult. Only 10 brands were researched, so some industries (e.g. snacks) are represented by only one brand, whereas others (e.g. vehicles) are represented by three. Also, an industry such as alcohol might be represented by two brands, but both brands are actually owned by the same company.
industry_summary <- superbowl_ads_df_new
industry_summary %<>%
group_by(industry) %>%
summarize(ads_using_sex=sum(use_sex),
total_ads=n(),
average_ads_using_sex=mean(use_sex)) %>%
arrange(desc(average_ads_using_sex))
p2 <- ggplot(industry_summary,
aes(x = reorder(industry, -average_ads_using_sex),
y = average_ads_using_sex,
fill = industry)) +
geom_bar(stat="identity") +
labs(x = "Industry", y = "Proportion of Ads Using Sex", title = "A Summary of Industries Using Sex in Their Super Bowl Ads") +
ylim(0, 1) +
scale_fill_manual(values=c("pink", "plum", "pink1", "plum1", "pink2", "plum2"))
p2
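The uneven representation of industries can be checked directly. The sketch below rebuilds the brand-to-industry mapping used earlier as a small lookup table (`brand_industry` is an illustrative name) and counts the distinct brands in each industry.

```r
library(dplyr)

# Brand-to-industry mapping, as assigned earlier in this analysis
brand_industry <- data.frame(
  brand = c("Bud Light", "Budweiser", "Coca-Cola", "Pepsi", "Toyota",
            "Hyundai", "Kia", "NFL", "Doritos", "E-Trade"),
  industry = c("alcohol", "alcohol", "soda", "soda", "vehicles",
               "vehicles", "vehicles", "sports", "snacks", "banking"),
  stringsAsFactors = FALSE
)

# Count how many distinct brands stand in for each industry
brands_per_industry <- brand_industry %>%
  group_by(industry) %>%
  summarize(n_brands = n_distinct(brand)) %>%
  arrange(desc(n_brands))

brands_per_industry
```

Industries with a single brand (sports, snacks, banking) are really brand-level results under another name, which is worth keeping in mind when reading the industry chart.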
It would be great to look beyond these 10 brands so that a broader industry analysis of Super Bowl advertisers becomes possible. It would also be important to record the parent company of each brand as the research expands.
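One way to record parent companies would be a small lookup table joined onto the ads by brand. The sketch below is only an illustration: `toy_ads` is made-up data, `brand_parents` is a hypothetical table, and the parent company shown is my own entry for the two beer brands, which should be verified before use.

```r
library(dplyr)

# Toy data for illustration only; not rows from the real dataset
toy_ads <- data.frame(
  brand = c("Bud Light", "Budweiser", "Doritos"),
  use_sex = c(1, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table; entries should be verified before use
brand_parents <- data.frame(
  brand = c("Bud Light", "Budweiser"),
  parent_company = c("Anheuser-Busch InBev", "Anheuser-Busch InBev"),
  stringsAsFactors = FALSE
)

# Keep every ad; brands missing from the lookup get NA as parent company
ads_with_parent <- toy_ads %>%
  left_join(brand_parents, by = "brand")

ads_with_parent
```

Using `left_join()` keeps ads whose brand has no entry yet, so gaps in the lookup show up as NA rather than as missing rows.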