1 1) Setup

1.1 Install + load packages

# install.packages("dplyr")
# install.packages("readr")
# install.packages("lubridate")
# install.packages("knitr")
# install.packages("kableExtra")

library(dplyr)
library(readr)
library(lubridate)
library(knitr)
library(kableExtra)

2 2) Import the data

# If the CSV is in the same folder as this Rmd, this works as-is:
superbowl <- read.csv("superbowl.csv", fileEncoding = "UTF-8-BOM")

# Quick peek
head(superbowl) %>% kable(caption = "First few rows of the data")
First few rows of the data
id time new_brand week_of adspend month year volume pos neg mixed superbowl
1 1 Beetle 2-Jan-12 2.7 1 2012 2661 533 62 2066 0
1 2 Beetle 9-Jan-12 4.0 1 2012 3620 677 87 2856 0
1 3 Beetle 16-Jan-12 1.3 1 2012 4138 753 98 3287 0
1 4 Beetle 23-Jan-12 0.7 1 2012 3255 674 99 2482 0
1 5 Beetle 30-Jan-12 7105.0 1 2012 5144 1199 172 3773 1
1 6 Beetle 6-Feb-12 2576.2 2 2012 8021 2150 435 5436 1

2.0.1 What the columns mean (from the assignment)

  • new_brand: Brand name (Beetle, CR-Z, Camaro)
  • week_of: Week start date as text (we’ll convert to Date)
  • adspend: Weekly ad spend in thousands of dollars
  • volume: Total number of mentions
  • pos, neg, mixed: Positive, negative, mixed mentions
  • superbowl: 0 for weeks before the Super Bowl; 1 for the week of or after (Super Bowl week was Jan 30, 2012).

2.0.2 Convert types + basic checks

superbowl2 <- superbowl %>% 
  mutate(
    week_of = dmy(week_of),           
    new_brand = as.factor(new_brand),
    superbowl = as.integer(superbowl) 
  )

glimpse(superbowl2)
## Rows: 45
## Columns: 12
## $ id        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, …
## $ time      <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 1, 2, 3, …
## $ new_brand <fct> Beetle, Beetle, Beetle, Beetle, Beetle, Beetle, Beetle, Beet…
## $ week_of   <date> 2012-01-02, 2012-01-09, 2012-01-16, 2012-01-23, 2012-01-30,…
## $ adspend   <dbl> 2.7, 4.0, 1.3, 0.7, 7105.0, 2576.2, 41.2, 42.8, 19.8, 0.3, 1…
## $ month     <int> 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 1, 1, 1, 1, 1, …
## $ year      <int> 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, …
## $ volume    <int> 2661, 3620, 4138, 3255, 5144, 8021, 3728, 4396, 4297, 4326, …
## $ pos       <int> 533, 677, 753, 674, 1199, 2150, 881, 1021, 865, 895, 873, 95…
## $ neg       <int> 62, 87, 98, 99, 172, 435, 145, 172, 158, 148, 113, 135, 109,…
## $ mixed     <int> 2066, 2856, 3287, 2482, 3773, 5436, 2702, 3203, 3274, 3283, …
## $ superbowl <int> 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, …
summary(superbowl2$week_of)
##         Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
## "2012-01-02" "2012-01-23" "2012-02-20" "2012-02-20" "2012-03-19" "2012-04-09"

3 3) DPLYR Basics (guided)

3.0.1 Q1. Keep only the columns we care about (brand, week, adspend, buzz & sentiments).

basics_q1 <- superbowl2 %>% 
  select(new_brand, week_of, adspend, volume, pos, neg, mixed, superbowl)

basics_q1 %>% head(10) %>% kable(caption = "Selected columns (first 10 rows)")
Selected columns (first 10 rows)
new_brand week_of adspend volume pos neg mixed superbowl
Beetle 2012-01-02 2.7 2661 533 62 2066 0
Beetle 2012-01-09 4.0 3620 677 87 2856 0
Beetle 2012-01-16 1.3 4138 753 98 3287 0
Beetle 2012-01-23 0.7 3255 674 99 2482 0
Beetle 2012-01-30 7105.0 5144 1199 172 3773 1
Beetle 2012-02-06 2576.2 8021 2150 435 5436 1
Beetle 2012-02-13 41.2 3728 881 145 2702 1
Beetle 2012-02-20 42.8 4396 1021 172 3203 1
Beetle 2012-02-27 19.8 4297 865 158 3274 1
Beetle 2012-03-05 0.3 4326 895 148 3283 1

3.0.2 Q2. Filter the rows for one brand (e.g., Beetle) and arrange by week.

basics_q2 <- superbowl2 %>% 
  filter(new_brand == "Beetle") %>% 
  arrange(week_of)

basics_q2 %>% kable(caption = "Beetle only, ordered by week")
Beetle only, ordered by week
id time new_brand week_of adspend month year volume pos neg mixed superbowl
1 1 Beetle 2012-01-02 2.7 1 2012 2661 533 62 2066 0
1 2 Beetle 2012-01-09 4.0 1 2012 3620 677 87 2856 0
1 3 Beetle 2012-01-16 1.3 1 2012 4138 753 98 3287 0
1 4 Beetle 2012-01-23 0.7 1 2012 3255 674 99 2482 0
1 5 Beetle 2012-01-30 7105.0 1 2012 5144 1199 172 3773 1
1 6 Beetle 2012-02-06 2576.2 2 2012 8021 2150 435 5436 1
1 7 Beetle 2012-02-13 41.2 2 2012 3728 881 145 2702 1
1 8 Beetle 2012-02-20 42.8 2 2012 4396 1021 172 3203 1
1 9 Beetle 2012-02-27 19.8 2 2012 4297 865 158 3274 1
1 10 Beetle 2012-03-05 0.3 3 2012 4326 895 148 3283 1
1 11 Beetle 2012-03-12 1.4 3 2012 3870 873 113 2884 1
1 12 Beetle 2012-03-19 2.5 3 2012 3840 956 135 2749 1
1 13 Beetle 2012-03-26 4.3 3 2012 4240 878 109 3253 1
1 14 Beetle 2012-04-02 2.3 4 2012 3760 589 84 3087 1
1 15 Beetle 2012-04-09 0.9 4 2012 1885 311 44 1530 1

3.0.3 Q3. Create sentiment columns

Different classes define sentiment differently. We’ll calculate two common ones so you can use whichever your instructor expects:

  • Total sentiment mentions = pos + neg (ignores mixed)
  • Net sentiment = pos - neg (positive minus negative)
basics_q3 <- superbowl2 %>% 
  mutate(
    sentiment_total = pos + neg,
    net_sentiment   = pos - neg,
    pos_share   = round(100 * pos   / volume, 1),
    neg_share   = round(100 * neg   / volume, 1),
    mixed_share = round(100 * mixed / volume, 1)
  ) %>% 
  select(new_brand, week_of, volume, pos, neg, mixed, sentiment_total, net_sentiment, pos_share, neg_share, mixed_share)

basics_q3 %>% head(12) %>% 
  kable(caption = "Added sentiment columns + shares (first 12 rows)")
Added sentiment columns + shares (first 12 rows)
new_brand week_of volume pos neg mixed sentiment_total net_sentiment pos_share neg_share mixed_share
Beetle 2012-01-02 2661 533 62 2066 595 471 20.0 2.3 77.6
Beetle 2012-01-09 3620 677 87 2856 764 590 18.7 2.4 78.9
Beetle 2012-01-16 4138 753 98 3287 851 655 18.2 2.4 79.4
Beetle 2012-01-23 3255 674 99 2482 773 575 20.7 3.0 76.3
Beetle 2012-01-30 5144 1199 172 3773 1371 1027 23.3 3.3 73.3
Beetle 2012-02-06 8021 2150 435 5436 2585 1715 26.8 5.4 67.8
Beetle 2012-02-13 3728 881 145 2702 1026 736 23.6 3.9 72.5
Beetle 2012-02-20 4396 1021 172 3203 1193 849 23.2 3.9 72.9
Beetle 2012-02-27 4297 865 158 3274 1023 707 20.1 3.7 76.2
Beetle 2012-03-05 4326 895 148 3283 1043 747 20.7 3.4 75.9
Beetle 2012-03-12 3870 873 113 2884 986 760 22.6 2.9 74.5
Beetle 2012-03-19 3840 956 135 2749 1091 821 24.9 3.5 71.6

4 4) Did the Super Bowl increase buzz? (Core question)

We’ll compare weekly buzz (volume) before vs. after the Super Bowl. The indicator superbowl is already coded as 0 (before) and 1 (week of & after).

4.0.1 A) Brand‑level summary before vs. after

brand_effect <- superbowl2 %>% 
  group_by(new_brand, superbowl) %>% 
  summarise(
    n_weeks = n(),
    mean_volume = mean(volume, na.rm = TRUE),
    sd_volume   = sd(volume, na.rm = TRUE),
    mean_adspend = mean(adspend, na.rm = TRUE),
    .groups = "drop"
  ) %>% 
  arrange(new_brand, superbowl)

brand_effect %>% 
  mutate(period = ifelse(superbowl == 0, "Before SB", "SB week & After")) %>% 
  select(new_brand, period, n_weeks, mean_volume, sd_volume, mean_adspend) %>% 
  kable(caption = "Mean weekly buzz and ad spend — before vs. after the Super Bowl") %>% 
  kable_styling(full_width = FALSE)
Mean weekly buzz and ad spend — before vs. after the Super Bowl
new_brand period n_weeks mean_volume sd_volume mean_adspend
Beetle Before SB 4 3418.500 621.5092 2.1750000
Beetle SB week & After 11 4318.818 1463.3964 890.6090909
Camaro Before SB 4 87595.500 7817.3691 71.1000000
Camaro SB week & After 11 93613.273 22222.1534 517.5636364
CR-Z Before SB 4 1171.500 265.9994 1.2500000
CR-Z SB week & After 11 1486.182 711.5349 0.7090909

4.0.2 B) Compute the lift in average buzz after the Super Bowl (per brand)

library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

buzz_lift <- superbowl %>%
  group_by(new_brand, superbowl) %>%
  summarise(avg_volume = mean(volume, na.rm = TRUE), .groups = "drop") %>%
  tidyr::pivot_wider(
    names_from  = superbowl,
    values_from = avg_volume,
    names_prefix = "SB_"
  ) %>%
  mutate(
    lift     = SB_1 - SB_0,
    lift_pct = (SB_1 - SB_0) / SB_0 * 100
  )

knitr::kable(buzz_lift, digits = 1,
             caption = "Lift in average buzz after the Super Bowl (per brand)") %>%
  kableExtra::kable_styling(full_width = FALSE)
Lift in average buzz after the Super Bowl (per brand)
new_brand SB_0 SB_1 lift lift_pct
Beetle 3418.5 4318.8 900.3 26.3
CR-Z 1171.5 1486.2 314.7 26.9
Camaro 87595.5 93613.3 6017.8 6.9

4.0.3 C) Overall (all brands combined) — before vs. after

overall_effect <- superbowl2 %>% 
  group_by(superbowl) %>% 
  summarise(
    n_weeks = n(),
    mean_volume = mean(volume, na.rm = TRUE),
    sd_volume   = sd(volume, na.rm = TRUE),
    mean_adspend = mean(adspend, na.rm = TRUE),
    .groups = "drop"
  )

overall_effect %>% 
  mutate(period = ifelse(superbowl == 0, "Before SB", "SB week & After")) %>% 
  select(period, n_weeks, mean_volume, sd_volume, mean_adspend) %>% 
  kable(caption = "Overall mean weekly buzz — before vs. after")
Overall mean weekly buzz — before vs. after
period n_weeks mean_volume sd_volume mean_adspend
Before SB 12 30728.50 42209.37 24.84167
SB week & After 33 33139.42 45190.84 469.62727

6 6) Bonus views

6.0.1 Weekly trend per brand (table)

trend_tbl <- superbowl2 %>% 
  arrange(new_brand, week_of) %>% 
  group_by(new_brand) %>% 
  mutate(week_index = row_number()) %>% 
  ungroup() %>% 
  select(new_brand, week_index, week_of, adspend, volume, pos, neg, mixed, superbowl)

trend_tbl %>% head(30) %>% kable(caption = "Trend preview (first 30 rows)")
Trend preview (first 30 rows)
new_brand week_index week_of adspend volume pos neg mixed superbowl
Beetle 1 2012-01-02 2.7 2661 533 62 2066 0
Beetle 2 2012-01-09 4.0 3620 677 87 2856 0
Beetle 3 2012-01-16 1.3 4138 753 98 3287 0
Beetle 4 2012-01-23 0.7 3255 674 99 2482 0
Beetle 5 2012-01-30 7105.0 5144 1199 172 3773 1
Beetle 6 2012-02-06 2576.2 8021 2150 435 5436 1
Beetle 7 2012-02-13 41.2 3728 881 145 2702 1
Beetle 8 2012-02-20 42.8 4396 1021 172 3203 1
Beetle 9 2012-02-27 19.8 4297 865 158 3274 1
Beetle 10 2012-03-05 0.3 4326 895 148 3283 1
Beetle 11 2012-03-12 1.4 3870 873 113 2884 1
Beetle 12 2012-03-19 2.5 3840 956 135 2749 1
Beetle 13 2012-03-26 4.3 4240 878 109 3253 1
Beetle 14 2012-04-02 2.3 3760 589 84 3087 1
Beetle 15 2012-04-09 0.9 1885 311 44 1530 1
Camaro 1 2012-01-02 75.3 78030 18528 4036 55466 0
Camaro 2 2012-01-09 61.0 84868 19966 4348 60554 0
Camaro 3 2012-01-16 86.9 91640 23230 4698 63712 0
Camaro 4 2012-01-23 61.2 95844 24004 4752 67088 0
Camaro 5 2012-01-30 3977.6 98910 25140 4908 68862 1
Camaro 6 2012-02-06 272.1 100034 27568 5022 67444 1
Camaro 7 2012-02-13 59.9 94030 24234 4640 65156 1
Camaro 8 2012-02-20 72.7 101664 26762 5454 69448 1
Camaro 9 2012-02-27 88.9 107356 27746 5242 74368 1
Camaro 10 2012-03-05 202.5 108438 26236 5582 76620 1
Camaro 11 2012-03-12 132.3 104014 26190 5010 72814 1
Camaro 12 2012-03-19 105.5 92904 23048 4944 64912 1
Camaro 13 2012-03-26 534.0 107044 26418 5742 74884 1
Camaro 14 2012-04-02 165.3 85274 19522 4000 61752 1
Camaro 15 2012-04-09 82.4 30078 5732 924 23422 1

7 7) Your short answers

Q: Did the Super Bowl ad create social media buzz?
Yes, the Super Bowl ad did create social media buzz overall. All three brands (Beetle, Camaro, and CR-Z) showed an increase in average buzz (measured by volume of mentions) after the Super Bowl week compared to before. The positive lift indicates that advertising during the event successfully raised the level of online conversations.

Q: Which brand saw the largest lift?
From the lift_pct column in your brand_lift table: Beetle: ~26% increase Camaro: ~7% increase CR-Z: ~27% increase The brand with the largest relative lift was CR-Z (about +27%). Beetle was very close behind, while Camaro’s increase was smaller in percentage terms

Q: Any caveats?
- Buzz ≠ sales.
- Post-SB media coverage may also spike mentions (earned media).
- This is observational; we haven’t controlled for other campaigns or seasonality.

8 8) Reproducibility

sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.6.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Chicago
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tidyr_1.3.1      kableExtra_1.4.0 knitr_1.50       lubridate_1.9.4 
## [5] readr_2.1.5      dplyr_1.1.4     
## 
## loaded via a namespace (and not attached):
##  [1] jsonlite_2.0.0     compiler_4.5.1     tidyselect_1.2.1   xml2_1.4.0        
##  [5] stringr_1.5.2      jquerylib_0.1.4    textshaping_1.0.3  systemfonts_1.2.3 
##  [9] scales_1.4.0       yaml_2.3.10        fastmap_1.2.0      R6_2.6.1          
## [13] generics_0.1.4     tibble_3.3.0       svglite_2.2.1      bslib_0.9.0       
## [17] pillar_1.11.1      RColorBrewer_1.1-3 tzdb_0.5.0         rlang_1.1.6       
## [21] cachem_1.1.0       stringi_1.8.7      xfun_0.53          sass_0.4.10       
## [25] viridisLite_0.4.2  timechange_0.3.0   cli_3.6.5          withr_3.0.2       
## [29] magrittr_2.0.4     digest_0.6.37      rstudioapi_0.17.1  hms_1.1.3         
## [33] lifecycle_1.0.4    vctrs_0.6.5        evaluate_1.0.5     glue_1.8.0        
## [37] farver_2.1.2       purrr_1.1.0        rmarkdown_2.30     tools_4.5.1       
## [41] pkgconfig_2.0.3    htmltools_0.5.8.1