R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

Use control+Enter to run the code chunks on PC. Use command+Enter to run the code chunks on MAC.

Load Packages

In this section, we install and load the necessary packages.

Import Data

In this section, we import the necessary data for this lab.

Homework Assignment 1: Super Bowl Caselet (Part 1)

You are hired as a Business Analyst to help the manager to find out if spending millions of dollars on Super Bowl ads creates social media buzz.

As a first step, carry out the analysis using DPLYR using superbowl.csv data. There are some questions below that you need to answer.

Dataset Exploration

# Use five functions dim(), colnames(), str(), head() and tail() to explore all the superbowl table
# observe different columns of the table and values in the rows
# type your answer below this line
dim(superbowl)
## [1] 45 12
colnames(superbowl)
##  [1] "id"        "time"      "new_brand" "week_of"   "adspend"   "month"    
##  [7] "year"      "volume"    "pos"       "neg"       "mixed"     "superbowl"
str(superbowl)
## 'data.frame':    45 obs. of  12 variables:
##  $ id       : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ time     : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ new_brand: chr  "Beetle" "Beetle" "Beetle" "Beetle" ...
##  $ week_of  : chr  "2-Jan-12" "9-Jan-12" "16-Jan-12" "23-Jan-12" ...
##  $ adspend  : num  2.7 4 1.3 0.7 7105 ...
##  $ month    : int  1 1 1 1 1 2 2 2 2 3 ...
##  $ year     : int  2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
##  $ volume   : int  2661 3620 4138 3255 5144 8021 3728 4396 4297 4326 ...
##  $ pos      : int  533 677 753 674 1199 2150 881 1021 865 895 ...
##  $ neg      : int  62 87 98 99 172 435 145 172 158 148 ...
##  $ mixed    : int  2066 2856 3287 2482 3773 5436 2702 3203 3274 3283 ...
##  $ superbowl: int  0 0 0 0 1 1 1 1 1 1 ...
head(superbowl)
##   id time new_brand   week_of adspend month year volume  pos neg mixed
## 1  1    1    Beetle  2-Jan-12     2.7     1 2012   2661  533  62  2066
## 2  1    2    Beetle  9-Jan-12     4.0     1 2012   3620  677  87  2856
## 3  1    3    Beetle 16-Jan-12     1.3     1 2012   4138  753  98  3287
## 4  1    4    Beetle 23-Jan-12     0.7     1 2012   3255  674  99  2482
## 5  1    5    Beetle 30-Jan-12  7105.0     1 2012   5144 1199 172  3773
## 6  1    6    Beetle  6-Feb-12  2576.2     2 2012   8021 2150 435  5436
##   superbowl
## 1         0
## 2         0
## 3         0
## 4         0
## 5         1
## 6         1
tail(superbowl)
##    id time new_brand   week_of adspend month year volume   pos  neg mixed
## 40  3   10    Camaro  5-Mar-12   202.5     3 2012 108438 26236 5582 76620
## 41  3   11    Camaro 12-Mar-12   132.3     3 2012 104014 26190 5010 72814
## 42  3   12    Camaro 19-Mar-12   105.5     3 2012  92904 23048 4944 64912
## 43  3   13    Camaro 26-Mar-12   534.0     3 2012 107044 26418 5742 74884
## 44  3   14    Camaro  2-Apr-12   165.3     4 2012  85274 19522 4000 61752
## 45  3   15    Camaro  9-Apr-12    82.4     4 2012  30078  5732  924 23422
##    superbowl
## 40         1
## 41         1
## 42         1
## 43         1
## 44         1
## 45         1
# tell R, new_brand, month and superbowl columns are factor 
superbowl$new_brand <- as.factor(superbowl$new_brand)
superbowl$month <- as.factor(superbowl$month)
superbowl$superbowl <- as.factor(superbowl$superbowl)
# tell R, week_of is date of format "%d-%b-%y"
superbowl$week_of <- as.Date(superbowl$week_of, format("%d-%b-%y"))


# you can try and experiment with different date formats here: 
# https://campus.datacamp.com/courses/intermediate-r-for-finance/dates?ex=6

Five Functions of DPLYR

# Query 1. Filter all rows where new_brand == 'Beetle'
# type your answer below this line
beetle_data <- superbowl %>% filter(new_brand == 'Beetle')


# interpret the results:
# The code displays the filtered data of the Beetle brand, revealing its adspend in the months that they had invested, making it simple for us to interrogate the data to determine our results.
################################################################################

# Query 2.Select all the columns related to buzz metrics (volume, positive and negative mentions)
# type your answer below this line
buzz_metrics <- superbowl %>% select(volume, pos, neg)


# interpret the results:
# The data provides important insights into the effectiveness of Super Bowl advertising. We investigated how social media engagement changes in response to advertising activities by using buzz indicators such as total mentions (volume), positive mentions (pos), and negative mentions (neg). Furthermore, determining the biggest ad expenditure identifies the brand that invested the most in advertising throughout various weeks. This information can assist managers in determining the relationship between ad spending and social media buzz, offering useful data for future marketing efforts.
################################################################################

# Query 3. Arrange the adspend column in descending order and see which brand has the highest ad spend and in which week_of
# type your answer below this line
 highest_adspend <- superbowl %>%
 arrange(desc(adspend))%>%
 select(new_brand, week_of, adspend)
 highest_adspend
##    new_brand    week_of adspend
## 1     Beetle 2012-01-30  7105.0
## 2     Camaro 2012-01-30  3977.6
## 3     Beetle 2012-02-06  2576.2
## 4     Camaro 2012-03-26   534.0
## 5     Camaro 2012-02-06   272.1
## 6     Camaro 2012-03-05   202.5
## 7     Camaro 2012-04-02   165.3
## 8     Camaro 2012-03-12   132.3
## 9     Camaro 2012-03-19   105.5
## 10    Camaro 2012-02-27    88.9
## 11    Camaro 2012-01-16    86.9
## 12    Camaro 2012-04-09    82.4
## 13    Camaro 2012-01-02    75.3
## 14    Camaro 2012-02-20    72.7
## 15    Camaro 2012-01-23    61.2
## 16    Camaro 2012-01-09    61.0
## 17    Camaro 2012-02-13    59.9
## 18    Beetle 2012-02-20    42.8
## 19    Beetle 2012-02-13    41.2
## 20    Beetle 2012-02-27    19.8
## 21    Beetle 2012-03-26     4.3
## 22    Beetle 2012-01-09     4.0
## 23    Beetle 2012-01-02     2.7
## 24    Beetle 2012-03-19     2.5
## 25      CR-Z 2012-01-23     2.4
## 26    Beetle 2012-04-02     2.3
## 27    Beetle 2012-03-12     1.4
## 28    Beetle 2012-01-16     1.3
## 29      CR-Z 2012-01-16     1.2
## 30      CR-Z 2012-02-13     1.2
## 31      CR-Z 2012-01-02     1.0
## 32    Beetle 2012-04-09     0.9
## 33      CR-Z 2012-01-30     0.9
## 34      CR-Z 2012-02-20     0.9
## 35      CR-Z 2012-02-27     0.8
## 36      CR-Z 2012-03-12     0.8
## 37    Beetle 2012-01-23     0.7
## 38      CR-Z 2012-02-06     0.6
## 39      CR-Z 2012-03-19     0.6
## 40      CR-Z 2012-03-05     0.5
## 41      CR-Z 2012-03-26     0.5
## 42      CR-Z 2012-04-02     0.5
## 43      CR-Z 2012-04-09     0.5
## 44      CR-Z 2012-01-09     0.4
## 45    Beetle 2012-03-05     0.3
# interpret the results:
# The results indicate that Camaro had the highest advertising spend during the week of January 30, 2012, with an expenditure of 3978 (in thousands of dollars). This significant investment highlights Camaro's focus on leveraging the Super Bowl as a marketing platform. Such high spending likely aimed to generate substantial brand visibility and social media engagement during the peak week of the event.
################################################################################

# Query 4. Create a new column called sentiment which is a sum of positive (column name is pos) and negative (column name is neg) mentions,
# and select only three columns for display, week_of, new_brand and sentiment
# type your answer below this line
 sentiment_data <- superbowl %>%
 mutate(sentiment = pos+neg) %>%
 select(week_of, new_brand, sentiment)
 sentiment_data
##       week_of new_brand sentiment
## 1  2012-01-02    Beetle       595
## 2  2012-01-09    Beetle       764
## 3  2012-01-16    Beetle       851
## 4  2012-01-23    Beetle       773
## 5  2012-01-30    Beetle      1371
## 6  2012-02-06    Beetle      2585
## 7  2012-02-13    Beetle      1026
## 8  2012-02-20    Beetle      1193
## 9  2012-02-27    Beetle      1023
## 10 2012-03-05    Beetle      1043
## 11 2012-03-12    Beetle       986
## 12 2012-03-19    Beetle      1091
## 13 2012-03-26    Beetle       987
## 14 2012-04-02    Beetle       673
## 15 2012-04-09    Beetle       355
## 16 2012-01-02      CR-Z       300
## 17 2012-01-09      CR-Z       274
## 18 2012-01-16      CR-Z       458
## 19 2012-01-23      CR-Z       282
## 20 2012-01-30      CR-Z       518
## 21 2012-02-06      CR-Z       652
## 22 2012-02-13      CR-Z       320
## 23 2012-02-20      CR-Z       536
## 24 2012-02-27      CR-Z       498
## 25 2012-03-05      CR-Z       338
## 26 2012-03-12      CR-Z       316
## 27 2012-03-19      CR-Z       374
## 28 2012-03-26      CR-Z       332
## 29 2012-04-02      CR-Z       302
## 30 2012-04-09      CR-Z        92
## 31 2012-01-02    Camaro     22564
## 32 2012-01-09    Camaro     24314
## 33 2012-01-16    Camaro     27928
## 34 2012-01-23    Camaro     28756
## 35 2012-01-30    Camaro     30048
## 36 2012-02-06    Camaro     32590
## 37 2012-02-13    Camaro     28874
## 38 2012-02-20    Camaro     32216
## 39 2012-02-27    Camaro     32988
## 40 2012-03-05    Camaro     31818
## 41 2012-03-12    Camaro     31200
## 42 2012-03-19    Camaro     27992
## 43 2012-03-26    Camaro     32160
## 44 2012-04-02    Camaro     23522
## 45 2012-04-09    Camaro      6656
# interpret the results:
#The data shows that Super Bowl advertising considerably increased social media conversation, as indicated by a surge in sentiment scores for Beetle and Camaro during and after the game (January 30, 2012). Beetle's sentiment scores climbed significantly from pre-Super Bowl levels, showing effective campaign-driven engagement, whereas Camaro, which already had high sentiment, received additional amplification during this timeframe. In contrast, CR-Z exhibited little change in attitude, indicating less effective advertising. Overall, the research suggests that investing in Super Bowl ads can effectively increase social media engagement, however performance varies by brand and campaign execution.