Note: The data was acquired from www.fivethirtyeight.com.
This introduction gives an overview of the key features of the data and of what the analysis aims to do.
The data contains a list of ads from the 10 brands that had the most advertisements in Super Bowls from 2000 to 2020, according to data from superbowl-ads.com, with matching videos found on YouTube. FiveThirtyEight staffers then came up with seven defining characteristics, which appear in the data set as the columns funny, show_product_quickly, patriotic, celebrity, danger, animals and use_sex.
This analysis loads the data from a GitHub repository and then transforms it to make it more suitable for a simple graphical analysis. The analysis focuses on three of these characteristics: patriotism, animal appearances and explicit content.
The data set (a .csv file) is loaded from a GitHub repository using its URL. After loading the data set into RStudio, the first 6 rows are displayed using the head() function.
library(tidyverse) #To set-up the environment
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 1.0.1
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.3.0 ✔ stringr 1.5.0
## ✔ readr 2.1.3 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
url <- "https://raw.githubusercontent.com/Umerfarooq122/Data_sets/main/superbowl-ads.csv"
super_bowl <- read.csv(url)
head(super_bowl) # Displaying the first 6 rows of the data set
## year brand
## 1 2018 Toyota
## 2 2020 Bud Light
## 3 2006 Bud Light
## 4 2018 Hynudai
## 5 2003 Bud Light
## 6 2020 Toyota
## superbowl_ads_dot_com_url
## 1 https://superbowl-ads.com/good-odds-toyota/
## 2 https://superbowl-ads.com/2020-bud-light-seltzer-inside-posts-brain/
## 3 https://superbowl-ads.com/2006-bud-light-bear-attack/
## 4 https://superbowl-ads.com/hope-detector-nfl-super-bowl-lii-hyundai/
## 5 https://superbowl-ads.com/2003-bud-light-hermit-crab/
## 6 https://superbowl-ads.com/2020-toyota-go-places-with-cobie-smulders/
## youtube_url funny show_product_quickly
## 1 https://www.youtube.com/watch?v=zeBZvwYQ-hA False False
## 2 https://www.youtube.com/watch?v=nbbp0VW7z8w True True
## 3 https://www.youtube.com/watch?v=yk0MQD5YgV8 True False
## 4 https://www.youtube.com/watch?v=lNPccrGk77A False True
## 5 https://www.youtube.com/watch?v=ovQYgnXHooY True True
## 6 https://www.youtube.com/watch?v=f34Ji70u3nk True True
## patriotic celebrity danger animals use_sex
## 1 False False False False False
## 2 False True True False False
## 3 False False True True False
## 4 False False False False False
## 5 False False True True True
## 6 False True True True False
The summary() function gives an overview of the data set and helps to get acquainted with the data. The str() function is also used to learn about the structure of the data set.
summary(super_bowl)
## year brand superbowl_ads_dot_com_url youtube_url
## Min. :2000 Length:244 Length:244 Length:244
## 1st Qu.:2005 Class :character Class :character Class :character
## Median :2010 Mode :character Mode :character Mode :character
## Mean :2010
## 3rd Qu.:2015
## Max. :2020
## funny show_product_quickly patriotic celebrity
## Length:244 Length:244 Length:244 Length:244
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## danger animals use_sex
## Length:244 Length:244 Length:244
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
str(super_bowl)
## 'data.frame': 244 obs. of 11 variables:
## $ year : int 2018 2020 2006 2018 2003 2020 2020 2020 2020 2020 ...
## $ brand : chr "Toyota" "Bud Light" "Bud Light" "Hynudai" ...
## $ superbowl_ads_dot_com_url: chr "https://superbowl-ads.com/good-odds-toyota/" "https://superbowl-ads.com/2020-bud-light-seltzer-inside-posts-brain/" "https://superbowl-ads.com/2006-bud-light-bear-attack/" "https://superbowl-ads.com/hope-detector-nfl-super-bowl-lii-hyundai/" ...
## $ youtube_url : chr "https://www.youtube.com/watch?v=zeBZvwYQ-hA" "https://www.youtube.com/watch?v=nbbp0VW7z8w" "https://www.youtube.com/watch?v=yk0MQD5YgV8" "https://www.youtube.com/watch?v=lNPccrGk77A" ...
## $ funny : chr "False" "True" "True" "False" ...
## $ show_product_quickly : chr "False" "True" "False" "True" ...
## $ patriotic : chr "False" "False" "False" "False" ...
## $ celebrity : chr "False" "True" "False" "False" ...
## $ danger : chr "False" "True" "True" "False" ...
## $ animals : chr "False" "False" "True" "False" ...
## $ use_sex : chr "False" "False" "False" "False" ...
We can see that the data set clearly requires some transformation, i.e. changing the names of columns and creating subsets in order to focus on the characteristics of interest.
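The str() output also shows that every characteristic is stored as text ("True"/"False") rather than as a logical value. This analysis does not convert them, but a minimal sketch of one possible conversion is given here. It uses a small toy data frame (the name ads and the helper vector flag_cols are assumptions for illustration, not part of the original code):

```r
# Toy data frame mimicking the structure reported by str(): flags stored as text.
# The name `ads` is hypothetical; the rows imitate the first rows of the real data.
ads <- data.frame(
  year = c(2018L, 2020L, 2006L),
  brand = c("Toyota", "Bud Light", "Bud Light"),
  funny = c("False", "True", "True"),
  patriotic = c("False", "False", "False"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the columns that hold "True"/"False" strings.
flag_cols <- c("funny", "patriotic")

# Compare each value against the literal "True" so the columns become logical.
ads[flag_cols] <- lapply(ads[flag_cols], function(x) x == "True")

str(ads)
```

With logical columns, sum() counts the ads that have a characteristic and mean() gives their share.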
The rename() function is used to change some of the column names to more appropriate and easily understood names. For instance, “use_sex” has been changed to “explicit_content”. Similarly, long names like “superbowl_ads_dot_com_url” have been shortened to “superbowl_ads_url”, alongside some other minor changes to the column names. After this transformation, the first 6 rows are displayed again using the head() function.
super_bowl <- rename(super_bowl, superbowl_ads_url = superbowl_ads_dot_com_url, violence = danger, celebrity_appearance = celebrity, animal_appearance = animals, explicit_content = use_sex)
head(super_bowl)
## year brand
## 1 2018 Toyota
## 2 2020 Bud Light
## 3 2006 Bud Light
## 4 2018 Hynudai
## 5 2003 Bud Light
## 6 2020 Toyota
## superbowl_ads_url
## 1 https://superbowl-ads.com/good-odds-toyota/
## 2 https://superbowl-ads.com/2020-bud-light-seltzer-inside-posts-brain/
## 3 https://superbowl-ads.com/2006-bud-light-bear-attack/
## 4 https://superbowl-ads.com/hope-detector-nfl-super-bowl-lii-hyundai/
## 5 https://superbowl-ads.com/2003-bud-light-hermit-crab/
## 6 https://superbowl-ads.com/2020-toyota-go-places-with-cobie-smulders/
## youtube_url funny show_product_quickly
## 1 https://www.youtube.com/watch?v=zeBZvwYQ-hA False False
## 2 https://www.youtube.com/watch?v=nbbp0VW7z8w True True
## 3 https://www.youtube.com/watch?v=yk0MQD5YgV8 True False
## 4 https://www.youtube.com/watch?v=lNPccrGk77A False True
## 5 https://www.youtube.com/watch?v=ovQYgnXHooY True True
## 6 https://www.youtube.com/watch?v=f34Ji70u3nk True True
## patriotic celebrity_appearance violence animal_appearance explicit_content
## 1 False False False False False
## 2 False True True False False
## 3 False False True True False
## 4 False False False False False
## 5 False False True True True
## 6 False True True True False
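The same kind of renaming can also be done without dplyr. The following is a rough base R sketch, not the method used in this analysis. The toy data frame ads and the lookup vector are assumptions made for illustration:

```r
# Toy data frame with two of the original column names (hypothetical name `ads`).
ads <- data.frame(
  year = c(2018L, 2006L),
  superbowl_ads_dot_com_url = c(
    "https://superbowl-ads.com/good-odds-toyota/",
    "https://superbowl-ads.com/2006-bud-light-bear-attack/"
  ),
  use_sex = c("False", "False"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vector written as old_name = "new_name".
lookup <- c(
  superbowl_ads_dot_com_url = "superbowl_ads_url",
  use_sex = "explicit_content"
)

# Replace only the names found in the lookup; other columns stay untouched.
matched <- names(ads) %in% names(lookup)
names(ads)[matched] <- unname(lookup[names(ads)[matched]])

print(names(ads))
```

Note the direction of the mapping: the lookup here reads old = new, whereas dplyr's rename() expects new = old.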
Since the analysis focuses on brand ads in terms of patriotism, animal appearances and explicit content, let’s make a subset of these columns using the subset() function. Again, the head() function is used to view the first 6 rows.
superbowl_subset <- subset(super_bowl, select = c("year", "brand","patriotic","animal_appearance","explicit_content"))
head(superbowl_subset)
## year brand patriotic animal_appearance explicit_content
## 1 2018 Toyota False False False
## 2 2020 Bud Light False False False
## 3 2006 Bud Light False True False
## 4 2018 Hynudai False False False
## 5 2003 Bud Light False True True
## 6 2020 Toyota False True False
We can also create a subset of the main data set row-wise. If one wants to focus on a particular brand, we can create a subset for that brand. For instance, let’s focus on Toyota.
superbowl_toyota <- subset(superbowl_subset, brand == "Toyota")
head(superbowl_toyota)
## year brand patriotic animal_appearance explicit_content
## 1 2018 Toyota False False False
## 6 2020 Toyota False True False
## 66 2019 Toyota False False False
## 87 2004 Toyota False False False
## 95 2019 Toyota False False False
## 117 2014 Toyota False True False
Note: The data frame created just for the Toyota brand is not used in the rest of the analysis; it only shows that a subset can be created by row too.
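Since the tidyverse is already loaded, the column-wise and row-wise subsets shown above could also be written with dplyr verbs. The sketch below is an alternative, not what this analysis uses. It runs on a small toy data frame (the names ads and toyota_ads are assumptions for illustration):

```r
library(dplyr)

# Toy data frame imitating a few rows of the real data (hypothetical name `ads`).
ads <- data.frame(
  year = c(2018L, 2020L, 2006L, 2019L),
  brand = c("Toyota", "Bud Light", "Bud Light", "Toyota"),
  funny = c("False", "True", "True", "False"),
  patriotic = c("False", "False", "False", "False"),
  stringsAsFactors = FALSE
)

toyota_ads <- ads %>%
  select(year, brand, patriotic) %>%  # keep columns, like subset(select = ...)
  filter(brand == "Toyota")           # keep rows, like subset(brand == "Toyota")

print(toyota_ads)
```

select() picks columns and filter() picks rows, so the two steps that subset() combines are spelled out separately.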
ggplot2 is used to create a bar graph for each of the three characteristics under consideration; the results are compared in the conclusion section.
# geom_bar() counts the ads per brand; fill splits each bar by the characteristic.
ggplot(data = superbowl_subset, aes(y = brand, fill = patriotic)) +
  geom_bar() + theme_dark() +
  labs(title = "Bar graph of brands producing patriotic ads", x = "Count", y = "Brand", fill = "Patriotic")
ggplot(data = superbowl_subset, aes(y = brand, fill = animal_appearance)) +
  geom_bar() + theme_dark() +
  labs(title = "Bar graph of brands producing ads with animal appearances", x = "Count", y = "Brand", fill = "Animal appearance")
ggplot(data = superbowl_subset, aes(y = brand, fill = explicit_content)) +
  geom_bar() + theme_dark() +
  labs(title = "Bar graph of brands producing explicit ads", x = "Count", y = "Brand", fill = "Explicit content")
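The bar graphs show counts per brand, split by whether the characteristic is present. The numbers behind such a graph can be tabulated directly, which may help when comparing brands with different numbers of ads. This is a minimal sketch on invented toy data: the data frame ads and its values are assumptions for illustration, not results from the real data set.

```r
# Invented toy data (hypothetical name `ads`); values are NOT from the real data.
ads <- data.frame(
  brand = c("Toyota", "Toyota", "Bud Light", "Bud Light", "Bud Light"),
  patriotic = c("False", "True", "False", "False", "True"),
  stringsAsFactors = FALSE
)

# Cross-tabulate brand against the flag: these are the bar segment lengths.
counts <- table(ads$brand, ads$patriotic)

# Row-wise proportions: share of each brand's ads with and without the flag.
shares <- prop.table(counts, margin = 1)

print(counts)
print(round(shares, 2))
```

Proportions put brands with many ads and brands with few ads on the same scale, which raw counts do not.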