Introduction

If you remember being a kid on Halloween, you’ll remember that fun-sized candy was a staple at most houses. But what makes for the best candy? FiveThirtyEight’s “The Ultimate Halloween Candy Power Ranking” set out to determine the best candy and to identify the factors driving each fun-sized bar’s popularity.

library(tidyverse)

## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --

## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.0.6     v dplyr   1.0.3
## v tidyr   1.1.2     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

candy_data <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/candy-power-ranking/candy-data.csv")
head(candy_data)

##   competitorname chocolate fruity caramel peanutyalmondy nougat
## 1      100 Grand         1      0       1              0      0
## 2   3 Musketeers         1      0       0              0      1
## 3       One dime         0      0       0              0      0
## 4    One quarter         0      0       0              0      0
## 5      Air Heads         0      1       0              0      0
## 6     Almond Joy         1      0       0              1      0
##   crispedricewafer hard bar pluribus sugarpercent pricepercent winpercent
## 1                1    0   1        0        0.732        0.860   66.97173
## 2                0    0   1        0        0.604        0.511   67.60294
## 3                0    0   0        0        0.011        0.116   32.26109
## 4                0    0   0        0        0.011        0.511   46.11650
## 5                0    0   0        0        0.906        0.511   52.34146
## 6                0    0   1        0        0.465        0.767   50.34755

Exercise One

My favorite type of candy is chocolate. Let’s subset the data to look only at the chocolate candy. Also, let’s rename the “competitorname” column to “candyname” to make it extra clear that we’re talking about candy. The data set doesn’t have any abbreviations, so there’s no need to replace any data in the table.

chocolate_only <- subset(candy_data, chocolate==1)
head(chocolate_only)

##       competitorname chocolate fruity caramel peanutyalmondy nougat
## 1          100 Grand         1      0       1              0      0
## 2       3 Musketeers         1      0       0              0      1
## 6         Almond Joy         1      0       0              1      0
## 7          Baby Ruth         1      0       1              1      1
## 11   Charleston Chew         1      0       0              0      1
## 23  Hershey’s Kisses         1      0       0              0      0
##    crispedricewafer hard bar pluribus sugarpercent pricepercent winpercent
## 1                 1    0   1        0        0.732        0.860   66.97173
## 2                 0    0   1        0        0.604        0.511   67.60294
## 6                 0    0   1        0        0.465        0.767   50.34755
## 7                 0    0   1        0        0.604        0.767   56.91455
## 11                0    0   1        0        0.604        0.511   38.97504
## 23                0    0   0        1        0.127        0.093   55.37545

names(chocolate_only)[names(chocolate_only) == "competitorname"] <- "candyname"
head(chocolate_only)

##            candyname chocolate fruity caramel peanutyalmondy nougat
## 1          100 Grand         1      0       1              0      0
## 2       3 Musketeers         1      0       0              0      1
## 6         Almond Joy         1      0       0              1      0
## 7          Baby Ruth         1      0       1              1      1
## 11   Charleston Chew         1      0       0              0      1
## 23  Hershey’s Kisses         1      0       0              0      0
##    crispedricewafer hard bar pluribus sugarpercent pricepercent winpercent
## 1                 1    0   1        0        0.732        0.860   66.97173
## 2                 0    0   1        0        0.604        0.511   67.60294
## 6                 0    0   1        0        0.465        0.767   50.34755
## 7                 0    0   1        0        0.604        0.767   56.91455
## 11                0    0   1        0        0.604        0.511   38.97504
## 23                0    0   0        1        0.127        0.093   55.37545
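Since the tidyverse is already loaded, the subsetting and renaming steps above could also be chained into a single dplyr pipeline. The following is only a minimal sketch on a small made-up data frame: `toy_candy`, `toy_chocolate`, and their values are hypothetical stand-ins, and only the column names match the real data.

```r
library(dplyr)

# Hypothetical stand-in for candy_data; the rows are invented for illustration
toy_candy <- data.frame(
  competitorname = c("Candy A", "Candy B", "Candy C"),
  chocolate      = c(1, 0, 1),
  winpercent     = c(60, 45, 70)
)

# Keep only the chocolate candy, then rename the name column in the same pipeline
toy_chocolate <- toy_candy %>%
  filter(chocolate == 1) %>%
  rename(candyname = competitorname)

toy_chocolate
```

Applied to the real data, the same two verbs would replace the separate `subset()` and `names()` calls; which style to use is mostly a matter of taste.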

Bonus Work

I really like chocolate, but how does everyone else feel about it? I’d like to compare the top ten performers across all Halloween candies with the top ten in my chocolate-only subset by making two bar graphs.

winners_all <- candy_data %>% slice_max(winpercent, n=10)
ggplot(data=winners_all, aes(x=winpercent, y=competitorname, fill=winpercent)) + geom_bar(stat="identity")

winners_chocolate <- chocolate_only %>% slice_max(winpercent, n=10)
ggplot(data=winners_chocolate, aes(x=winpercent, y=candyname, fill=winpercent)) + geom_bar(stat="identity")
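By default, ggplot2 orders a character axis alphabetically, so the bars in the charts above are not necessarily sorted by rank. One possible refinement, sketched here on made-up data (`toy_winners` and its values are hypothetical), is to order the bars with `reorder()` and to use `geom_col()`, which is equivalent to `geom_bar(stat="identity")`.

```r
library(ggplot2)

# Hypothetical stand-in for winners_chocolate; the values are invented
toy_winners <- data.frame(
  candyname  = c("Candy A", "Candy B", "Candy C"),
  winpercent = c(60, 45, 70)
)

# reorder() turns candyname into a factor ordered by winpercent,
# so the candy with the highest win percentage is drawn at the top
p <- ggplot(toy_winners,
            aes(x = winpercent,
                y = reorder(candyname, winpercent),
                fill = winpercent)) +
  geom_col() +
  labs(x = "Win percentage", y = "Candy")

p
```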

Findings and Recommendations

The top ten winners overall are also the top ten winners in the chocolate-only division. Further analysis should be conducted to determine whether the same factors influence winners overall and chocolate-only winners. For example, does having nougat carry more weight in the general population or in the chocolate-only population? To answer this, a method such as partition analysis could be used to identify which variables drive the winners in each population.
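The claim that both top-ten lists contain the same candies can be checked in code rather than by comparing the two charts by eye. This is a minimal sketch using base R set functions on made-up lists: `toy_top_all` and `toy_top_chocolate` are hypothetical stand-ins for `winners_all` and `winners_chocolate`.

```r
# Hypothetical top-three lists; the candy names are invented for illustration
toy_top_all       <- data.frame(competitorname = c("Candy C", "Candy A", "Candy D"))
toy_top_chocolate <- data.frame(candyname      = c("Candy A", "Candy C", "Candy D"))

# TRUE when both lists contain exactly the same candies, regardless of order
same_winners <- setequal(toy_top_all$competitorname, toy_top_chocolate$candyname)

# Candies that appear in the overall list but not in the chocolate-only list
only_overall <- setdiff(toy_top_all$competitorname, toy_top_chocolate$candyname)

same_winners
only_overall
```

With the real data, the same comparison would use `winners_all$competitorname` and `winners_chocolate$candyname`. An empty `only_overall` would support the finding.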

607 Homework One

Carlisle Ferguson

2/4/2021
