Data607 HW1

This is a further exploration of the Halloween candy dataset taken from the FiveThirtyEight article "The Ultimate Halloween Candy Power Ranking".

Link to The Ultimate Halloween Candy Power Ranking

The article explores the desirability of 85 common Halloween treats.

The authors focused on the following properties of the treats to see whether they could identify what made them desirable:

Chocolate
Fruity
Caramel
Nuts
Nougat
Crispy
Hard

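# Load the packages used below (melt is assumed to come from reshape2)
library(tidyverse)   # read_csv, select, left_join, %>%
library(reshape2)    # melt
library(gtsummary)   # tbl_summary
library(knitr)       # kable
library(kableExtra)  # kable_styling

# Read in the data
file_path <- "https://raw.githubusercontent.com/catfoodlover/Data607/main/Data607_HW1_Data_WilliamAiken.csv"

candy_df <- read_csv(file_path, show_col_types = FALSE)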

# Select the columns of interest and rename them where necessary
candy_df <- candy_df %>%
  select(names = competitorname, choco = chocolate, fruity, caramel,
         nuts = peanutyalmondy, nougat, crispy = crispedricewafer,
         win_percent = winpercent)

# Replace the mis-encoded apostrophe (Õ) in the candy names
candy_df$names <- gsub("Õ", "'", candy_df$names)

# Convert the dataset from wide to long to make it easier to work with
temp <- melt(data = candy_df, id.vars = "names",
             measure.vars = c("choco", "fruity", "caramel", "nuts", "nougat", "crispy"),
             variable.name = "Property", value.name = "Status")

#Join win percentage back in
temp2 <- left_join(temp, candy_df %>% select(names, win_percent), by = "names")
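
The melt-then-join steps above can also be done in one step. The following is a minimal sketch, assuming the tidyr package is available; the `toy` data frame and its values are hypothetical and are not rows from the real dataset. Because `pivot_longer` only reshapes the columns named in `cols`, `win_percent` is carried along and no join is needed.

```r
library(tidyr)

# Hypothetical toy data with made-up values (not taken from the candy dataset)
toy <- data.frame(
  names = c("Candy A", "Candy B"),
  choco = c(1, 0),
  fruity = c(0, 1),
  win_percent = c(70, 40)
)

# Reshape only the property columns; names and win_percent stay attached to each row
toy_long <- pivot_longer(
  toy,
  cols = c(choco, fruity),
  names_to = "Property",
  values_to = "Status"
)

print(toy_long)
```

One difference to keep in mind: `pivot_longer` returns `Property` as a character column, so any ordering of the properties in later summaries may differ from the melt-based version.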

What is the mean win percentage for each property?

People like texture: nuts and crispy are the standouts, but both are properties that tend to go with chocolate, the third-place property.

# Summarise win percentage for the treats that have each property (Status == 1)
temp2 %>%
  filter(Status == 1) %>%
  select(Property, win_percent) %>%
  tbl_summary(by = Property,
              statistic = list(all_continuous() ~ "{mean} ({sd})"),
              digits = all_continuous() ~ 1,
              label = win_percent ~ "% Winner")

Characteristic	choco, N = 37¹	fruity, N = 38¹	caramel, N = 14¹	nuts, N = 14¹	nougat, N = 7¹	crispy, N = 7¹
% Winner	60.9 (12.8)	44.1 (10.3)	57.3 (16.2)	63.7 (16.4)	60.1 (13.8)	66.2 (10.7)
¹ Mean (SD)
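
For readers who want to see what sits behind a table like this, here is a minimal base R sketch of the same kind of summary: keep the rows where the property is present, then take the mean and standard deviation of the win percentage per property. The `toy_long` data frame and its values are hypothetical and only mirror the column names used above.

```r
# Hypothetical long-format data with made-up values (not the real candy data)
toy_long <- data.frame(
  Property = c("choco", "choco", "choco", "fruity", "fruity", "fruity"),
  Status = c(1, 1, 0, 1, 1, 0),
  win_percent = c(70, 60, 30, 40, 50, 80)
)

# Keep only the rows where the property is present
present <- toy_long[toy_long$Status == 1, ]

# Mean and standard deviation of win_percent within each property
means <- tapply(present$win_percent, present$Property, mean)
sds <- tapply(present$win_percent, present$Property, sd)

print(round(cbind(mean = means, sd = sds), 1))
```

Note that the groups in the real table overlap: a treat that is both chocolate and nutty contributes to both columns, which is why the N values add up to more than 85.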

What are the properties of the top ten treats?

Looks like people love chocolate: all ten of the top treats contain it.

candy_df %>% arrange(desc(win_percent)) %>% slice_head(n = 10) %>% 
  count(choco, fruity, caramel, nuts, nougat, crispy) %>% kable() %>% kable_styling()

choco	caramel	nuts	nougat	crispy	n
1	0	0	0	1	1
1	0	1	0	0	6
1	1	0	0	1	1
1	1	0	1	0	1
1	1	1	1	0	1

What are the properties of the bottom ten treats?

There is a less clear message here. Fruity isn't popular, but more importantly, none of these treats is chocolate.

candy_df %>% arrange(desc(win_percent)) %>% slice_tail(n = 10) %>% 
  count(choco, fruity, caramel, nuts, nougat, crispy) %>% kable() %>% kable_styling()

fruity	caramel	nuts	n
0	0	0	3
0	0	1	1
0	1	0	2
1	0	0	4
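
The top-ten and bottom-ten comparisons can also be expressed as the share of treats that have each property. The sketch below is one possible way to do that in base R; `property_share` is a hypothetical helper written for illustration, and the `toy` data frame holds made-up values rather than the real rankings.

```r
# Hypothetical toy data with made-up values (not the real candy data)
toy <- data.frame(
  names = paste("Candy", LETTERS[1:6]),
  choco = c(1, 1, 1, 0, 0, 0),
  fruity = c(0, 0, 0, 1, 1, 0),
  win_percent = c(80, 75, 70, 45, 40, 35)
)

# Share of treats with each property among the n best (or n worst) by win_percent
property_share <- function(df, props, n, worst = FALSE) {
  ranked <- df[order(df$win_percent, decreasing = !worst), ]
  colMeans(head(ranked, n)[, props, drop = FALSE])
}

top_share <- property_share(toy, c("choco", "fruity"), n = 3)
bottom_share <- property_share(toy, c("choco", "fruity"), n = 3, worst = TRUE)

print(rbind(top = top_share, bottom = bottom_share))
```

Applied to `candy_df` with `n = 10`, a helper like this would put the chocolate share of the top and bottom groups side by side instead of in two separate count tables.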


William Aiken

8/29/2021

Overview


Conclusion

Future questions: