Thanksgiving is just around the corner. Let’s celebrate the best way we know how, by analyzing some festive data! I hope you have fun making graphics! Happy Thanksgiving!
We are using the data behind the FiveThirtyEight article “Here’s What Your Part of America Eats on Thanksgiving”: https://fivethirtyeight.com/features/heres-what-your-part-of-america-eats-on-thanksgiving/.
The original data set can be found here: https://raw.githubusercontent.com/fivethirtyeight/data/master/thanksgiving-2015/thanksgiving-2015-poll-data.csv
The variables for the survey are described here: https://github.com/fivethirtyeight/data/tree/master/thanksgiving-2015
I cleaned up the data, selected a subset of variables, and created binary variables. My dataset can be found here: https://raw.githubusercontent.com/kitadasmalley/FA2020_DataViz/main/data/useThanks.csv
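# stringsAsFactors=TRUE reproduces the Factor columns shown below on R >= 4.0
useThanks<-read.csv("https://raw.githubusercontent.com/kitadasmalley/FA2020_DataViz/main/data/useThanks.csv",
header=TRUE, stringsAsFactors=TRUE)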
str(useThanks)
## 'data.frame': 1058 obs. of 83 variables:
## $ id : num 4.34e+09 4.34e+09 4.34e+09 4.34e+09 4.34e+09 ...
## $ celebrate : Factor w/ 2 levels "No","Yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ main : Factor w/ 9 levels "","Chicken","Ham/Pork",..: 9 9 9 9 7 9 9 9 9 5 ...
## $ cooked : Factor w/ 6 levels "","Baked","Fried",..: 2 2 6 2 2 6 2 2 6 2 ...
## $ stuffing : Factor w/ 5 levels "","Bread-based",..: 2 2 5 2 2 5 2 5 2 2 ...
## $ cranberry : Factor w/ 5 levels "","Canned","Homemade",..: 4 5 3 3 2 3 2 3 2 5 ...
## $ gravy : Factor w/ 3 levels "","No","Yes": 3 3 3 3 3 3 3 3 3 3 ...
## $ brussel.sprouts : Factor w/ 2 levels "","Brussel sprouts": 1 1 2 2 2 2 1 1 2 2 ...
## $ carrots : Factor w/ 2 levels "","Carrots": 2 1 2 1 1 2 1 2 1 2 ...
## $ cauliflower : Factor w/ 2 levels "","Cauliflower": 1 1 2 1 1 2 1 1 1 1 ...
## $ corn : Factor w/ 2 levels "","Corn": 1 2 2 1 1 2 1 1 2 1 ...
## $ cornbread : Factor w/ 2 levels "","Cornbread": 1 1 2 2 2 2 1 1 2 1 ...
## $ fruit.salad : Factor w/ 2 levels "","Fruit salad": 1 1 1 1 1 2 2 1 1 1 ...
## $ green.beans : Factor w/ 2 levels "","Green beans/green bean casserole": 2 2 1 1 1 2 2 1 2 2 ...
## $ mac.n.cheese : Factor w/ 2 levels "","Macaroni and cheese": 2 2 1 1 1 2 1 1 1 1 ...
## $ mashed.potatoes : Factor w/ 2 levels "","Mashed potatoes": 2 2 2 2 2 2 2 1 2 2 ...
## $ rolls : Factor w/ 2 levels "","Rolls/biscuits": 1 2 2 2 2 2 2 1 2 2 ...
## $ squash : Factor w/ 2 levels "","Squash": 1 1 1 1 2 2 1 1 2 1 ...
## $ salad : Factor w/ 2 levels "","Vegetable salad": 1 2 2 2 2 2 1 1 1 1 ...
## $ yams.sweet.potato : Factor w/ 2 levels "","Yams/sweet potato casserole": 2 2 1 2 2 2 2 1 1 2 ...
## $ apple.pie : Factor w/ 2 levels "","Apple": 2 2 2 1 2 1 2 1 2 1 ...
## $ buttermilk.pie : Factor w/ 2 levels "","Buttermilk": 1 1 1 1 1 1 1 1 2 2 ...
## $ cherry.pie : Factor w/ 2 levels "","Cherry": 1 1 2 1 1 1 1 1 1 1 ...
## $ chocolate.pie : Factor w/ 2 levels "","Chocolate": 1 2 1 1 1 1 1 2 1 1 ...
## $ coconut.pie : Factor w/ 2 levels "","Coconut cream": 1 1 1 1 1 1 1 1 1 1 ...
## $ keylime.pie : Factor w/ 2 levels "","Key lime": 1 1 1 1 1 1 1 1 1 1 ...
## $ peach.pie : Factor w/ 2 levels "","Peach": 1 1 2 1 1 1 1 1 1 1 ...
## $ pecan.pie : Factor w/ 2 levels "","Pecan": 1 1 2 2 1 1 1 1 1 1 ...
## $ pumpkin.pie : Factor w/ 2 levels "","Pumpkin": 1 2 2 2 2 1 2 1 2 2 ...
## $ sweet.potato.pie : Factor w/ 2 levels "","Sweet Potato": 1 1 2 1 1 2 1 1 2 2 ...
## $ apple.cobbler : Factor w/ 2 levels "","Apple cobbler": 1 1 1 1 1 1 1 1 1 1 ...
## $ blondies : Factor w/ 2 levels "","Blondies": 1 1 1 1 1 1 1 1 1 1 ...
## $ brownies : Factor w/ 2 levels "","Brownies": 1 1 2 1 1 1 1 1 1 1 ...
## $ carrot.cake : Factor w/ 2 levels "","Carrot cake": 1 1 2 1 1 1 1 1 1 1 ...
## $ cheesecake : Factor w/ 2 levels "","Cheesecake": 2 2 1 1 1 2 1 1 1 1 ...
## $ cookies : Factor w/ 2 levels "","Cookies": 2 2 2 1 1 1 2 2 2 1 ...
## $ fudge : Factor w/ 2 levels "","Fudge": 1 1 2 1 1 1 1 1 1 1 ...
## $ ice.cream : Factor w/ 2 levels "","Ice cream": 2 1 2 1 1 1 1 1 1 1 ...
## $ peach.cobbler : Factor w/ 2 levels "","Peach cobbler": 1 1 1 1 1 1 1 1 1 1 ...
## $ pray : Factor w/ 3 levels "","No","Yes": 3 3 3 2 2 3 2 2 2 3 ...
## $ friendsgiving : Factor w/ 3 levels "","No","Yes": 2 2 3 2 2 3 2 3 2 2 ...
## $ black.friday : Factor w/ 3 levels "","No","Yes": 2 3 3 2 2 3 3 3 2 2 ...
## $ area.live : Factor w/ 4 levels "","Rural","Suburban",..: 3 2 3 4 4 4 2 2 4 3 ...
## $ age : Factor w/ 5 levels "","18 - 29","30 - 44",..: 2 2 2 3 3 2 2 2 3 3 ...
## $ gender : Factor w/ 3 levels "","Female","Male": 3 2 3 3 3 3 3 3 3 3 ...
## $ income : Factor w/ 12 levels "","$0 to $9,999",..: 11 10 2 8 4 2 9 12 11 9 ...
## $ DivName : Factor w/ 10 levels "","East North Central",..: 4 3 5 7 7 7 2 5 4 3 ...
## $ celebrate01 : int 1 1 1 1 1 1 1 1 1 1 ...
## $ gravy01 : int 1 1 1 1 1 1 1 1 1 1 ...
## $ friendsgiving01 : int 0 0 1 0 0 1 0 1 0 0 ...
## $ black.friday01 : int 0 1 1 0 0 1 1 1 0 0 ...
## $ brussel.sprouts01 : int 0 0 1 1 1 1 0 0 1 1 ...
## $ carrots01 : int 1 0 1 0 0 1 0 1 0 1 ...
## $ cauliflower01 : int 0 0 1 0 0 1 0 0 0 0 ...
## $ corn01 : int 0 1 1 0 0 1 0 0 1 0 ...
## $ cornbread01 : int 0 0 1 1 1 1 0 0 1 0 ...
## $ fruit.salad01 : int 0 0 0 0 0 1 1 0 0 0 ...
## $ green.beans01 : int 1 1 0 0 0 1 1 0 1 1 ...
## $ mac.n.cheese01 : int 1 1 0 0 0 1 0 0 0 0 ...
## $ mashed.potatoes01 : int 1 1 1 1 1 1 1 0 1 1 ...
## $ rolls01 : int 0 1 1 1 1 1 1 0 1 1 ...
## $ squash01 : int 0 0 0 0 1 1 0 0 1 0 ...
## $ salad01 : int 0 1 1 1 1 1 0 0 0 0 ...
## $ yams.sweet.potato01: int 1 1 0 1 1 1 1 0 0 1 ...
## $ apple.pie01 : int 1 1 1 0 1 0 1 0 1 0 ...
## $ buttermilk.pie01 : int 0 0 0 0 0 0 0 0 1 1 ...
## $ cherry.pie01 : int 0 0 1 0 0 0 0 0 0 0 ...
## $ chocolate.pie01 : int 0 1 0 0 0 0 0 1 0 0 ...
## $ coconut.pie01 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ keylime.pie01 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ peach.pie01 : int 0 0 1 0 0 0 0 0 0 0 ...
## $ pecan.pie01 : int 0 0 1 1 0 0 0 0 0 0 ...
## $ pumpkin.pie01 : int 0 1 1 1 1 0 1 0 1 1 ...
## $ sweet.potato.pie01 : int 0 0 1 0 0 1 0 0 1 1 ...
## $ apple.cobbler01 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ blondies01 : int 0 0 0 0 0 0 0 0 0 0 ...
## $ brownies01 : int 0 0 1 0 0 0 0 0 0 0 ...
## $ carrot.cake01 : int 0 0 1 0 0 0 0 0 0 0 ...
## $ cheesecake01 : int 1 1 0 0 0 1 0 0 0 0 ...
## $ cookies01 : int 1 1 1 0 0 0 1 1 1 0 ...
## $ fudge01 : int 0 0 1 0 0 0 0 0 0 0 ...
## $ ice.cream01 : int 1 0 1 0 0 0 0 0 0 0 ...
## $ peach.cobbler01 : int 0 0 0 0 0 0 0 0 0 0 ...
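The 01 columns are handy because the mean of a 0/1 variable is a proportion. Here is a minimal sketch of that idea on a made-up toy data frame (the object names toy and divShare and the numbers are assumptions for illustration, not part of useThanks):

```r
library(dplyr)

# Toy stand-in for useThanks: two divisions, one 0/1 dish indicator (made-up values).
toy <- data.frame(
  DivName       = c("Pacific", "Pacific", "Mountain", "Mountain", "Mountain"),
  pumpkin.pie01 = c(1, 0, 1, 1, 0),
  stringsAsFactors = FALSE
)

# The mean of a 0/1 column is the share of respondents who serve the dish.
divShare <- toy %>%
  group_by(DivName) %>%
  summarise(n = n(), pumpkinShare = mean(pumpkin.pie01))

divShare
```

The same group_by and summarise pattern should carry over to the real data by swapping toy for useThanks.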
This is pretty tough and requires a fair bit of data wrangling. Here are a few hints to help you along the way.
In order to assess which dishes are served “disproportionately” by region, we first need to understand national trends. Thus, we must calculate national values as weighted averages, weighting each region by its share of the population. The population data comes from https://www.hcup-us.ahrq.gov/figures/nis_figure1_2018.jsp
Here is some data about how people are distributed across regions:
library(dplyr) # provides %>% and mutate()
popDiv<-data.frame(DivName=c("East North Central",
                             "East South Central",
                             "Middle Atlantic",
                             "Mountain",
                             "New England",
                             "Pacific",
                             "South Atlantic",
                             "West North Central",
                             "West South Central"),
                   pop=c(46798649,
                         18931477,
                         41601787,
                         23811346,
                         14757573,
                         52833604,
                         63991523,
                         21179519,
                         39500457))%>%
  # 323405935 is the total population across the nine divisions
  mutate(popProp=pop/323405935)
popDiv
## DivName pop popProp
## 1 East North Central 46798649 0.14470560
## 2 East South Central 18931477 0.05853782
## 3 Middle Atlantic 41601787 0.12863644
## 4 Mountain 23811346 0.07362681
## 5 New England 14757573 0.04563173
## 6 Pacific 52833604 0.16336622
## 7 South Atlantic 63991523 0.19786750
## 8 West North Central 21179519 0.06548896
## 9 West South Central 39500457 0.12213894
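As a hint for the weighted average, here is a small sketch using only two divisions. The dish shares in toyShare are made up, and the names toyShare, toyPop, and national are assumptions for illustration; the two population figures are copied from popDiv above and rescaled so the toy weights sum to one.

```r
library(dplyr)

# Hypothetical division-level shares for one dish (made-up numbers).
toyShare <- data.frame(
  DivName = c("Pacific", "Mountain"),
  share   = c(0.50, 0.80),
  stringsAsFactors = FALSE
)

# Population proportions, rescaled to the two toy divisions only.
toyPop <- data.frame(
  DivName = c("Pacific", "Mountain"),
  pop     = c(52833604, 23811346),
  stringsAsFactors = FALSE
) %>%
  mutate(popProp = pop / sum(pop))

# National value = sum over divisions of (division share * population proportion).
national <- toyShare %>%
  left_join(toyPop, by = "DivName") %>%
  summarise(natShare = sum(share * popProp)) %>%
  pull(natShare)

# How far each division sits above or below the national value.
toyDiff <- toyShare %>%
  mutate(diff = share - national)

toyDiff
```

A positive diff would suggest a dish is served more often in that division than nationally. A ratio instead of a difference is another reasonable choice.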
## Joining, by = "DivName"
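# divPie2 is not built in the code shown here; it needs one row per division
# with favFlavor and favSide columns. A blank DivName becomes " Division" below.
favorites<-divPie2%>%
select(DivName, favFlavor, favSide)%>%
mutate(DivName=paste(DivName, " Division", sep=""))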
favorites
## # A tibble: 10 x 3
## DivName favFlavor favSide
## <chr> <chr> <chr>
## 1 " Division" Coconut Salad
## 2 "East North Central Division" Pumpkin Rolls
## 3 "East South Central Division" Pecan Mac N Cheese
## 4 "Middle Atlantic Division" Apple Squash
## 5 "Mountain Division" Pecan Salad
## 6 "New England Division" Apple Squash
## 7 "Pacific Division" Cherry Salad
## 8 "South Atlantic Division" Sweet Potato Mac N Cheese
## 9 "West North Central Division" Pumpkin Green Beans
## 10 "West South Central Division" Pecan Cornbread
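One plausible way to arrive at a table like this is to keep, for each division, the dish with the largest gap over the national value. This is only a sketch on made-up numbers, and it assumes that "favorite" means "largest difference from the national share"; the names toyLong and toyFav are hypothetical.

```r
library(dplyr)

# Hypothetical long table: one row per division and pie flavor (made-up numbers).
toyLong <- data.frame(
  DivName = c("Pacific", "Pacific", "Mountain", "Mountain"),
  flavor  = c("Cherry", "Pumpkin", "Pecan", "Pumpkin"),
  diff    = c(0.06, -0.02, 0.04, 0.01),
  stringsAsFactors = FALSE
)

# Within each division, keep the flavor with the largest diff.
toyFav <- toyLong %>%
  group_by(DivName) %>%
  filter(diff == max(diff)) %>%
  ungroup() %>%
  select(DivName, favFlavor = flavor)

toyFav
```

Note that ties would keep more than one row per division, so real data may need a tie-breaking rule.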
#install.packages("usmap")
library(usmap)
## Warning: package 'usmap' was built under R version 3.6.2
states <- usmap::us_map()
fips<-read.csv("https://raw.githubusercontent.com/kitadasmalley/FA2020_DataViz/main/data/stateFIPS.csv",
header=TRUE)
geoPie<-fips%>%
left_join(favorites)
## Joining, by = "DivName"
foodStates<-states %>%
mutate(Name=full)%>%
left_join(geoPie)
## Joining, by = "Name"
geom_polygon
This graphic is not complete yet! That's up to you.
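If geom_polygon is new to you, here is a tiny sketch on two made-up squares. Each row is one vertex, group says which shape the vertex belongs to, and fill colors the shape by a variable. The data frame toyShapes and its values are assumptions for illustration only.

```r
library(ggplot2)

# Two made-up square "regions"; each row is one vertex of a polygon.
toyShapes <- data.frame(
  x         = c(0, 1, 1, 0,  1, 2, 2, 1),
  y         = c(0, 0, 1, 1,  0, 0, 1, 1),
  group     = rep(c("A", "B"), each = 4),
  favFlavor = rep(c("Pecan", "Apple"), each = 4),
  stringsAsFactors = FALSE
)

# group keeps the vertices of each shape together; fill colors by flavor.
p <- ggplot(toyShapes, aes(x = x, y = y, group = group, fill = favFlavor)) +
  geom_polygon(color = "white") +
  coord_equal() +
  theme_void()

p
```

Whether foodStates has plain x, y, and group columns like this depends on the installed usmap version, so check names(foodStates) before mapping aesthetics.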