For a video walkthrough, see https://www.youtube.com/watch?v=C3MlpiTwoik.

Before you do the exercise below, be sure to complete the section on waffle plots in the DataCamp best practices course.

As usual, you will need to replace the load command below with something appropriate to your environment.

library(tidyverse)
library(waffle)
load("~/Dropbox/RProjects/Module 8/cdc.Rdata")
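
One possible way to adapt the load command is sketched here. The path data/cdc.Rdata is a hypothetical location, not the one used in the course, so substitute wherever you saved the file. The check with file.exists() only makes a wrong path easier to diagnose.

```r
# Hypothetical location: change this to wherever you saved cdc.Rdata
cdc_path <- file.path("data", "cdc.Rdata")

if (file.exists(cdc_path)) {
  load(cdc_path)   # restores the objects saved in the file, here the cdc data frame
} else {
  message("File not found: ", cdc_path,
          " (working directory: ", getwd(), ")")
}
```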

Exercise

Produce a waffle plot of the genhlth variable in the cdc data frame. Please try to do this before you read my solution below.

# First get the raw counts
cdc %>% group_by(genhlth) %>% 
  summarize(count = n())    -> ghc
head(ghc)
## # A tibble: 5 x 2
##   genhlth   count
##   <fct>     <int>
## 1 excellent  4657
## 2 very good  6972
## 3 good       5675
## 4 fair       2019
## 5 poor        677
# Scale the counts so that each block stands for 100 respondents, which gives a reasonable number of blocks for the plot
parts = ghc$count/100

# Attach the categorical variable values as the names of the scaled vector of counts. 
names(parts) = ghc$genhlth

# Round to whole blocks, then check the results
parts = round(parts,0)
parts
## excellent very good      good      fair      poor 
##        47        70        57        20         7
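# Total number of blocks. The counts sum to 20000, so the unrounded total is 200;
# rounding each category separately gives 201.
sum(parts)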
## [1] 201
# Draw the plot. Each square represents roughly 100 respondents.
waffle(parts) + ggtitle("Self-Reported Health Status")
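
The steps above can also be condensed into a short sketch. It is not part of the original solution: the counts are copied by hand from the ghc table so that the code runs without cdc.Rdata, and it assumes the waffle package is installed. The rows, title, and xlab arguments of waffle() set the number of rows, the plot title, and a caption explaining the scale.

```r
# Counts copied from the ghc table above (a stand-in for the cdc data)
counts <- c(excellent = 4657, "very good" = 6972, good = 5675,
            fair = 2019, poor = 677)

# One square per 100 respondents; round() keeps the category names
parts <- round(counts / 100)

# Compare the unrounded total with the number of squares actually drawn
sum(counts) / 100
sum(parts)

# Draw the plot only if the waffle package is available
if (requireNamespace("waffle", quietly = TRUE)) {
  waffle::waffle(parts, rows = 10,
                 title = "Self-Reported Health Status",
                 xlab = "1 square = 100 respondents")
}
```

Stating the scale in xlab tells the reader what one square means, which the plot does not show on its own.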