For a video walkthrough, see https://www.youtube.com/watch?v=C3MlpiTwoik.
Before you do the exercise below, be sure to complete the section on waffle plots in the DataCamp best practices course.
As usual, you will need to replace the load command below with something appropriate to your environment.
library(tidyverse)
library(waffle)
load("~/Dropbox/RProjects/Module 8/cdc.Rdata")
Produce a waffle plot of the genhlth variable in the cdc dataframe. Please try to do this before you read my solution below.
# First get the raw counts
ghc <- cdc %>%
  group_by(genhlth) %>%
  summarize(count = n())
head(ghc)
## # A tibble: 5 x 2
##   genhlth   count
##   <fct>     <int>
## 1 excellent  4657
## 2 very good  6972
## 3 good       5675
## 4 fair       2019
## 5 poor        677
# Scale the counts to produce a reasonable number of blocks for the plot
parts = ghc$count/100
# Attach the categorical variable values as the names of the scaled vector of counts.
names(parts) = ghc$genhlth
# Round to whole blocks, then check the results
parts = round(parts, 0)
parts
## excellent very good      good      fair      poor
##        47        70        57        20         7
sum(parts)
## [1] 201
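The total of 201 may look odd, since the raw counts printed above add up to 20,000 and dividing by 100 would suggest exactly 200 blocks. The extra block comes from rounding each category separately. The sketch below re-enters the counts by hand (so it does not need cdc.Rdata) and shows the drift; the names counts, scaled, and drift are illustrative and not part of the original solution.

```r
# Raw counts copied from the tibble printed above.
counts <- c(excellent = 4657, `very good` = 6972, good = 5675,
            fair = 2019, poor = 677)

# One block stands for roughly 100 respondents.
scaled <- counts / 100
parts  <- round(scaled, 0)

sum(counts)   # 20000 respondents in total
sum(parts)    # 201 blocks after rounding each category separately

# Per-category rounding drift; positive values were rounded up.
drift <- parts - scaled
drift
```

Because the drift is small, the plot still gives an honest picture, but it is worth remembering that each square is only approximately 100 respondents.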
# Do the plot.
waffle(parts) + ggtitle("Self-Reported Health Status")
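One possible refinement, offered as a sketch and not as part of the original solution: because the counts were divided by 100, a viewer has no way to know what one square represents unless the plot says so. The version below uses the rows and xlab arguments of waffle() to make the layout explicit and to record the scale. The label wording is my own, and the parts vector is re-entered by hand from the output above so the snippet runs without cdc.Rdata.

```r
library(ggplot2)
library(waffle)

# Scaled counts copied from the output above.
parts <- c(excellent = 47, `very good` = 70, good = 57, fair = 20, poor = 7)

# rows sets the height of the grid; xlab is used here to state the scale.
p <- waffle(parts, rows = 10, xlab = "1 square = about 100 respondents") +
  ggtitle("Self-Reported Health Status")
p
```

With 201 squares and 10 rows, the last column is only partly filled, which is expected.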