Reference Graph

Orginal Graph

Link to the original graph: https://fivethirtyeight.com/features/how-americans-like-their-steak/

Link to the data (downloaded from github): https://github.com/fivethirtyeight/data/tree/master/steak-survey

The original graph had a clean design with minimal extra color. The clever color scheme was that each bar was the color of a steak’s inside color at the level of cooking for that category. It also had a clear structure descending in done-ness. It utilized space effectively and had easily readable text and numbers. Some weaknesses were that the color scheme was not color blind friendly; it had inconsistent percentage leveling, as only the 8 had a percent symbol; and there were also no scale or grid lines, which could help a reader interpret the data.

library(ggplot2)
library(dplyr)

steak_data <- read.csv("steak-risk-survey.csv")

# Clean/ process the data
steak_counts_df <- steak_data %>%
  count(`How.do.you.like.your.steak.prepared.`) %>%
  # Only keep valid steak doneness levels and standardize names
  filter(`How.do.you.like.your.steak.prepared.` %in% 
           c("Medium", "Medium Well", "Medium rare", "Rare", "Well")) %>%
  # Standardize the names to match the graph
  mutate(
    Preference = case_when(
      `How.do.you.like.your.steak.prepared.` == "Medium Well" ~ "Medium-well",
      `How.do.you.like.your.steak.prepared.` == "Medium rare" ~ "Medium-rare",
      TRUE ~ `How.do.you.like.your.steak.prepared.`
    )
  ) %>%
  # Calculate percentages
  mutate(
    Percentage = round((n / sum(n)) * 100),
    # Create labels - only add % to the value 8
    Label = ifelse(Percentage == 8, 
      paste0(Percentage, "%"),
      Percentage)
  ) %>%
  #Correct ordering
  mutate(Preference = factor(Preference, 
    levels = c("Rare", "Medium-rare", "Medium", "Medium-well", "Well")))

# Create plot
ggplot(steak_counts_df, aes(x = Preference, y = Percentage)) +
  geom_bar(stat = "identity", 
  aes(fill = Preference),
  width = 0.8) +
  coord_flip() +
  scale_fill_manual(values = c(
    "Rare" = "#8B3A3A",
    "Medium-rare" = "#CD6E6E",
    "Medium" = "#8B5742",
    "Medium-well" = "#614126",
    "Well" = "#3A2012"
  )) +
  geom_text(aes(label = Label),
    hjust = -0.2, 
    size = 3.5) +
  labs(
    title = "'How Do You Like Your Steak Prepared?'",
    subtitle = "From a survey of 432 steak-eating Americans",
    caption = "SOURCE: SURVEYMONKEY",
    x = NULL,
    y = NULL
  ) +
  theme_minimal() +
  theme(
    plot.background = element_rect(fill = "#F0F0F0", color = "#F0F0F0"),
    plot.title = element_text(size = 14, face = "bold"),
    plot.subtitle = element_text(size = 10),
    plot.caption = element_text(size = 8),
    axis.text.y = element_text(size = 10),
    axis.text.x = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    legend.position = "none"
  ) +
  scale_y_continuous(limits = c(0, 45))

Improvements

I really liked the original graph’s theme, but I wanted to make it more accessible, so I switched to a colorblind-friendly palette while keeping the darker shades for more well-done steak, similar to the original. I also made the percentage labels more consistent, people can probably figure it out, but adding the % makes it clearer. The original website discussed how lottery choices correlated with steak doneness but ultimately showed that they didn’t, using a table. I thought it would be more intuitive as a graph, so I adjusting the data layer by adding facet_wrap() to break down responses by lottery choice. I chose to include non responses to the lottery question because I thought it was interesting. It’s not obvious what the lottery choices mean so I added a caption explaining lottery A and lottery B. Finally, I made coordinate changes, adding grid lines to make it easier to compare bar heights at a glance, since the original had none.

library(ggplot2)
library(dplyr)
library(viridis)  # For colorblind-friendly palette

# Rename long variable name
steak_data <- read.csv("steak-risk-survey.csv") %>%
  rename(lottery_answer = Consider.the.following.hypothetical.situations...br.In.Lottery.A..you.have.a.50..chance.of.success..with.a.payout.of..100...br.In.Lottery.B..you.have.a.90..chance.of.success..with.a.payout.of..20...br..br.Assuming.you.have..10.to.bet..would.you.play.Lottery.A.or.Lottery.B.)

# Replace "" values with "No Response"
steak_data <- steak_data %>%
  mutate(lottery_answer = ifelse(lottery_answer == "", "No Response", lottery_answer))

# Clean/process the data
steak_counts_df <- steak_data %>%
  count(lottery_answer, `How.do.you.like.your.steak.prepared.`) %>%
  # Only keep valid steak doneness levels and standardize names
  filter(`How.do.you.like.your.steak.prepared.` %in% 
           c("Medium", "Medium Well", "Medium rare", "Rare", "Well")) %>%
  # Standardize the names to match the graph
  mutate(
    Preference = case_when(
      `How.do.you.like.your.steak.prepared.` == "Medium Well" ~ "Medium-well",
      `How.do.you.like.your.steak.prepared.` == "Medium rare" ~ "Medium-rare",
      TRUE ~ `How.do.you.like.your.steak.prepared.`
    )
  ) %>%
  group_by(lottery_answer) %>%  # Group by Lottery A, B, or No Response
  # Calculate and add percentages
  mutate(
    Percentage = round((n / sum(n)) * 100),
    Label = paste0(Percentage, "%")
  ) %>%
  # Correct ordering
  mutate(Preference = factor(Preference, 
    levels = c("Rare", "Medium-rare", "Medium", "Medium-well", "Well")))

# Create plot
ggplot(steak_counts_df, aes(x = Preference, y = Percentage)) +
  geom_bar(stat = "identity", 
  aes(fill = Preference),
  width = 0.8) +
  coord_flip() +
  scale_fill_viridis(discrete = TRUE, option = "viridis", direction = -1) +
  geom_text(aes(label = Label),
    hjust = -0.2, 
    size = 3.5) +
  labs(
    title = "'How Do You Like Your Steak Prepared?' by Lottery Choice",
    subtitle = "Comparison between people who chose Lottery A, Lottery B, or did not respond",
    caption = "SOURCE: SURVEYMONKEY \n Consider the following hypothetical situations...\nIn Lottery A, you have a 50% chance of success, with a payout of $100...\nIn Lottery B, you have a 90% chance of success, with a payout of $20...\nAssuming you have $10 to bet, would you play Lottery A or Lottery B?" ,
    x = "Steak Doneness Preference",
    y = "Percentage Within Each Group"
  ) +
  theme_minimal() +
  theme(
    plot.background = element_rect(fill = "#F0F0F0", color = "#F0F0F0"),
    plot.title = element_text(size = 14, face = "bold"),
    plot.subtitle = element_text(size = 10),
    plot.caption = element_text(size = 8, hjust = 0),
    axis.text.y = element_text(size = 10),
    axis.text.x = element_text(size = 8),  # axis text
    axis.ticks = element_line(),  # axis ticks
    panel.grid.major.x = element_line(color = "gray80"),  # x-axis grid lines
    panel.grid.major.y = element_blank(),
    legend.position = "none"
  ) +
  scale_y_continuous(
    limits = c(0, 60),
    breaks = seq(0, 60, by = 10),
    labels = function(x) paste0(x, "%"),
    expand = expansion(mult = c(0, 0.1))
  ) +  
  facet_wrap(~lottery_answer)  # Facet by Lottery A, B, and No Response

project1-writeup

See the publicly published web version here: http://rpubs.com/jtvaughn/1264331

Orginal Graph

Improvements