Data Viz Demo

knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE, results = FALSE)
library(tidyverse)

sj_data <- read.csv("./data/sj_data_by_trail_dsq.csv", row.names = "X") %>% 
  filter(!is.na(d_sq)) %>% 
  group_by(subject_id, condition, taskTime) %>%
  summarize(invalid_responses = sum(d_sq < 0))

cl_data <- read.csv("./data/cl_data_by_trial.csv", row.names = "X")
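The `summarize` step above counts, for each participant at each time point, how many responses have a negative `d_sq`. A minimal sketch of that counting logic on made-up data (the toy values and condition labels below are assumptions, not taken from the study files; `.groups = "drop"` is added only to return an ungrouped result):

```r
library(dplyr)

# Toy data: hypothetical values, not from the study
toy <- data.frame(
  subject_id = c(1, 1, 1, 2, 2, 2),
  condition  = c("Massed", "Massed", "Massed",
                 "Distributed", "Distributed", "Distributed"),
  taskTime   = c("pre", "pre", "post", "pre", "post", "post"),
  d_sq       = c(-0.2, 0.5, NA, 0.1, -1.3, -0.4)
)

invalid_counts <- toy %>%
  filter(!is.na(d_sq)) %>%                        # drop missing responses first
  group_by(subject_id, condition, taskTime) %>%   # one group per participant and time point
  summarize(invalid_responses = sum(d_sq < 0),    # TRUE counts as 1, so this counts negatives
            .groups = "drop")

invalid_counts
```

Because `d_sq < 0` is a logical vector, `sum()` simply counts the `TRUE` values. Note that a group whose only responses were `NA` disappears entirely after the `filter` step rather than showing a count of zero.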

A study has two between-participants conditions: Massed and Distributed. Both groups of participants completed a similarity judgment task before and after learning. The following plot depicts the number of invalid responses that were produced by participants at each time point.

sj_data %>% 
  ggplot(., aes(x = taskTime, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  labs(
    x = "Time Point",
    y = "Invalid Responses",
    color = "Learning Condition"
  )

Having time point on the X-axis makes sense, but pre-learning should come before post-learning. By default, the categories are ordered alphabetically, so “post” is drawn before “pre.” Let’s fix this.

sj_data %>% 
  mutate(taskTime = fct_relevel(as.factor(taskTime), "pre", "post")) %>%
  ggplot(., aes(x = taskTime, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  labs(
    x = "Time Point",
    y = "Invalid Responses",
    color = "Learning Condition"
  )

Here, fct_relevel reorders the levels of the taskTime factor. The level order, not the order of the rows, determines the order in which the categories appear in plots and tables.
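To see the reordering in isolation, here is a small sketch on a made-up vector (the values are hypothetical, not from the study data):

```r
library(forcats)

# Toy vector: hypothetical values, not from the study
time_point <- c("post", "pre", "post", "pre")

# as.factor() sorts the levels alphabetically, so "post" comes first
default_order <- as.factor(time_point)
levels(default_order)

# fct_relevel() moves the named levels to the front, in the order given
reordered <- fct_relevel(default_order, "pre", "post")
levels(reordered)
```

The underlying values are unchanged; only the level order differs, which is what ggplot2 uses to lay out a discrete axis.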

Next, I want the x-ticks to say “Pre-Learning” and “Post-Learning” instead of just “pre” and “post.”

sj_data %>% 
  mutate(taskTime = paste0(str_to_title(taskTime), "-Learning")) %>%
  mutate(taskTime = fct_relevel(as.factor(taskTime), "Pre-Learning", "Post-Learning")) %>%
  ggplot(., aes(x = taskTime, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  labs(
    x = "Time Point",
    y = "Invalid Responses",
    color = "Learning Condition"
  )

In the first mutate statement, there are two important functions to note. paste0 concatenates all of its arguments into one string with no separator. For instance, paste0("pre", "-learning") returns the string "pre-learning". str_to_title converts strings to Title Case. paste0 is part of base R, while str_to_title comes from the stringr package (part of the tidyverse, loaded above with library(tidyverse)). Look into stringr for more, similar functions you might find useful.
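A quick sketch of the two functions working together on a made-up vector, mirroring the mutate step above:

```r
library(stringr)

# Toy vector standing in for the taskTime column
time_point <- c("pre", "post")

# paste0() joins its arguments with no separator
joined <- paste0("pre", "-learning")

# str_to_title() capitalizes the first letter of each word
titled <- str_to_title(time_point)

# Both are vectorized, so they can be combined over a whole column
tick_labels <- paste0(str_to_title(time_point), "-Learning")
tick_labels
```

Because both functions are vectorized, the same expression works unchanged inside mutate on a full data frame column.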

Next, I want to implement a colorblind-friendly color palette. You can find pre-made palettes or make your own at https://davidmathlogic.com/colorblind/.

sj_data %>% 
  mutate(taskTime = paste0(str_to_title(taskTime), "-Learning")) %>%
  mutate(taskTime = fct_relevel(as.factor(taskTime), "Pre-Learning", "Post-Learning")) %>%
  ggplot(., aes(x = taskTime, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  labs(
    x = "Time Point",
    y = "Invalid Responses",
    color = "Learning Condition"
  ) +
  scale_colour_manual(values = c("#CC0750", "#4885F9"))
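When `values` is an unnamed vector, as above, scale_colour_manual assigns the colors to the levels of `condition` in order. If you would rather pin each color to a specific condition, a named vector makes the mapping explicit. A minimal sketch on toy data, assuming the condition labels are "Massed" and "Distributed" (the real values in the `condition` column may be spelled differently):

```r
library(ggplot2)

# Assumed condition labels; adjust the names to match the actual data
condition_colors <- c("Massed" = "#CC0750", "Distributed" = "#4885F9")

# Toy data: hypothetical values, not from the study
toy <- data.frame(
  condition = rep(c("Massed", "Distributed"), each = 5),
  invalid_responses = c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6)
)

p <- ggplot(toy, aes(x = condition, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  scale_colour_manual(values = condition_colors)  # colors matched by name, not by position

p
```

With a named vector, reordering the factor levels later will not silently swap which condition gets which color.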

Lastly, I want to make this theme look nicer.

sj_data %>% 
  mutate(taskTime = paste0(str_to_title(taskTime), "-Learning")) %>%
  mutate(taskTime = fct_relevel(as.factor(taskTime), "Pre-Learning", "Post-Learning")) %>%
  ggplot(., aes(x = taskTime, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  labs(
    x = "Time Point",
    y = "Invalid Responses",
    color = "Learning Condition"
  ) +
  scale_colour_manual(values = c("#CC0750", "#4885F9")) +
  theme(
    panel.background = element_blank(),
    axis.line = element_line(linewidth = .5),
    legend.position = c(.9, .9),
    legend.justification = c("right","top"),
    legend.key = element_blank(),
    text = element_text(family = "sans", size = 12),
    axis.text = element_text(family = "sans", size = 12, color = "black")
  )

This theme looks a lot nicer, doesn’t it? Maybe I want to use this theme for every plot I produce from now on. Instead of copying the same + theme() at the end of each plot, I can set my default theme using theme_update.

theme_update(
  panel.background = element_blank(),
  axis.line = element_line(linewidth = .5),
  legend.position = c(.9, .9),
  legend.justification = c("right","top"),
  legend.key = element_blank(),
  text = element_text(family = "sans", size = 12),
  axis.text = element_text(family = "sans", size = 12, color = "black")
)

theme_update only needs to be called once. It modifies the active theme for the rest of the R session, so every subsequent plot in your markdown picks up the new settings.
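If you ever need to go back to the previous theme, note that theme_update invisibly returns the theme that was active before the change. A minimal sketch of saving and restoring it (the variable names `old_theme`, `is_blank_now`, and `is_blank_after` are just illustrative):

```r
library(ggplot2)

# theme_update() invisibly returns the theme that was active before the update
old_theme <- theme_update(panel.background = element_blank())

# The active theme now carries the new setting
is_blank_now <- inherits(theme_get()$panel.background, "element_blank")

# Restore the previous theme when the custom one is no longer wanted
theme_set(old_theme)
is_blank_after <- inherits(theme_get()$panel.background, "element_blank")
```

This demo keeps the updated theme active, so the plots that follow use it without any explicit + theme() call.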

sj_data %>%
  mutate(taskTime = paste0(str_to_title(taskTime), "-Learning")) %>%
  mutate(taskTime = fct_relevel(as.factor(taskTime), "Pre-Learning", "Post-Learning")) %>%
  ggplot(., aes(x = taskTime, y = invalid_responses, color = condition)) +
  geom_boxplot() +
  labs(
    x = "Time Point",
    y = "Invalid Responses",
    color = "Learning Condition"
  ) +
  scale_colour_manual(values = c("#CC0750", "#4885F9"))

cl_data %>%
  ggplot(aes(x = trial, y = rt, color = condition)) +
  stat_summary(geom = "line", fun = median) +
  labs(
    x = "Trial Number",
    y = "Median Reaction Time (ms)",
    color = "Learning Condition"
  ) +
  scale_colour_manual(values = c("#CC0750", "#4885F9"))

Anthony Cruz

2022-10-28