checking_baseline

Load in data

Which baseline is best for data quality?

Compare raw baselines to processed baselines.

250 ms of delay, processed pupil with a 50% cutoff has this many bad baselines: 19 
250 ms of delay, raw pupil with a 50% cutoff has this many bad baselines: 22 

There are more NAs after preprocessing than before, implying that there are uninterpolated blinks occurring. Let’s look at the traces.

It looks like when we smooth and interpolate, indeed we are losing chunks of traces here as the raw pupil has more lines (but of course more obvious blinks). Let’s revisit the list baselines.

500 ms of list, processed pupil with a 50% cutoff has this many bad baselines: 8 
500 ms of list, raw pupil with a 50% cutoff has this many bad baselines: 4 

We also can look at the maintenance delay baseline after we only do one round of preprocessing instead of two (so this is smoothing, blink correction, interpolating and then that’s it.)

250 ms of delay, processed one-time-only pupil has this many bad baselines: 37 

It looks like there are two more bad trips if we only do one round of preprocessing. The way the Hann smoother works could account for more opportunities to interpolate the second time around, leading to less gaps after the full two rounds of preprocessing.

What if we use 500 ms of the maintenance phase?

500 ms of delay, processed pupil with a 50% cutoff has this many bad baselines: 13 

I never had the 50% cutoff before, so let’s look at the baselines if we don’t have a minimum cutoff.

500 ms of delay, processed pupil with NO cutoff has this many bad baselines: 4 
250 ms of delay, processed pupil with NO cutoff has this many bad baselines: 11 

Below is stuff to fill in the tables from here.

# 250 list, processed
baselines_list_250 <- calc_baseline_step %>%
  filter(stage == "list") %>%
  group_by(participant, condsFile, trial, adjustTrip, stage) %>%
  mutate(
    non_na_count = sum(!is.na(smoothed_pupil3[1:min(15, n())])),
    baseline = ifelse(non_na_count >= 8, median(smoothed_pupil3[1:min(15, n())], na.rm = TRUE), NA_real_)
  ) %>%
  select(-non_na_count) %>%  # Remove the helper column
  ungroup()

# get the NA count
baselines_list_250_NA <- baselines_list_250 %>%
  filter(is.na(baseline)) %>%
  distinct(participant, condsFile, trial, adjustTrip) %>%
  filter(adjustTrip == 1)

# 250 list, raw
baselines_list_250R <- calc_baseline_step %>%
  filter(stage == "list") %>%
  group_by(participant, condsFile, trial, adjustTrip, stage) %>%
  mutate(
    non_na_count = sum(!is.na(mean_pupil[1:min(15, n())])),
    baseline = ifelse(non_na_count >= 8, median(mean_pupil[1:min(15, n())], na.rm = TRUE), NA_real_)
  ) %>%
  select(-non_na_count) %>%  # Remove the helper column
  ungroup()

#get the NA count
baselines_list_250_NAR <- baselines_list_250R %>%
  filter(is.na(baseline)) %>%
  distinct(participant, condsFile, trial, adjustTrip) %>%
  filter(adjustTrip == 1)

# 500 delay, raw
baselines_m_500R <- calc_baseline_step %>%
  filter(stage == "maintenance_delay") %>%
  group_by(participant, condsFile, trial, adjustTrip, stage) %>%
  mutate(
    non_na_count = sum(!is.na(mean_pupil[1:min(30, n())])),
    baseline = ifelse(non_na_count >= 15, median(mean_pupil[1:min(30, n())], na.rm = TRUE), NA_real_)
  ) %>%
  select(-non_na_count) %>%  # Remove the helper column
  ungroup()

# get the NA count
baselines_m_500R_NA <- baselines_m_500R %>%
  filter(is.na(baseline)) %>%
  distinct(participant, condsFile, trial, adjustTrip) %>%
  filter(adjustTrip == 1)

# no cutoff, 250 list
baselines_250_all <- calc_baseline_step %>%
  filter(stage == "list") %>%
  group_by(participant, condsFile, trial, adjustTrip, stage) %>%
  mutate(baseline = median(smoothed_pupil3[1:min(15, n())], na.rm = TRUE)) %>%
  ungroup()

baselines_NA_250_all <- baselines_250_all %>%
  filter(is.na(baseline)) %>%
  distinct(participant, condsFile, trial, adjustTrip) %>%
  filter(adjustTrip == 1)

# no cutoff 500 list
baselines_500_all <- calc_baseline_step %>%
  filter(stage == "list") %>%
  group_by(participant, condsFile, trial, adjustTrip, stage) %>%
  mutate(baseline = median(smoothed_pupil3[1:min(30, n())], na.rm = TRUE)) %>%
  ungroup()

baselines_500_allNA <- baselines_500_all %>%
  filter(is.na(baseline)) %>%
  distinct(participant, condsFile, trial, adjustTrip) %>%
  filter(adjustTrip == 1)

Okay now I’m going to look at individual traces of some of these problem trips.

# Ss 1, trial 2, trip 1
# trace of maintenance delay, raw and processed
ss_1 <- calc_baseline_step %>%
  filter(participant == "pilotv2001" & 
           condsFile == "maintenance_delay" &
           trial == 2 &
           adjustTrip == 1 &
           stage == "maintenance_delay")

ss_1_3 <- calc_baseline_step %>%
  filter(participant == "pilotv2001" & 
           condsFile == "maintenance_delay" &
           trial == 3 &
           adjustTrip == 1 &
           stage == "maintenance_delay")

ss_5 <- calc_baseline_step %>%
  filter(participant == "pilotv2005" & 
           condsFile == "maintenance_delay" &
           trial == 3 &
           adjustTrip == 1 &
           stage == "maintenance_delay")

ss_29 <- calc_baseline_step %>%
  filter(participant == "pilotv2029" & 
           condsFile == "maintenance_delay" &
           trial == 2 &
           adjustTrip == 1 &
           stage == "maintenance_delay")

ss_15 <- calc_baseline_step %>%
  filter(participant == "pilotv2015" & 
           condsFile == "maintenance_delay" &
           trial == 1 &
           adjustTrip == 1 &
           stage == "maintenance_delay")

ss_19 <- calc_baseline_step %>%
  filter(participant == "pilotv2019" & 
           condsFile == "maintenance_delay" &
           trial == 1 &
           adjustTrip == 1 &
           stage == "maintenance_delay")

p1 <- ggplot(ss_1, aes(x = milliseconds, y = smoothed_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "Smoothed pupil", title = "P1, trial 2") +
  theme_minimal()

p1b <- ggplot(ss_1_3, aes(x = milliseconds, y = smoothed_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "Smoothed pupil", title = "P1, trial 3") +
  theme_minimal()

p5 <- ggplot(ss_5, aes(x = milliseconds, y = smoothed_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "Smoothed pupil", title = "P5, trial 3") +
  theme_minimal()

p29 <- ggplot(ss_29, aes(x = milliseconds, y = smoothed_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "Smoothed pupil", title = "P29, trial 2") +
  theme_minimal()

p15 <- ggplot(ss_15, aes(x = milliseconds, y = smoothed_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "Smoothed pupil", title = "P15, trial 1") +
  theme_minimal()

p19 <- ggplot(ss_19, aes(x = milliseconds, y = smoothed_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "Smoothed pupil", title = "P19, trial 1") +
  theme_minimal()

p1 + p1b
Warning: Removed 26 rows containing missing values or values outside the scale range
(`geom_line()`).

p5 + p29
Warning: Removed 23 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 18 rows containing missing values or values outside the scale range
(`geom_line()`).

p15 + p19
Warning: Removed 111 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 3 rows containing missing values or values outside the scale range
(`geom_line()`).

Do the same for the raw pupil:

p1r <- ggplot(ss_1, aes(x = milliseconds, y = mean_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "raw pupil", title = "P1, trial 2") +
  theme_minimal()

p1br <- ggplot(ss_1_3, aes(x = milliseconds, y = mean_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "raw pupil", title = "P1, trial 3") +
  theme_minimal()

p5r <- ggplot(ss_5, aes(x = milliseconds, y = mean_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "raw pupil", title = "P5, trial 3") +
  theme_minimal()

p29r <- ggplot(ss_29, aes(x = milliseconds, y = mean_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "raw pupil", title = "P29, trial 2") +
  theme_minimal()

p15r <- ggplot(ss_15, aes(x = milliseconds, y = mean_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "raw pupil", title = "P15, trial 1") +
  theme_minimal()

p19r <- ggplot(ss_19, aes(x = milliseconds, y = mean_pupil)) +
  geom_line() +
  labs(x = "Time (ms)", y = "raw pupil", title = "P19, trial 1") +
  theme_minimal()

p1r + p1br

p5r + p29r
Warning: Removed 9 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 11 rows containing missing values or values outside the scale range
(`geom_line()`).

p15r + p19r
Warning: Removed 100 rows containing missing values or values outside the scale range
(`geom_line()`).

Below is old stuff!

500 ms list phase trace VS 115 ms maintenance phase trace

First 200ms of list

Valid trials with 200ms of list (no 50% cutoff)

[1] "Below is a table of the trials that did not get quality baselines"
# A tibble: 10 × 6
   participant condsFile         trial adjustTrip stage.x baseline
   <chr>       <chr>             <dbl>      <dbl> <chr>      <dbl>
 1 pilotv2005  maintenance_delay     4          1 list          NA
 2 pilotv2008  maintenance_delay     3          1 list          NA
 3 pilotv2012  maintenance_delay     3          1 list          NA
 4 pilotv2012  maintenance_delay     4          1 list          NA
 5 pilotv2015  maintenance_delay     2          1 list          NA
 6 pilotv2015  maintenance_delay     4          1 list          NA
 7 pilotv2019  maintenance_delay     2          1 list          NA
 8 pilotv2026  maintenance_delay     1          1 list          NA
 9 pilotv2026  maintenance_delay     2          1 list          NA
10 pilotv2030  maintenance_delay     1          1 list          NA

Traces of list phase

The baseline matters….

Below reproduce the plot I made in the slides that looked like the pupil was dilating during the list phase BUT notice how I used a baseline that was just right at that end-of-list point, and if we look at the above graphs… yes the pupil begins to go up in the end, but the whole list trace itself is a mess of noise.