This homework is due by Thursday, January 26th, 8:00pm. Upload a pdf file to Canvas called 2_wrangling.pdf

In this homework, you’ll combine data-wrangling and visualization to reverse-engineer a professional data visualization. The goal of this homework is to replicate Figure 1 as closely as possible. This is known as a “Copy the Master” assignment (Nolan and Perrett 2016). If you have any questions, feel free to come to section (Monday 3:30-4:30 Econ 139 or Tuesday 4:30-5:30pm Littlefield 103), sign up for OH (links on the course website), or post on Ed Discussion.

Dataset description

You’ll be working with a dataset collected in the high throughput psychophysics experimental tradition (Rajalingham et al. 2018). One variant of this approach is to collect data from hundreds, even thousands of participants, in order to generate reliable estimates of perceptual behaviors. Typically, this is accomplished using online data collection. In this dataset there were originally 345 subjects who collectively performed 42,113 trials of a visual discrimination task.

In order to generate the main plot, we’ll analyze a subset of this human behavior. When this is done, we’ll compare human performance with the behavior of neurons and a computational model performing the same task. The neural data is from a neuroscience/psychophysics experiment conducted in non-human primates (Majaj et al. 2015). The computational model data, in our case, was used to validate its correspondence with visual area IT. If you’re interested in this model class, (Yamins and DiCarlo 2016) provides a nice intro. The basic idea is that we can use biologically plausible computational models to understand the relationship between the brain and behavior.

These data were collected to generate a stimulus set with an interesting pattern of perceptual properties: 1. Early visual regions (e.g. primate area V4) can’t do the task at all 2. Late visual regions (e.g. primate aread IT) can do the task on some items 3. Neurotypical humans can do the task on all items (outperforming IT and V4)

The beautiful plot you want to recreate.

The beautiful plot you want to recreate.

Panel A compares model predictions (x axis) with the neural predictions from V4 (y axis). Panel B shows the model predictions (x axis) compared with the neural predictions from IT (y axis), showing that the model is a good approximation for this visual region. Panel C shows the model predictions (x axis) with the behavior of neurotypical humans (y axis). Humans clearly outperform the model (and are clearly better than IT or V4). Panel D is a summary plot, with just the trends from the first three. Panel E is the relationship between reaction time (x axis) and the difference between human performance and neural area IT. This demonstrates that the degree to which human accuracy diverges from visual area IT (but not V4), the more time is required to perform these tasks. Together these results (alongside recent lesion studies) suggest that other brain regions, beyond these canonical sensory regions, are implicated in these kinds of behaviors.

What kind of behavior? Formally, they’re called concurrent visual discrimination (“oddity”) tasks that vary in difficulty. The task is to identify on each trial which one out of three images is the odd one. Here is a screenshot of an example trial. In this case, the lion is the odd one out.

Example trial. Elephant is the `typical object` while lion is the `oddity object.` In the control trials (`V0`), all objects are presented from the same viewpoint. In the experimental trials (`V3`), all objects are presented from different viewpoints. This trial here is one of the `V3` elephant trials.

Example trial. Elephant is the typical object while lion is the oddity object. In the control trials (V0), all objects are presented from the same viewpoint. In the experimental trials (V3), all objects are presented from different viewpoints. This trial here is one of the V3 elephant trials.

You’ll be working with two dataframes in this homework. First, we’ll provide you with a data frame that contains the human behavior you’ll be wrangling and summarizing. Second, a dataframe that contains the accuracy on those same items, but from several different sources: neural responses from two visual regions (V4 and IT) as well as a computational model.

Read in and inspect dataset

Great. Let’s start by loading the human behavioral data frame:

df.human_wide = read_csv("data/human_behavior_V03.csv")

Let’s take a look at the first 40 column names:

df.human_wide %>% 
  names() %>% 
  head(40)
##  [1] "index"              "0-array_type"       "0-choice"          
##  [4] "0-chosen_object"    "0-collection"       "0-database"        
##  [7] "0-iteration"        "0-oddity_category"  "0-oddity_index"    
## [10] "0-oddity_name"      "0-oddity_type"      "0-reaction_time"   
## [13] "0-trial_indices"    "0-trial_number"     "0-typical_category"
## [16] "0-typical_name"     "0-variation_level"  "0-worker_id"       
## [19] "1-array_type"       "1-choice"           "1-chosen_object"   
## [22] "1-collection"       "1-database"         "1-iteration"       
## [25] "1-oddity_category"  "1-oddity_index"     "1-oddity_name"     
## [28] "1-oddity_type"      "1-reaction_time"    "1-trial_indices"   
## [31] "1-trial_number"     "1-typical_category" "1-typical_name"    
## [34] "1-variation_level"  "1-worker_id"        "10-array_type"     
## [37] "10-choice"          "10-chosen_object"   "10-collection"     
## [40] "10-database"

Oh, wow. That’s really untidy data: each row here is a subject, and each column contains 1) a prefix indicating which trial these data correspond to, and 2) a variable name indicating what information is present. So, for example, if we’re interested in the variable variation_level, and there are \(n\) trials, this means there are \(n\) columns containing variation_level :(

Let’s take a look at the content of a single trial, for example, trial 13:

Variables of interest we’ll be looking at:

Variables that you need to create:

Step 1: Data Wrangling

  1. Convert the data from wide to long format, such that each row corresponds to a single trial, preserving all the other useful information (like worker_id, typical_name, etc.). You might call this new data frame df.human_long. This may require two transformation steps. First, you’ll separate the current column headings into new columns: the trial and the variable name indicating what information present. Second, you’ll structure the dataset so that each row corresponds to a single trial, with the variable names indicating what information was present being columns.
df.human_wide2 <- df.human_wide %>% 
  pivot_longer(cols = -index,
               names_to = c("trial","variable"),
               names_sep = "-",
               values_to = "rating",
               values_transform = list(rating = as.character)) %>% 
  pivot_wider(names_from = "variable",
              values_from = "rating")
  1. Select only those columns that you need (as indicated in the list of variables of interest mentioned above). Also remove the index column, rename the worker_id column to participant and put it as the left-most column of the data frame.
df.human_wide3 <- df.human_wide2 %>% 
  select(index,oddity_name,chosen_object,typical_name,variation_level,reaction_time,worker_id) %>%
  select(-index) %>% 
  relocate(worker_id) %>% 
  rename(participant = worker_id)
  1. Remove all the rows in the long data frame that contain missing values (NA) in the participant column (i.e. the column that used to be the worker_id column before you renamed it in step 2).

Tip: The drop_na() function will be helpful here.

df.human_wide4 <- df.human_wide3 %>% 
  drop_na(participant)
  1. Clean up the data types by turning the <chr> columns that contain only numbers into numeric columns.
df.human_wide5 <- df.human_wide4 %>% 
  mutate(across(.cols = c(participant,reaction_time),
                .fns = ~ as.numeric(.)))
  1. Generate a correct column by checking whether the object a participant chose (chosen_object) matches the oddity (oddity_name). That is, you want to check whether they correctly identified the odd one out.
# 1 means the participant got the trial correct, 0 means they got it wrong.
df.human_wide6 <- df.human_wide5 %>% 
  mutate(correct = ifelse(chosen_object == oddity_name,1,0))
  1. Print the top 10 rows of your data frame that you’ve created in 5).
df.human_wide6 %>% 
  head(10)

Step 2: Analysis of human behavioral data

  1. Identify subjects who were more than 90% accurate on the control trials. Then, keep only those participants (who were more than 90% accurate on the control trials) in the data frame that you’ve created in 5).

Tip: Remember that the control trials are marked as “V0” in variation_level.

#Find the total number of trials for each participant, remove participants' V0 trials if they scored less than 90% overall on it, then find the new number of total trials for each participant and remove participants' V3 trials if their V0 trials were removed 
df.human_wide7 <- df.human_wide6 %>%
  group_by(participant) %>% 
  mutate(pre_n = n()) %>% 
  group_by(participant,variation_level) %>%
  mutate(n = n(),
         total_number_correct = sum(correct),
         percent_correct = (total_number_correct/n)*100) %>%
  filter((variation_level == "V0" & percent_correct > 90) | variation_level == "V3") %>%
  group_by(participant) %>% 
  mutate(post_n=n()) %>% 
  filter(pre_n == post_n) 
  1. Create a data frame that contains the average accuracy and reaction time for each item (as identified by typical_name). The estimates should be rounded to two decimal places.
df.human_wide8 <- df.human_wide7 %>%
  group_by(typical_name) %>% 
  summarize(n=n(),
            total_trials_accurate=sum(correct),
            human_accuracy = round(total_trials_accurate/n, digits=2),
            human_reaction_time = round(mean(reaction_time, na.rm=T), digits=2)) %>% 
  select(typical_name, human_accuracy, human_reaction_time)
  1. Print the top 10 rows of the data frame that you’ve created in 2).
df.human_wide8 %>% 
  head(10)

Step 3: Compare human and neural behavior

If all went well, you should have created the same data frame that is also saved in data/human_accuracy _and_reaction_time_per_item.csv. If you weren’t quite able to create this data frame, you can load it in the code chunk below and continue to work with it.

df.human_long = read_csv("data/human_accuracy_and_reaction_time_per_item.csv")

Now, let’s load the data frame that contains the neural and computational performance on those same typical_name items.

df.neural_and_model = read_csv("data/neural_and_computational_predictions_per_item.csv")
  1. Combine the human with the neural and computational data by joining the df.human_long data frame with the df.neural_and_model data frame.

Tip: Use the typical_name column to join the data frames.

df.human_wide9 <- df.human_wide8 %>% 
  full_join(df.neural_and_model,
            by = "typical_name")
  1. Create a new column in your data frame that contains the difference between human performance and (neural) IT performance on each item.
df.human_wide10 <- df.human_wide9 %>% 
  mutate(difference = human_accuracy-it)
  1. Print out the top 10 rows of the data frame that you’ve created in 2).
df.human_wide10 %>% 
  head(10)
  1. Plot the correspondence between neural and human behaviors (i.e. recreate the figure panel shown at the very top of the document). Try to tweak the aspects of your figure to make it match the one shown above (but don’t sweat it if you can’t get everything to perfectly match).

Tip: For panels A to D the model accuracy (in the column model) is shown on the x-axis, and what is shown on the y-axis differs between the panels (A: v4, B: it, C: human_accuracy). For D, instead of showing a scatter plot, use geom_smooth(method = "lm") to show the regression lines. For E, use a scatter plot to show the relationship between the difference in accuracy between human and it performance on the y-axis (which you’ve already calculated above), and reaction time on the x-axis.

To combine the different plots into one figure panel, take a look at the patchwork package. You can find an example for how to use that package in the notes for the Visualization 2 lecture (visualization2.Rmd).

Plot1 <- ggplot(df.human_wide10, aes(x=model, y=v4)) +
  geom_point(pch=21, size=2.5, stroke=0.7, color="black", fill="grey") +
  coord_cartesian(ylim=c(0.2,1),
                  xlim=c(0.2,1)) +
  ylab("V4 Performance") +
  xlab("Model Performance") +
  labs(title = "Model outperforms neural area V4") +
  geom_abline(lty="dashed",color="gray45") +
  theme_classic()

Plot2 <- ggplot(df.human_wide10, aes(x=model, y=it)) +
  geom_point(pch=21, size=2.5, stroke=0.7, color="black", fill="darkorange") +
  coord_cartesian(ylim=c(0.2,1),
                  xlim=c(0.2,1)) +
  ylab("IT Performance") +
  xlab("Model Performance") +
  labs(title = "Model Predicts neural area IT") +
  geom_abline(lty="dashed",color="gray45") +
  theme_classic()

Plot3 <- ggplot(df.human_wide10, aes(x=model, y=human_accuracy)) +
  geom_point(pch=21, size=2.5, stroke=0.7, color="white", fill="black") +
  coord_cartesian(ylim=c(0.2,1),
                  xlim=c(0.2,1)) +
  ylab("Human Performance") +
  xlab("Model Performance") +
  labs(title = "Humans outperform model") +
  geom_abline(lty="dashed",color="gray45") +
  theme_classic()

df.human_long11 <- df.human_wide10 %>% 
  pivot_longer(cols = c("human_accuracy","v4","it"),
               names_to = "plot_type",
               values_to = "performance")

Plot4 <- ggplot(df.human_long11, aes(x=model, y=performance, color=plot_type)) +
  geom_point(alpha=0, show.legend=F) +
  geom_smooth(method="lm", se=0, show.legend=F, lwd=1.5) +
  coord_cartesian(ylim=c(0.2,1),
                  xlim=c(0.2,1)) +
  ylab("Performance") +
  xlab("Model Performance") +
  labs(title = "Stimuli exhibit ideal properties") +
  geom_abline(lty="dashed",color="gray45") +
  scale_color_manual(values=c("black","darkorange","grey")) +
  theme_classic()

Plot5 <- ggplot(df.human_wide10, aes(x=human_reaction_time, y=difference)) +
  geom_point(pch=21, size=2.5, stroke=0.7, color="black", fill="darkorange") +
  coord_cartesian(ylim=c(0,0.6),
                  xlim=c(2000,4000)) +
  ylab("Human - Neural") +
  xlab("Reaction time") +
  labs(title = "Difference between humans and IT \nis related to reaction time") +
  scale_x_continuous(expand = expansion(mult=c(0.15,0))) +
  scale_y_continuous(expand = expansion(mult=c(0.12,0))) +
  theme_classic()

patchwork <- Plot1 + Plot2 + Plot3 + Plot4 + Plot5
patchwork <- (Plot1 | Plot2 | Plot3 | Plot4 | Plot5)
patchwork + plot_annotation(tag_levels = "A") &
  theme(plot.tag = element_text(face="bold"))
## `geom_smooth()` using formula 'y ~ x'

Grading rubric

References

Majaj, Najib J, Ha Hong, Ethan A Solomon, and James J DiCarlo. 2015. “Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance.” Journal of Neuroscience 35 (39): 13402–18.
Nolan, Deborah, and Jamis Perrett. 2016. “Teaching and Learning Data Visualization: Ideas and Assignments.” The American Statistician 70 (3): 260–69. https://doi.org/10.1080/00031305.2015.1123651.
Rajalingham, Rishi, Elias B Issa, Pouya Bashivan, Kohitij Kar, Kailyn Schmidt, and James J DiCarlo. 2018. “Large-Scale, High-Resolution Comparison of the Core Visual Object Recognition Behavior of Humans, Monkeys, and State-of-the-Art Deep Artificial Neural Networks.” Journal of Neuroscience 38 (33): 7255–69.
Yamins, Daniel LK, and James J DiCarlo. 2016. “Using Goal-Driven Deep Learning Models to Understand Sensory Cortex.” Nature Neuroscience 19 (3): 356.