Touch data analysis: deep dive on subject 15

TLDR: the important findings so far

What do the S15 data show us so far:

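Time effect: The time-adjusted touch rate drops from each condition to the next, roughly linearly. With only three data points the fit is suggestive at best (R-squared 0.90, p = 0.21).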
Spatial effect: The participant is more likely to touch countries closer to their right hand and less likely to touch countries farther from their right hand. This is especially clear in conditions 1 and 3. The pattern in condition 2 is less clear.
Spatial effect (outlier): Poland seems to be an outlier for this spatial effect, receiving more touches than expected based on its location. This is especially clear in condition 1. South Korea is an outlier in condition 3, also receiving more touches than expected based on its location.
Interaction types: When interaction behaviors are considered by question and scaled by time, there appears to be a pattern by condition. Questions in condition 1 have a high touch rate and a low-mid gesture rate. Questions in condition 2 have a low-mid touch rate and a high gesture rate. Questions in condition 3 have a low-mid touch rate and a low-mid gesture rate.
Touch types: Condition 2 has a notably higher share of point touches than the other conditions (73 of 95 touches, about 77%, versus 95 of 175, about 54%, in condition 1 and 53 of 109, about 49%, in condition 3). This holds only once counts are adjusted for the total touches in the condition; in absolute counts, condition 1 has the most point touches.
Touch types: Condition 1 has the most diameter, radius, and perimeter touches, both in absolute counts (40, versus 7 in condition 2 and 9 in condition 3) and as a share of all touches in the condition (about 23%, versus 7% and 8%). The overall numbers are small, so treat this with caution.

What is interesting, what could be followed up on:

dig into the small scale
dig into individual questions more
dig into journey between questions (order of touches, like those eye tracking ones)
dig into hand shapes?
dig into commonalities in questions where there are no touches? or many touches?
dig into touch time…when are people just leaving their hand on the model so fewer touches are counted
maybe something about fixing up the rings touched data…

0. Prior data: What non-video data do we have on S15 from the previous paper?

Here are the B+W processed images for S15 by condition (1 left, 2 center, 3 right):

[Figure: B+W processed touch images for S15, conditions 1, 2, and 3]

	Condition 1	Condition 2	Condition 3
Touch ratio, whole model	0.1715	0.0790	0.0986
Touch ratio, country circles only	0.2788	0.1432	0.1757

The first row is the touch ratio for the condition. The second row is the touch ratio for the country part of the model for the condition (i.e. considering only the circles, not the “non data” part of the model).

Also consider the composite touch image for S15 across all three conditions. The lightest grey areas are touched in only one condition, the darkest grey areas are touched in all three conditions.

[Figure: composite touch image across all three conditions]

Here is the overlay of conditions image for S15 and a Venn diagram of S15’s time touching, talking, and touching + talking:

[Figures: overlay of conditions; Venn diagram of time touching, talking, and touching + talking]

1. Basics: What went on during each condition?

First, look at the amount of time spent by S15 in each condition. We see that condition 1 and condition 2 took about the same amount of time. Condition 3 took about twice as long as conditions 1 and 2.

Then, consider the number of touches and gestures by S15 in each condition. Every condition has many more touches than gestures. Condition 1 has the most touches (175) and the fewest gestures (22). Condition 2 has the fewest touches (95) and the most gestures (54), though only by a small margin over condition 3 (109 touches, 50 gestures).

This is notable since condition 3 is by far the longest condition, so it has the fewest touches and gestures when time adjusted. This is very visible in the time-adjusted graph below: the number of touches drops close to linearly from one condition to the next (R-squared 0.90 in the fit below). With only three points, though, the slope is not statistically significant (p = 0.21).

In the past paper, there was some evidence that people’s “extra” time in condition 3 mostly went towards talking activities, so maybe this aligns with that. But it also seems possible that there is some sequential (order) effect here.

Another thing to consider is the amount of time spent touching in each condition. It could be that one of the conditions tends to have longer touches (and therefore also fewer touches). For example, there was definitely some touching-while-thinking time in condition 3. Check the data for this.

##   Var1    Var2 Freq
## 1    1 gesture   22
## 4    1   touch  175
## 2    2 gesture   54
## 5    2   touch   95
## 3    3 gesture   50
## 6    3   touch  109
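
As a quick cross-check of the counts above, here is a small Python sketch (not the R code that produced the table) that computes gestures per touch for each condition. The counts are copied from the table; the variable names are illustrative.

```python
# Touch and gesture counts per condition, copied from the table above.
counts = {
    1: {"gesture": 22, "touch": 175},
    2: {"gesture": 54, "touch": 95},
    3: {"gesture": 50, "touch": 109},
}

# Gestures per touch: values below 1 mean touches outnumber gestures.
gesture_per_touch = {
    cond: c["gesture"] / c["touch"] for cond, c in counts.items()
}

for cond, ratio in gesture_per_touch.items():
    print(f"condition {cond}: {ratio:.2f} gestures per touch")
```

The ratio stays below 1 in every condition, and it is lowest in condition 1 and highest in condition 2. This ratio is not time adjusted.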


## 
## Call:
## lm(formula = value2 ~ conditionMidpointSec, data = subset(counts2, 
##     counts2$Var2 == "touch"))
## 
## Residuals:
##        2        4        6 
##  0.02603 -0.04439  0.01836 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)
## (Intercept)           3.743e-01  6.128e-02   6.108    0.103
## conditionMidpointSec -1.866e-04  6.362e-05  -2.933    0.209
## 
## Residual standard error: 0.05463 on 1 degrees of freedom
## Multiple R-squared:  0.8959, Adjusted R-squared:  0.7917 
## F-statistic: 8.604 on 1 and 1 DF,  p-value: 0.2092
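
To make it clearer why the fit is not significant despite the high R-squared, the summary statistics can be re-derived from the slope and its standard error alone. The Python sketch below uses only numbers printed in the output above. It relies on two standard facts: with 1 residual degree of freedom the t distribution is the standard Cauchy distribution, and for a single predictor F equals t squared. The variable names are illustrative.

```python
import math

# Values copied from the lm() summary above.
slope = -1.866e-04
slope_se = 6.362e-05
n_points = 3              # one time-adjusted touch rate per condition
df_resid = n_points - 2   # 1 residual degree of freedom

t_value = slope / slope_se

# With 1 degree of freedom the t distribution is the standard Cauchy,
# so the two-sided p-value has a closed form.
p_value = 1 - (2 / math.pi) * math.atan(abs(t_value))

# For a single predictor, F = t^2 and R^2 = F / (F + df_resid).
f_stat = t_value ** 2
r_squared = f_stat / (f_stat + df_resid)
adj_r_squared = 1 - (1 - r_squared) * (n_points - 1) / df_resid

print(f"t = {t_value:.3f}, p = {p_value:.4f}")
print(f"F = {f_stat:.3f}, R^2 = {r_squared:.4f}, adjusted R^2 = {adj_r_squared:.4f}")
```

The results agree with the reported values up to rounding. The heavy tails of the 1-degree-of-freedom t distribution are what keep p near 0.21: three points leave almost no information about the residual variance.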

Consider time and order of events

How did the touches and gestures play out over time? Were there times with tons of touches and times with nothing, or was everything evenly distributed? Did S15 go back and forth between touching and gesturing or was it mostly one or the other?

We can see in the graphs below that there were sizable chunks of time in condition 3 with no touching happening. There are also sizable blocks with no gestures happening. When touches are happening, they seem to occur at a similar density as in the other conditions; there is just far more “empty space” in condition 3. What is happening during that time? Perhaps S15 or the researcher is talking? Look into the data on this.
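
One way to quantify the “empty space” is to list the gaps between consecutive touches that exceed some threshold. The Python sketch below assumes each touch is available as a (start, end) pair in seconds. The function name `idle_gaps`, the 10-second threshold, and the example intervals are all hypothetical and not taken from the S15 coding.

```python
def idle_gaps(intervals, min_gap_sec=10.0):
    """Return (start, end) gaps of at least min_gap_sec between touches.

    intervals: list of (start_sec, end_sec) tuples, one per touch.
    Overlapping touches are handled by tracking the latest end time seen.
    """
    gaps = []
    latest_end = None
    for start, end in sorted(intervals):
        if latest_end is not None and start - latest_end >= min_gap_sec:
            gaps.append((latest_end, start))
        latest_end = end if latest_end is None else max(latest_end, end)
    return gaps


# Hypothetical touch intervals in seconds, for illustration only.
example_touches = [(0.0, 2.0), (3.0, 4.5), (30.0, 31.0), (31.5, 40.0), (75.0, 76.0)]
example_gaps = idle_gaps(example_touches, min_gap_sec=10.0)
print(example_gaps)
```

The resulting gap list could then be compared against the talking intervals to see how much of the idle time overlaps with speech.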

It also appears that perhaps there are different patterns going on for each question in terms of when the touches and gestures occur and how/if they overlap. Maybe it is interesting to take one of these timelines and annotate it with images from the video to visually characterize the touches and gestures.

2. Breaking things down by country

Consider the spatial distribution of touches and gestures by S15 for each condition. Is there a preference for a certain region of the model? Countries in the graph are ordered by model row, moving from bottom to top and from right to left along each row.

At first glance, the gesture story is a bit unclear–the US and Poland are standouts in gaining more attention, but otherwise the gestures seem fairly evenly spread. There are quite a few off-model and multiple country gestures that might complicate the picture–or show how gestures are used in a more general way.

The touch story does seem to show a preference for countries close at hand, especially in condition 1. It’s almost a perfect linear decline for condition 1 as you move across and up the rows and further away from the bottom right. Poland is again an outlier. Condition 3 shows a similar pattern with South Korea (center of the model) as an additional outlier. The pattern in condition 2 is less clear.

Consider relationships between touches and gestures for each condition and country. Do these two ways of “paying attention” correlate? I.e. if someone is touching the US a lot in one condition, are they also gesturing about it a lot?

Hard to say what’s going on here. The small number of individual-country gestures in each condition makes this kind of muddy. Maybe there’s something interesting going on with the US and the UK, but overall it seems like there’s not enough data to say much.


3. Breaking things down by question

Did certain questions within each condition lead to more or fewer touches? What countries were touched during each question?

Condition 3 had the most questions. Condition 1 and condition 3 both had a couple fairly long questions (high question time). There were a few shorter questions in each condition.

Based on the results below, the number of touches per question seems to be largely time driven. That is, the basic shape of the question time bar graph is quite similar to the basic shape of the number of touches graph. So, it could be that the touch rate stayed fairly consistent across the questions, and it was the length of time spent on the question that changed, resulting in more or fewer touches. Evaluate this possibility in the next section.


3b. Breaking things down by question AND considering time

Comparing the two bar graphs above, it appears that question time was a large driver of the number of touches (and gestures). That is, it may be that S15 was touching/gesturing at a consistent rate throughout the experiment, but that some questions just have more touches/gestures because they took longer. Here, we adjust the touch and gesture counts by question time (in seconds–double check this) to see if there are differences in the touch rates and the gesture rates by question.
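
The adjustment described above amounts to dividing each count by the question's duration. Here is a minimal Python sketch, assuming durations are recorded in seconds (the unit flagged above as needing a double check). The function `rate_per_minute` and the example records are hypothetical and not taken from the S15 data.

```python
def rate_per_minute(count, duration_sec):
    """Events per minute; None when the duration is missing or not positive."""
    if duration_sec is None or duration_sec <= 0:
        return None
    return count / duration_sec * 60.0


# Hypothetical per-question records, for illustration only.
example_questions = [
    {"question": "qA", "touches": 30, "gestures": 2, "duration_sec": 60.0},
    {"question": "qB", "touches": 30, "gestures": 6, "duration_sec": 180.0},
    {"question": "qC", "touches": 0, "gestures": 1, "duration_sec": 45.0},
]

for q in example_questions:
    q["touch_rate"] = rate_per_minute(q["touches"], q["duration_sec"])
    q["gesture_rate"] = rate_per_minute(q["gestures"], q["duration_sec"])
    print(q["question"], q["touch_rate"], q["gesture_rate"])
```

In this made-up example, qA and qB have the same raw touch count but very different touch rates, which is the distinction the time adjustment is meant to expose. If the durations turn out to be in another unit, only the `* 60.0` factor needs to change.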

Based on the bar graph, the time-adjusted data show the same overall pattern: the highest touch rates for condition 1 questions and the lowest touch rates for condition 3 questions. The highest touch rate came in question 1.4, “Do any of the features suggest that there’s some kind of system behind it?”. There were a number of questions with zero touches, particularly when the model was re-introduced to the subject for each task. Questions 3.2 (“do you know what excess mortality is?”) and 3.9 (“what’s going on in the different countries? What’s surprising? What matches what you know about that period?”) also had a touch rate of zero.

Based on the scatterplot, we do see some grouping of the questions by condition. On the whole:

Condition 1 questions are in the top left (high touch rate, low to mid gesture rate)
Condition 2 questions are in the low-middle right (low-mid touch rate, high gesture rate)
Condition 3 questions are in the low-middle left (low-mid touch rate, low-mid gesture rate)


4. Breaking things down by types of touches

Were certain ways of touching the model more frequent in particular conditions? These data are a first pass (and may need some significant cleaning up) but might provide a starting point for digging in further. We do not consider the gestures here.

At first glance, there’s nothing especially striking about the data (see graphs below). Condition 2 does seem to have a preference for point touches over dragging touches (73 versus 22), while conditions 1 and 3 are more evenly split. It is unclear whether this makes sense in reality. Consider looking at this data by question next. Note that for this analysis “point touch” includes other similar one-location touches (e.g. touches coded as “tapping”).

Considering the touch vectors (radius, diameter, perimeter), the sample size is fairly low overall. Condition 1 does have quite a lot of perimeter touches in both the absolute and relative sense (17 of its 175 touches, versus 1 of 95 in condition 2 and 3 of 109 in condition 3). Other than that…TBD

[NOTE: right now the point touches are labeled as long but they might actually not be–revisit this]
[NOTE: my way of capturing which ring on the model was touched is non-ideal. Reconsider how to categorize the five rings.]

##   Var1           Var2 Freq
## 1    1 dragging touch   80
## 4    1          OTHER    0
## 7    1    point touch   95
## 2    2 dragging touch   22
## 5    2          OTHER    0
## 8    2    point touch   73
## 3    3 dragging touch   37
## 6    3          OTHER   19
## 9    3    point touch   53
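
Because the conditions differ in total touches, the split is easier to compare as a share of each condition's total. The Python sketch below (not the original R) does this with the counts copied from the table above; the variable names are illustrative.

```python
# Touch-type counts per condition, copied from the table above.
touch_types = {
    1: {"dragging touch": 80, "OTHER": 0, "point touch": 95},
    2: {"dragging touch": 22, "OTHER": 0, "point touch": 73},
    3: {"dragging touch": 37, "OTHER": 19, "point touch": 53},
}

# Share of point touches among all touches in each condition.
point_share = {
    cond: types["point touch"] / sum(types.values())
    for cond, types in touch_types.items()
}

for cond, share in point_share.items():
    print(f"condition {cond}: {share:.1%} point touches")
```

Condition 2 has the largest share of point touches even though condition 1 has the largest absolute count.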


##    Var1      Var2 Freq
## 1     1  diameter   14
## 4     1     OTHER  135
## 7     1 perimeter   17
## 10    1    radius    9
## 2     2  diameter    4
## 5     2     OTHER   88
## 8     2 perimeter    1
## 11    2    radius    2
## 3     3  diameter    1
## 6     3     OTHER  100
## 9     3 perimeter    3
## 12    3    radius    5
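
The same kind of adjustment can be applied to the vector touches. This Python sketch (again not the original R) treats every touch not coded OTHER as a vector touch, which is an assumption about how the OTHER category was built. The counts are copied from the table above.

```python
# Touch-vector counts per condition, copied from the table above.
vector_counts = {
    1: {"diameter": 14, "OTHER": 135, "perimeter": 17, "radius": 9},
    2: {"diameter": 4, "OTHER": 88, "perimeter": 1, "radius": 2},
    3: {"diameter": 1, "OTHER": 100, "perimeter": 3, "radius": 5},
}

# Assumption: vector touches are all touches not coded as OTHER.
vector_total = {
    cond: sum(n for kind, n in kinds.items() if kind != "OTHER")
    for cond, kinds in vector_counts.items()
}
vector_share = {
    cond: vector_total[cond] / sum(kinds.values())
    for cond, kinds in vector_counts.items()
}

for cond in vector_counts:
    print(f"condition {cond}: {vector_total[cond]} vector touches, "
          f"{vector_share[cond]:.1%} of all touches")
```

With single-digit counts in conditions 2 and 3, these shares can move a lot if even a few touches are recoded.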
