Sol-Gating Analysis

This script analyzes data from the sol-gating experiment. The goal of this analysis is to find, for each sign in the SOL stimulus set, the precise gate at which participants can reliably identify the sign.

Descriptives

df %>%
    group_by(id) %>%
    mutate(num_trials = max(trial)) %>%
    select(id, gender, age, asl_fluency, age_learned_asl, num_trials) %>%
    distinct()

## Source: local data frame [9 x 6]
## Groups: id
## 
##   id gender age asl_fluency age_learned_asl num_trials
## 1 20 Female  47           2              13        228
## 2 21 Female  37           2       18 months        228
## 3 22 Female  20           2              18        228
## 4 40 Female  25           3           birth        228
## 5 41 Female  37           2              19        228
## 6 46 Female  27           3               5        228
## 7 49   Male  50           2              13        228
## 8 12   Male  29           2               8        228
## 9 NA     NA  NA          NA              NA         NA

Histogram of the main outcome variable: correct on the 2-AFC measure

qplot(x=correct, data=df)

Histogram of RT, just to make sure nothing weird is going on

qplot(x=rt, data=df)

## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

Flag and filter bad RTs

df <- df %>%
    filter(id != "NA") %>% 
    mutate(include_good_rt = ifelse(log(rt) > mean(log(rt)) + 2 * sd(log(rt)) |
                                        log(rt) < mean(log(rt)) - 2 * sd(log(rt)),
                                    "exclude", "include"))

df %>% group_by(include_good_rt) %>% summarise(n())

## Source: local data frame [2 x 2]
## 
##   include_good_rt  n()
## 1         exclude   44
## 2         include 1780

df <- filter(df, include_good_rt == "include")
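The exclusion rule above flags any trial whose log RT lies more than 2 standard deviations from the mean log RT. A minimal base-R sketch of the same rule on made-up reaction times (the values are illustrative, not study data):

```r
# Toy reaction times in ms (made-up values, not the study data)
rt <- c(520, 610, 580, 640, 700, 590, 560, 630, 605, 575, 6000)

# Work on the log scale, as in the pipeline above
log_rt <- log(rt)
lower <- mean(log_rt) - 2 * sd(log_rt)
upper <- mean(log_rt) + 2 * sd(log_rt)

# Flag trials whose log RT falls outside mean +/- 2 SD
include_good_rt <- ifelse(log_rt < lower | log_rt > upper, "exclude", "include")

table(include_good_rt)
```

Note that the mean and SD here are computed over all trials pooled together, as in the script, rather than per participant.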

Main analysis

Plot accuracy for each gate within each sign.

ms <- df %>%
    na.omit() %>% 
    group_by(gate_name, gate_num, gate_signer, gate) %>%
    summarise(mean_correct = mean(correct),
              ci_h = ci.high(correct),
              ci_l = ci.low(correct))
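The helpers ci.high() and ci.low() are not defined in this script, so they presumably come from a separate helper file. The sketch below is a hypothetical stand-in, assuming each returns the distance from the sample mean to one end of a 95% percentile-bootstrap interval; that reading matches how the plotting code adds and subtracts them from mean_correct. The helper boot_means() and the n_boot argument are illustrative names.

```r
# Hypothetical stand-ins for the ci.low()/ci.high() helpers used above.
# Assumption: each returns a distance from the sample mean to one end of a
# 95% percentile-bootstrap interval of the mean.
boot_means <- function(x, n_boot = 1000) {
    # Resample x with replacement and record the mean of each resample
    replicate(n_boot, mean(sample(x, length(x), replace = TRUE)))
}

ci.low <- function(x, n_boot = 1000) {
    unname(mean(x) - quantile(boot_means(x, n_boot), 0.025))
}

ci.high <- function(x, n_boot = 1000) {
    unname(quantile(boot_means(x, n_boot), 0.975) - mean(x))
}

set.seed(1)
correct <- c(1, 1, 0, 1, 0, 1, 1, 1, 0, 1)  # toy 2-AFC responses
c(mean = mean(correct), ci_l = ci.low(correct), ci_h = ci.high(correct))
```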

Now plot

qplot(x=gate_num, y=mean_correct, color = gate_signer, data=ms) +
    facet_wrap( ~ gate_name,ncol=10) +
    geom_line() +
    geom_pointrange(aes(ymin=mean_correct - ci_l, 
                        ymax=mean_correct + ci_h), 
                    size=0.6) +
    theme_bw()

Compute empirical F0

Decisions about the gate were made by VM/KM. The decision criterion was the earliest gate that achieved accuracy above chance.

Note: some signs did not have a clear gate based on the data. In these cases we stick with the experimenter-defined F0.
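The gate decisions in this analysis were made by hand, and the script does not state exactly how "above chance" was judged. As an illustration only, one possible reading is that a gate counts as above chance when the lower end of its confidence interval exceeds 0.5, the chance level on a 2-AFC task. The sketch below applies that assumed rule to made-up summary values for a single sign:

```r
# Toy summary for one sign (made-up numbers, same column names as `ms`)
toy <- data.frame(
    gate_num     = 1:6,
    mean_correct = c(0.45, 0.55, 0.60, 0.85, 0.95, 1.00),
    ci_l         = c(0.20, 0.20, 0.15, 0.15, 0.05, 0.00)
)

chance <- 0.5  # chance level on a two-alternative forced-choice task

# Assumed rule: a gate is "above chance" when the lower CI bound exceeds chance
above <- (toy$mean_correct - toy$ci_l) > chance

# Earliest qualifying gate, or NA when no gate qualifies
empirical_gate <- if (any(above)) min(toy$gate_num[above]) else NA
empirical_gate
```

A sign for which no gate qualifies returns NA, which corresponds to the cases where the experimenter-defined F0 is kept.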

# grab gate names
df_empricial_f0 <- ms %>% 
    ungroup() %>% 
    select(gate_name) %>% 
    unique()

# create vector with gate decisions 
gate_decision <- c(4, 5, 3, 4, 1, 4, 5, 5, 4, 1, 4, 5, 2, 6, 6, 
                    4, 4, 5, 3, 6, 5, 2, 3, 6, 5, 3, 5, 4, 6, 6, 6, 
                    5, 4, 3, 3, 3, 6, 4)
# check length: should be 38
length(gate_decision)

## [1] 38

# bind to gate names 
df_empricial_f0 %<>% cbind(gate_decision)

Take the gate decisions and add them to the larger summarized data frame, ms. Then keep only the rows where gate_decision matches gate_num.

ms_gate_decisions <- ms %>% 
    left_join(y = df_empricial_f0, by = "gate_name") %>%
    ungroup() %>% 
    select(gate_name, gate_num, gate, gate_decision) %>% 
    filter(gate_decision == gate_num)

Extract the frame information from the gate variable and convert it to ms.

regexp <- "[[:digit:]]"

str_locate(ms_gate_decisions$gate, regexp)[1]

## [1] 13

ms_gate_decisions %<>% 
    select(gate, gate_decision) %>% 
    mutate(
        f0_sec = as.numeric(str_extract(gate, regexp)),
        f0_frame = as.numeric(str_sub(gate, 
                           start = str_length(gate) - 3, 
                           end = str_length(gate) - 2)),
        f0_tot_frames = (f0_sec * 30) + f0_frame,
        f0_ms_1 = f0_tot_frames * 33,
        f0_ms_2 = (f0_sec * 1000) + (f0_frame * 33)
    )
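The two millisecond columns above differ only in how whole seconds are converted: f0_ms_1 treats a second as 30 frames of 33 ms (990 ms), while f0_ms_2 treats it as 1000 ms. A small worked example with a hypothetical timestamp (the values 1 s and 15 frames are illustrative) shows the size of the gap:

```r
# Hypothetical timestamp: 1 second and 15 frames into a 30 fps video
f0_sec <- 1
f0_frame <- 15

f0_tot_frames <- (f0_sec * 30) + f0_frame      # 45 frames
f0_ms_1 <- f0_tot_frames * 33                  # 1485 ms
f0_ms_2 <- (f0_sec * 1000) + (f0_frame * 33)   # 1495 ms

# The two versions differ by 10 ms per whole second (30 * 33 = 990, not 1000)
f0_ms_2 - f0_ms_1
```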

Save the data frame so we can add the experimenter-chosen F0

# write.csv(x = ms_gate_decisions, "sol-empirical-gate-decisions.csv", row.names = F)

Read in data frame with experimenter F0 added and compute the difference for each sign and the average difference overall.

df_final <- read.csv("sol-empirical-gate-decisions.csv")

Summarise how far off we were. First, we compare the empirical F0 with the experimenter-defined F0 when F0 is computed as:

\(\text{total frames} \times 33\)

min_diff_ms  max_diff_ms  min_diff_frames  max_diff_frames  avg_diff_ms  avg_diff_frames
          7          294                0                9       127.29             3.89

Next, we compare the same difference when F0 is computed as:

\((\text{seconds} \times 1000) + (\text{frames} \times 33)\)

min_diff_ms  max_diff_ms  min_diff_frames  max_diff_frames  avg_diff_ms  avg_diff_frames
          0          274                0                8       107.18             3.18
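The code that produced the summary tables above is not shown in this report. A minimal sketch of how such a summary could be computed, assuming absolute differences and a conversion of 33 ms per frame; the column names f0_empirical_ms and f0_experimenter_ms and all values are hypothetical:

```r
# Toy F0 values in ms (made-up; column names are hypothetical)
toy <- data.frame(
    f0_empirical_ms    = c(1000, 1495, 660),
    f0_experimenter_ms = c(1000, 1330, 891)
)

ms_per_frame <- 33  # same per-frame constant used in the conversion step

# Absolute difference per sign, in ms and in frames
diff_ms <- abs(toy$f0_empirical_ms - toy$f0_experimenter_ms)
diff_frames <- diff_ms / ms_per_frame

summary_tbl <- data.frame(
    min_diff_ms     = min(diff_ms),
    max_diff_ms     = max(diff_ms),
    min_diff_frames = min(diff_frames),
    max_diff_frames = max(diff_frames),
    avg_diff_ms     = round(mean(diff_ms), 2),
    avg_diff_frames = round(mean(diff_frames), 2)
)
summary_tbl
```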

ggplot2::qplot(f0_diff_empirical_experiment_2_frames, data = df_final,
               binwidth = 0.5) + 
    ylim(0, 8) +
    scale_x_continuous(limits = c(0, 8), breaks=0:8) +
    theme_bw()

Kyle MacDonald

April 27, 2015
