Psychometric function (PF) and Signal Detection Theory (SDT)

The psychometric function (PF) relates human performance on a given psychophysical task (e.g., classification) to some sensory input (e.g., stimulus intensity). Recall the general form of a psychometric function:

\[\Psi(x,\alpha, \beta, \gamma, \lambda) = \gamma + (1-\gamma -\lambda) F(x, \alpha, \beta)\]

where \(\alpha\) is the threshold, \(\beta\) the sensitivity, \(\gamma\) the chance level, and \(\lambda\) the lapse rate. The parameters can be estimated, for example, by using logistic regression.
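As a minimal sketch, the general function can be written in R as follows. The logistic form chosen for \(F\) and the function name psi_fun are assumptions made for illustration; the definition above does not fix a particular \(F\).

```r
# Psychometric function with chance level gamma and lapse rate lambda.
# Assumption: F is a logistic function with threshold alpha and slope beta.
psi_fun <- function(x, alpha, beta, gamma = 0, lambda = 0) {
  F_x <- plogis(beta * (x - alpha))      # F(x, alpha, beta)
  gamma + (1 - gamma - lambda) * F_x
}

# Example: chance level 0.5 and a 2% lapse rate (illustrative values)
x <- seq(0, 10, by = 2.5)
p <- psi_fun(x, alpha = 5, beta = 1, gamma = 0.5, lambda = 0.02)
round(p, 3)
```

With this choice of \(F\), the function evaluates to \(\gamma + (1-\gamma-\lambda)/2\) at \(x = \alpha\), and it never exceeds \(1-\lambda\).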

High-threshold theory

Let’s imagine a simple two-interval forced-choice (2IFC) task in which observers have to identify which interval contains a target. This kind of 2IFC task can be treated as signal detection. The stimulus interval is often denoted S (signal), and the blank interval is denoted N (noise). The figure below shows the probability distributions associated with the two intervals.

The relationship between SDT and the PF. Adapted from Kingdom and Prins (2016), Figure 4.7

Whether or not the sensory mechanism detects the stimulus on any trial is determined by the amount of sensory evidence accumulated by the sensory system. Let’s assume that the mean amount of accumulated evidence is a linear function of stimulus intensity \(x\):

\[\mu(x)=\pi + \rho x\]

According to high-threshold theory, the sensory mechanism will detect the stimulus when the amount of sensory evidence exceeds a fixed internal criterion or threshold. As its name implies, high-threshold theory assumes that the internal threshold is high. More specifically, the threshold is assumed to be high enough that the probability that it is exceeded when \(x=0\) (i.e., noise) is effectively zero. The decision is based on binary information only: either the sensory evidence exceeded the threshold or it did not. Given the assumptions we have made, the function \(F(x)\), which describes the probability that the threshold will be exceeded by a stimulus of intensity \(x\), will be the cumulative normal distribution (see the inset in the figure above).

By contrast, there is no such thing as a fixed internal threshold according to SDT. Instead, SDT assumes that sensory mechanisms generate a graded signal for all stimulus intensities, including noise. The decision process has access to the degree of sensory evidence accumulated for both Signal and Noise. We may think of any presentation of a stimulus as a sample from the probability density function associated with that stimulus. Even in the absence of a stimulus, differing degrees of sensory evidence result, and we may think of the presentation of the noise interval as a sample from the probability density function associated with the noise stimulus. Thus, the decision is based on the relative amplitude of two samples: Signal \(N(\pi + \rho x, \sigma^2)\), and Noise \(N(\pi, \sigma^2)\).

For 2IFC, one simple decision rule is to choose the interval that produced the larger sample. The response is then correct if the sample taken during the stimulus interval has a greater value than the sample taken during the noise interval. That is, if the difference between the sample value derived from the signal interval and the sample value derived from the noise interval exceeds zero, the response will be correct. The difference in sensory evidence will be distributed as \(N(\rho x, 2 \sigma^2)\).
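The decision rule can be checked with a small simulation. The sketch below uses illustrative parameter values (not taken from any dataset) and compares the simulated proportion correct with the probability that a \(N(\rho x, 2\sigma^2)\) variable exceeds zero, i.e. \(\Phi\big(\rho x/(\sqrt{2}\sigma)\big)\).

```r
# Simulate the 2IFC decision rule under SDT.
# Assumption: all parameter values below are arbitrary illustrative choices.
set.seed(1)
pi0   <- 1      # baseline evidence (pi in the text; renamed to avoid masking R's pi)
rho   <- 0.5    # gain of the evidence with stimulus intensity
sigma <- 1      # standard deviation of the sensory evidence
x     <- 2      # stimulus intensity
n     <- 100000 # number of simulated trials

signal <- rnorm(n, mean = pi0 + rho * x, sd = sigma)  # sample from the S interval
noise  <- rnorm(n, mean = pi0,           sd = sigma)  # sample from the N interval

pc_sim    <- mean(signal - noise > 0)                 # simulated proportion correct
pc_theory <- pnorm(rho * x / (sqrt(2) * sigma))       # P(N(rho*x, 2*sigma^2) > 0)
c(simulated = pc_sim, theoretical = pc_theory)
```

The two values should agree up to sampling error, which shrinks as the number of simulated trials grows.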

Calculation of d’ and bias for M-AFC

1. Yes/No 1AFC

The Yes/No paradigm, also known as 1AFC, is particularly prone to bias. Suppose two observers have the same internal sensitivity but use different response criteria. Their observed psychometric functions would then differ considerably. SDT can separate the response bias from the sensitivity with the following estimates, where Hit and FA denote the hit rate and the false-alarm rate, and \(z(\cdot)\) is the inverse of the standard normal cumulative distribution:

\[d' = z(Hit) - z(FA)\]

\[c =-[z(Hit)+z(FA)]/2\]
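A minimal sketch of these two estimates in R is given below. The function name sdt_yes_no and the two observers are hypothetical and only serve to illustrate that the same sensitivity can go along with different criteria.

```r
# d' and criterion c from hit and false-alarm rates (equal-variance SDT).
sdt_yes_no <- function(hit, fa) {
  z_hit <- qnorm(hit)   # z(Hit)
  z_fa  <- qnorm(fa)    # z(FA)
  list(d = z_hit - z_fa, c = -(z_hit + z_fa) / 2)
}

# Two hypothetical observers with the same sensitivity but different criteria
neutral      <- sdt_yes_no(hit = pnorm(0.75), fa = pnorm(-0.75))
conservative <- sdt_yes_no(hit = pnorm(0.25), fa = pnorm(-1.25))

rbind(neutral = unlist(neutral), conservative = unlist(conservative))
```

Both observers yield the same d', while the second observer has a positive c, reflecting fewer 'Yes' responses overall.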

2. Unbiased 2IFC

With the standard 2IFC procedure, the N and S+N stimuli are presented together in a trial as two alternatives. Remember that the decision rule is to choose the alternative in which the internal signal is biggest. If the observer adopts this rule, trials in which the difference between the S+N and N samples is positive will result in a correct decision. The variance of the difference is the sum of the variances of S+N and N, so in units of \(\sigma\) the difference has mean \(d'\) and standard deviation \(\sqrt{2}\). The proportion correct for 2AFC is thus given by the grey area in the lower panel to the right of zero. This is:

\[ P_c = \Phi(d'/\sqrt{2})\]

and

\[ d' = z(P_c) \sqrt2\]

knitr::include_graphics('img/2afc.jpg')
Graphical illustration of how d’ can be calculated for an unbiased 2AFC task. Adapted from Kingdom and Prins (2016)
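The two conversions for the unbiased case can be sketched in R as follows. The helper names pc_to_dprime and dprime_to_pc and the example proportions are illustrative and not part of any package.

```r
# Conversions between proportion correct and d' for unbiased 2AFC/2IFC.
pc_to_dprime <- function(pc) sqrt(2) * qnorm(pc)   # d' = z(Pc) * sqrt(2)
dprime_to_pc <- function(d)  pnorm(d / sqrt(2))    # Pc = Phi(d' / sqrt(2))

pc <- c(0.5, 0.76, 0.9)   # illustrative proportions correct
d  <- pc_to_dprime(pc)
data.frame(pc = pc, d_prime = d, pc_back = dprime_to_pc(d))
```

Chance performance (Pc = 0.5) maps to d' = 0, and converting back with dprime_to_pc() recovers the original proportions.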

For the biased 2IFC, the calculation is the same as shown in 1AFC.

Confusion Matrix and ROC curve

1AFC and 2AFC tasks are essentially binary classification. The probabilities of the outcomes for S and N define the confusion matrix.

| Response | Signal | Noise |
|----------|--------|-------|
| Yes | Hit | False alarm (FA) |
| No | Miss | Correct rejection (CR) |

The ROC curve plots the false-alarm rate (FA) on the horizontal axis against the hit rate (Hit) on the vertical axis. On the same ROC curve, Hit/FA pairs obtained with liberal criteria lie toward the upper-right corner, whereas Hit/FA pairs obtained with conservative criteria lie toward the lower-left corner. ROC curves with high d’ have a large area under the curve, so the area under the curve (AUC) is sometimes also used to measure sensitivity.

Note, sometimes we also use ROC curves and AUCs to select the best logistic regression models.
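To make the ROC construction concrete, the sketch below traces the curve of an equal-variance Gaussian observer by sweeping the criterion for a fixed d'. The function name roc_points, the value d' = 1.5, and the criterion grid are assumptions chosen for illustration. The area under the curve is approximated with the trapezoidal rule and compared with the closed form \(\Phi(d'/\sqrt{2})\), which holds for the equal-variance Gaussian model.

```r
# ROC curve for an equal-variance Gaussian observer with sensitivity d.
# The criterion k is measured from the mean of the noise distribution.
roc_points <- function(d, criteria = seq(-4, 6, by = 0.05)) {
  data.frame(criterion = criteria,
             fa  = 1 - pnorm(criteria),       # P(evidence > k | Noise)
             hit = 1 - pnorm(criteria - d))   # P(evidence > k | Signal)
}

d_prime <- 1.5                      # illustrative sensitivity
roc <- roc_points(d_prime)
roc <- roc[order(roc$fa), ]         # sort from conservative to liberal criteria

# Area under the curve: trapezoidal approximation vs. closed form
auc_trapz  <- sum(diff(roc$fa) * (head(roc$hit, -1) + tail(roc$hit, -1)) / 2)
auc_theory <- pnorm(d_prime / sqrt(2))
c(trapezoid = auc_trapz, closed_form = auc_theory)
```

Calling plot(roc$fa, roc$hit, type = 'l') draws the curve; larger values of d_prime bow it further toward the upper-left corner.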

Practice with the dataset of musical tempo bisection task

The dataset I provide here is from my student Phillipa’s master’s thesis, a study on the influence of music tempo on duration judgments. It is the same dataset we used in the previous tutorial session.

In the study, the same classic music pieces were manipulated and played at three different tempos: slow, medium, and fast. The length of each music piece was randomly selected from 2 to 8 seconds. Participants had to judge whether the music piece was a ‘short’ or a ‘long’ one, based on the past trials they had received. The research question is whether music tempo alters the time judgment.

A sample experimental data that contains three participants’ responses can be found here.

Please download the data file to a local folder first. Here I put it in a subfolder called ‘data’ on my local computer.

Please refer to the tutorial section (Week 7) for the detailed plotting of psychometric functions and the estimation of thresholds and sensitivities. Here we use the same data, but apply an SDT analysis.

Step 1. Load data

Let us first load the data:

library(tidyverse)  # provides %>%, dplyr, tidyr and ggplot2 used below

# --- please change the location of the data file
raw_bisection <- read.csv('data/music_bisection.csv', sep = ';')

# change response from 1,2 to 0,1 (short vs. long)
raw_bisection$resp <- raw_bisection$Decision -1
# convert numbers (tempo) to factor (slow, medium, fast)
raw_bisection$Tempo <- factor(raw_bisection$Tempo,labels = c('Slow','Medium','Fast'))

knitr::kable(head(raw_bisection))
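
| Participant | Duration | Tempo | Stimuli_Nu | Decision | Music_Duration | resp |
|-------------|----------|-------|------------|----------|----------------|------|
| Sub101 | 2 | Slow | 10 | 1 | 2.001604 | 0 |
| Sub101 | 6 | Fast | 10 | 2 | 6.000169 | 1 |
| Sub101 | 5 | Slow | 4 | 2 | 5.000097 | 1 |
| Sub101 | 7 | Slow | 8 | 2 | 7.000105 | 1 |
| Sub101 | 2 | Fast | 1 | 1 | 2.000080 | 0 |
| Sub101 | 7 | Slow | 6 | 2 | 7.000084 | 1 |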

Step 2. Visualization

Your task: please read through the following code and think about what you would do differently if you wanted to plot individual subject data.

raw_bisection %>% 
  group_by(Tempo, Duration) %>%  
  summarise(m = mean(resp), long = sum(resp), 
            n = n(), short = n-long) %>%
  ggplot(aes(x = Duration, y = m, color = Tempo)) + geom_point() + 
  geom_smooth(method=glm, method.args= list(family = binomial(logit)), 
              se = FALSE, aes(linetype=Tempo)) + 
  ylab('Prop. of Long responses')

Step 3. Data transformation

Note that the bisection task asks participants to bisect durations sampled from 2 to 8 seconds into two categories, Short and Long, so the midpoint is 5 seconds. We are interested in the durations of the short group (< 5 seconds) and the long group (> 5 seconds), and in how participants classify them into the Short and Long categories. We treat the distance to the midpoint (i.e., 5 seconds) as the stimulus intensity that participants use for the bisection task. In principle, an M-AFC analysis could also take the 5-second condition into account; here, for simplicity, we only analyze the two stimulus groups above.

So let’s calculate the percentage of correct responses for the two categories.

Your task: Think about how to categorize the durations and how to filter out the 5-second condition.

raw_bisection %>% mutate(SL = sign(Duration - 5), 
                         intensity = abs(Duration - 5),
                         correct = resp == (Duration>5)) %>% 
  filter(SL !=0) %>% # omit the equal
  group_by(Participant, Tempo, SL, intensity) %>%
  summarise(p = mean(correct)) %>% # percentage
  spread(SL, p) %>%
  rename(Short = `-1`, Long = `1` ) -> meanP

# show the results
knitr::kable(head(meanP))
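
| Participant | Tempo | intensity | Short | Long |
|-------------|-------|-----------|-------|------|
| Sub101 | Slow | 1 | 1 | 0.45 |
| Sub101 | Slow | 2 | 1 | 0.80 |
| Sub101 | Slow | 3 | 1 | 0.95 |
| Sub101 | Medium | 1 | 1 | 0.55 |
| Sub101 | Medium | 2 | 1 | 0.75 |
| Sub101 | Medium | 3 | 1 | 0.85 |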

Step 4. Calculation of the sensitivity d’ and the bias C

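Recall that for the two-alternative discrimination task, we have

\[d' = z(P_A) + z(P_B) \]

\[C = -[z(P_A) - z(P_B)]/2\]

where \(P_A\) and \(P_B\) are the proportions of correct responses for the two categories (here, Long and Short, respectively). These are the Yes/No formulas with ‘Long’ treated as the ‘Yes’ response: the hit rate is \(P_A\), the false-alarm rate is \(1-P_B\), and \(z(1-P_B) = -z(P_B)\). Now we can calculate individual \(d'\) and \(C\). Note that the z-score (qnorm in R) becomes infinite when the probability is 0 or 1, so the probabilities need to be adjusted slightly such that \(p\) falls into \([0.001,0.999]\). This can be done with the R functions pmax() and pmin().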

# keep proportions inside [0.001, 0.999] so that qnorm() stays finite
clamp <- function(p) pmax(0.001, pmin(p, 0.999))

meanP %>% mutate(d = qnorm(clamp(Short)) + qnorm(clamp(Long)), 
                 c = -(qnorm(clamp(Long)) - qnorm(clamp(Short)))/2) -> meanDC
knitr::kable(head(meanDC))
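
| Participant | Tempo | intensity | Short | Long | d | c |
|-------------|-------|-----------|-------|------|---|---|
| Sub101 | Slow | 1 | 1 | 0.45 | 2.964571 | 1.6079468 |
| Sub101 | Slow | 2 | 1 | 0.80 | 3.931853 | 1.1243055 |
| Sub101 | Slow | 3 | 1 | 0.95 | 4.735086 | 0.7226893 |
| Sub101 | Medium | 1 | 1 | 0.55 | 3.215894 | 1.4822855 |
| Sub101 | Medium | 2 | 1 | 0.75 | 3.764722 | 1.2078713 |
| Sub101 | Medium | 3 | 1 | 0.85 | 4.126666 | 1.0268995 |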
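As a sanity check, the first row of the table (Short = 1, Long = 0.45) can be reproduced by hand in base R. The helper clamp_p and the variable names are only illustrative.

```r
# Reproduce d' and C for the first row of the table (Short = 1, Long = 0.45).
clamp_p <- function(p, lo = 0.001, hi = 0.999) pmax(lo, pmin(p, hi))

is.infinite(qnorm(1))           # without clamping, z(1) is infinite

z_short <- qnorm(clamp_p(1))    # proportion 1 is replaced by 0.999
z_long  <- qnorm(clamp_p(0.45))

d_row1 <- z_short + z_long          # d' = z(P_Long) + z(P_Short)
c_row1 <- -(z_long - z_short) / 2   # C  = -[z(P_Long) - z(P_Short)] / 2
round(c(d = d_row1, c = c_row1), 6)
```

Note that whenever a proportion is exactly 0 or 1, the resulting d' and C depend on the chosen bounds (0.001 and 0.999), so such values are better read as limits set by the clamp than as precise estimates.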

Step 5. Visualization of d’ and C

Now we compare how the sensitivity and the criterion change across the different musical tempos.

Sensitivity d’

meanDC %>% group_by(Tempo, intensity) %>%
  summarise(md = mean(d), mc = mean(c)) -> mmDC
# compare sensitivity d'
ggplot(mmDC, aes(intensity, md, color = Tempo)) + 
  geom_point() + geom_line() +
  xlab('Duration Difference') +
  ylab("mean d'")
Sensitivity d' as a function of Music Tempo and Duration difference

By visual inspection, the sensitivity appears to be higher in the slow tempo condition. Further statistical tests are required to validate this conclusion.

Response Bias C

ggplot(mmDC, aes(intensity, mc, color = Tempo)) + 
  geom_point() + geom_line()+
  xlab('Duration Difference') +
  ylab("mean C")
Response bias C as a function of Music Tempo and Duration difference

Here again we see that the response biases are relatively stable for the slow and fast tempos, but a bit higher for the medium tempo. A higher C indicates a tendency toward conservative responses (here, a tendency to respond ‘Short’).