Preliminary Analysis

To begin I am working with the data from just one participant run through a pilot of the experiment.

#read in data1
results <- read.csv("DS_test_2016_Dec_11_1505.csv")

#remove extraneous rows
results <- results[-c(1:4), -c(1:23)]

#create non-multiplier data frame
resultsNM <- results[c(1:100), c(1:10) ]

#create multiplier data frame
resultsM <- results[c(102:401), c(1:15)]
resultsM <- resultsM[,-c(9:12)]

#renameSomeColumns
library(plyr)
resultsM <- rename(resultsM, c("numberEntry_4.keys"="acceptReject", "numberEntry_4.rt"="decisionTime"))

#clean up acceptReject column
index <- c("['f']", "['j']")
values <- c(1, 0) #where 1 is accept and 0 is reject
resultsM$acceptReject <- values[match(resultsM$acceptReject, index)]

#clean up Flip column
index <- c("True", 1, 2)
values <- c(1, 0, 0) #where 1 is accept and 0 is reject
resultsM$flip <- values[match(resultsM$flip, index)]

#make summed value a value to 2 decimal places
resultsM$summedVal <- format(round(resultsM$summedVal, 2), nsmall = 2)
#make numeric
resultsM$summedVal <- as.numeric(resultsM$summedVal)

#remove brackets on RT column
resultsM$decisionTime <- gsub("\\[|\\]", "", resultsM$decisionTime)
#make Numeric
resultsM$decisionTime <- as.numeric(resultsM$decisionTime)

#clean outliers due to distraction
resultsM<- resultsM[!(resultsM$decisionTime>12),]

head(resultsM)
##     Trial correct faceVal houseVal mult1House mult2Face summedVal earnings
## 106     1       0    0.90    -0.96          3         3     -0.18    33.98
## 107     2       1   -0.22     0.30          1         3     -0.36    33.98
## 108     3       1    0.88    -0.78          1         1      0.10    34.08
## 109     4       1   -0.68     0.90          1         1      0.22    34.30
## 110     5       1   -0.08     0.94          1         1      0.86    35.16
## 111     6       1    0.98     0.74          1         1      1.72    36.88
##     flip acceptReject decisionTime
## 106    0            1     3.851775
## 107    0            0     4.521477
## 108    0            1     3.785931
## 109    0            1     4.904163
## 110    0            1     2.404273
## 111    0            1     2.603064

Just for fun curious to see what the mean and SD are…

mean(resultsM$decisionTime)
## [1] 2.870859
sd(resultsM$decisionTime)
## [1] 1.660748

And how about a histogram of decision time - which shows a fairly strong right skew. Which makes sense for reaction time (need to learn more about this).

hist(resultsM$decisionTime, breaks = 50)
abline(v = mean(resultsM$decisionTime),
 col = "royalblue",
 lwd = 2)

#median line
abline(v = median(resultsM$decisionTime),
 col = "red",
 lwd = 2)
#mean line
legend(x = "topright",
 c(as.expression(bquote(Mean == .(mean(resultsM$decisionTime)))), as.expression(bquote(Median == .(median(resultsM$decisionTime))))),
 col = c("royalblue", "red"),
 lwd = c(2, 2, 2))

Plotting the summed value of the two images (face and house) against decision time shows that for this subject decisions took longer that were closer to a zero value (which makes sense), and, interestingly, the most difficult decisions seem to be those that were slightly negative.

library(ggplot2)

ggplot(resultsM, aes(x=summedVal, y=decisionTime)) +
  geom_point(shape=1) +    # Use hollow circles
  geom_smooth()            # Add a loess smoothed fit curve with confidence region


And here we plot summed value vs correct/incorrect decisions.

ggplot(resultsM, aes(x=summedVal, y=correct)) +
  geom_point(shape=1) +    # Use hollow circles
  geom_smooth()            # Add a loess smoothed fit curve with confidence region


Summed Value vs. Accept/Reject Decision

But this probably makes more sense as a logistic regression curve

plot(resultsM$summedVal, resultsM$acceptReject)

model <- glm(acceptReject ~ summedVal, data=resultsM, family=binomial(link = logit))
summary(model)
## 
## Call:
## glm(formula = acceptReject ~ summedVal, family = binomial(link = logit), 
##     data = resultsM)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.37747  -0.14826   0.00632   0.23223   2.62929  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.4072     0.2352   1.731   0.0834 .  
## summedVal     4.9124     0.6549   7.501 6.31e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 411.68  on 298  degrees of freedom
## Residual deviance: 119.20  on 297  degrees of freedom
## AIC: 123.2
## 
## Number of Fisher Scoring iterations: 7
xv <- seq(min(resultsM$summedVal), max(resultsM$summedVal), 0.01)
yv <- predict(model,list(summedVal=xv), type="response")

lines(xv,yv)

#Find the inflection point where there is a 50/50 probability of subject accepting.
p <- 0.5
x <- (log(p/(1-p)) - coef(model)[1]) / coef(model)[2]
x
## (Intercept) 
## -0.08288198


“Flip” trials were inserted in which the summed value with mulipliers on the values of one or both images, had the opposite sign (positive or negative) than it would have had with no multipliers. A box plot that shows flip vs. non flip trials.

#Box plot of flip trials vs non flip decision Time
p <- ggplot(resultsM, aes(factor(flip), decisionTime))
p + geom_boxplot()

A T-test to test whether the difference is significant:

#T-test compare "flips" to non flips
t.test((resultsM$decisionTime[resultsM$flip == 1]),(resultsM$decisionTime[resultsM$flip == 0]))
## 
##  Welch Two Sample t-test
## 
## data:  (resultsM$decisionTime[resultsM$flip == 1]) and (resultsM$decisionTime[resultsM$flip == 0])
## t = 1.0552, df = 14.671, p-value = 0.3084
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.4406094  1.3012628
## sample estimates:
## mean of x mean of y 
##  3.281037  2.850710

So, at 0.31, no, not significant. But also keep in mind there were only 12 flip trials, so with additional participants I am confident that this p value will drop - especially considering that there is a difference in means of almost 0.33 seconds.

For example, if we do a T-test comparing the means of decisions made with NO multipliers vs. decisions made when there was at least one multiplier value the difference is significant at p = 0.05.

#T-Test that checks if the means of decision time with NO multipliers and at least ONE multiplier varies
t.test((resultsM$decisionTime[resultsM$mult2Face == 1 & resultsM$mult1House ==1]),(resultsM$decisionTime[resultsM$mult2Face > 1 || resultsM$mult1House >1]))
## 
##  Welch Two Sample t-test
## 
## data:  (resultsM$decisionTime[resultsM$mult2Face == 1 & resultsM$mult1House ==  and (resultsM$decisionTime[resultsM$mult2Face > 1 || resultsM$mult1House >     1]) and     1])
## t = -2.494, df = 391.84, p-value = 0.01304
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.62035588 -0.07343661
## sample estimates:
## mean of x mean of y 
##  2.523963  2.870859

There is a time difference of about 0.35 seconds here too, similar to the flip trial difference, but it is NOW significant at p = 0.01 (due to more trials).

Linear model with weights and intercepts

fit <- lm(acceptReject ~ faceVal + houseVal, data=resultsM)
summary(fit) 
## 
## Call:
## lm(formula = acceptReject ~ faceVal + houseVal, data = resultsM)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.92123 -0.23056  0.00478  0.24601  0.86137 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.55352    0.01940  28.536   <2e-16 ***
## faceVal      0.55287    0.03383  16.341   <2e-16 ***
## houseVal     0.29114    0.03212   9.064   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3349 on 296 degrees of freedom
## Multiple R-squared:  0.5517, Adjusted R-squared:  0.5486 
## F-statistic: 182.1 on 2 and 296 DF,  p-value: < 2.2e-16

This seems to show that there is a stronger weighting toward faces than houses in terms of making decisions based on the beta weights of 0.55 vs 0.29.

Linear model if there is a multiplier on FACES

fit <- lm(acceptReject ~ faceVal + houseVal, data=subset(resultsM,mult2Face>1 & mult1House==1))
summary(fit) 
## 
## Call:
## lm(formula = acceptReject ~ faceVal + houseVal, data = subset(resultsM, 
##     mult2Face > 1 & mult1House == 1))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.4973 -0.2159 -0.0209  0.2293  0.6536 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.49580    0.03386  14.643   <2e-16 ***
## faceVal      0.81190    0.06626  12.253   <2e-16 ***
## houseVal     0.01479    0.05331   0.277    0.782    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2786 on 67 degrees of freedom
## Multiple R-squared:  0.6989, Adjusted R-squared:   0.69 
## F-statistic: 77.77 on 2 and 67 DF,  p-value: < 2.2e-16


Having a multiplier on only faces increases weighting on faces to 0.81.

Linear model if there is a multiplier on HOUSES

fit <- lm(acceptReject ~ faceVal + houseVal, data=subset(resultsM,mult1House>1 & mult2Face ==1))
summary(fit)
## 
## Call:
## lm(formula = acceptReject ~ faceVal + houseVal, data = subset(resultsM, 
##     mult1House > 1 & mult2Face == 1))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.69996 -0.19948  0.00803  0.19463  0.63983 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.55514    0.04229  13.126  < 2e-16 ***
## faceVal      0.13785    0.07193   1.917   0.0603 .  
## houseVal     0.69447    0.07612   9.123 9.72e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3181 on 57 degrees of freedom
## Multiple R-squared:  0.6155, Adjusted R-squared:  0.602 
## F-statistic: 45.62 on 2 and 57 DF,  p-value: 1.48e-12


Having a multiplier on only houses increases weighting of houses to 0.58372

The standard package of plots

fit <- lm(acceptReject ~ faceVal + houseVal, data=resultsM)
summary(fit) 
## 
## Call:
## lm(formula = acceptReject ~ faceVal + houseVal, data = resultsM)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.92123 -0.23056  0.00478  0.24601  0.86137 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.55352    0.01940  28.536   <2e-16 ***
## faceVal      0.55287    0.03383  16.341   <2e-16 ***
## houseVal     0.29114    0.03212   9.064   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3349 on 296 degrees of freedom
## Multiple R-squared:  0.5517, Adjusted R-squared:  0.5486 
## F-statistic: 182.1 on 2 and 296 DF,  p-value: < 2.2e-16
plot(fit)




Okay, that’s all for now, but interested to hear any ideas you may have going forward as the real experimental phase (behavioral followed by fMRI) will begin in late January.