This is an R Markdown document for basic data analysis and visualisation. At the moment, it only includes analysis and visualisation of our simulated data set.
In previous RMDs, we loaded the raw data and sanitized it, adding some coding columns.
Here we will further collapse that data using melt() from the reshape2 package.
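# plyr is loaded before tidyverse so that it does not mask dplyr functions
library(plyr)
library(tidyverse)
library(reshape2)
library(doBy)
library(scales)
# lmerTest also attaches lme4, which provides glmer()
library(lmerTest)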
CleanData <- read.csv("F:/Google Drive/GitHub Repos/Crossmodality-Toolkit/data/CleanData.csv")
simdata <- subset(CleanData, DataSet == "Simulated")
pilotdata <- subset(CleanData, DataSet == "Pilot")
Now we can start looking at our “Correctness” data. This involves manipulating our data frame differently than before.
Previously, our DV (Response) was in a single column (long, or “skinny”, format), so we just had to aggregate that column.
For Correctness, our data are spread across three columns (wide format), so we first need to melt the data frame into long format.
# Getting rid of numerical response column that we aren't using
CorrData <- subset(pilotdata, select = -c(Response))
# Melting data
CorrData <- melt(CorrData,
                 variable.name = "Prediction",
                 id.vars = c("DataSet", "Subject", "Condition", "TrialNum", "Inducer",
                             "Concurrent", "Comparison"))
# Aggregating Data
CorrDataAgg <- aggregate(value ~ Prediction + Subject + Condition + Inducer + Concurrent + Comparison,
                         CorrData, mean)
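To make the wide-to-long step concrete, here is a minimal sketch on a toy data frame. The column names PredA, PredB and PredC and all values are hypothetical stand-ins for the three correctness columns, not the real data.

```r
library(reshape2)

# Hypothetical wide-format toy data: one 0/1 correctness column per prediction set
wide <- data.frame(
  Subject  = c(1, 1, 2, 2),
  TrialNum = c(1, 2, 1, 2),
  PredA    = c(1, 0, 1, 1),
  PredB    = c(0, 0, 1, 0),
  PredC    = c(1, 1, 0, 1)
)

# Melt: every non-id column becomes a level of "Prediction",
# and its contents move into a single "value" column
long <- melt(wide,
             id.vars = c("Subject", "TrialNum"),
             variable.name = "Prediction")

# Aggregate: mean correctness per prediction set and subject
agg <- aggregate(value ~ Prediction + Subject, data = long, FUN = mean)
print(agg)
```

Each wide row contributes one long row per prediction column, so 4 rows become 12. Aggregating then leaves one mean per combination of Prediction and Subject.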
That gives us correctness data for our three sets of predictions, aggregated into a form suitable for fitting a generalized linear mixed model with glmer().
FullModel <- glmer(value ~ Prediction + (1 | Subject), data = CorrDataAgg, family = binomial)
## Warning in eval(family$initialize, rho): non-integer #successes in a
## binomial glm!
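The warning above appears because the aggregated response is a proportion rather than a count of successes, and no number of trials was supplied. One common way to avoid it is either to model the trial-level 0/1 responses directly or to pass the number of trials per cell as weights. The sketch below illustrates this with base-R glm() on simulated data; the data frame, its column names and the simulated probabilities are all hypothetical. glmer() also accepts a weights argument, but whether this approach suits the present analysis is left as an open question.

```r
set.seed(42)

# Hypothetical toy data: 10 subjects x 3 prediction sets x 20 binary trials
toy <- expand.grid(Subject = factor(1:10),
                   Prediction = factor(c("P1", "P2", "P3")),
                   Trial = 1:20)
toy$value <- rbinom(nrow(toy), size = 1, prob = 0.55)

# One row per Subject x Prediction: proportion correct plus number of trials
agg <- aggregate(value ~ Subject + Prediction, data = toy, FUN = mean)
agg$n <- aggregate(value ~ Subject + Prediction, data = toy, FUN = length)$value

# (a) Trial-level 0/1 responses: no warning
fit_trials <- glm(value ~ Prediction, data = toy, family = binomial)

# (b) Proportions weighted by the number of trials: no warning,
#     and the same coefficient estimates as (a)
fit_weighted <- glm(value ~ Prediction, data = agg, family = binomial,
                    weights = n)

print(cbind(trials = coef(fit_trials), weighted = coef(fit_weighted)))
```

Fitting the unweighted proportions instead, as in `glm(value ~ Prediction, data = agg, family = binomial)`, reproduces the "non-integer #successes" warning.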
summary(FullModel)
## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: value ~ Prediction + (1 | Subject)
##    Data: CorrDataAgg
##
##      AIC      BIC   logLik deviance df.resid
##  13408.4  13437.2  -6700.2  13400.4     9896
##
## Scaled residuals:
##     Min      1Q  Median      3Q     Max
## -2.2210 -0.9921  0.5960  0.7622  1.1299
##
## Random effects:
##  Groups  Name        Variance Std.Dev.
##  Subject (Intercept) 0.129    0.3591
## Number of obs: 9900, groups:  Subject, 61
##
## Fixed effects:
##                     Estimate Std. Error z value Pr(>|z|)
## (Intercept)          0.22552    0.05821   3.874 0.000107 ***
## PredictionLitReview  0.05824    0.05032   1.157 0.247146
## PredictionAffect     0.01895    0.05026   0.377 0.706205
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
##             (Intr) PrdcLR
## PrdctnLtRvw -0.431
## PrdctnAffct -0.431  0.499
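The fixed-effect estimates above are on the log-odds (logit) scale. As a rough reading aid, the sketch below converts them to probabilities with the inverse logit, plogis(). The variable names are illustrative, and the resulting values apply to a subject whose random intercept is zero, so they are not population-averaged proportions.

```r
# Fixed-effect estimates copied from the summary output (log-odds scale)
b_intercept <- 0.22552  # reference level of Prediction
b_litreview <- 0.05824  # difference: LitReview vs reference
b_affect    <- 0.01895  # difference: Affect vs reference

# The inverse logit maps log-odds to probabilities
p_reference <- plogis(b_intercept)
p_litreview <- plogis(b_intercept + b_litreview)
p_affect    <- plogis(b_intercept + b_affect)

print(round(c(reference = p_reference,
              LitReview = p_litreview,
              Affect    = p_affect), 3))
```

All three predicted probabilities land within about two percentage points of one another, which agrees with the non-significant Prediction contrasts in the table.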
anova(FullModel)
## Analysis of Variance Table
##            Df Sum Sq Mean Sq F value
## Prediction  2 1.3317 0.66585  0.6658
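The anova() table above reports an F value but no p-value. A commonly used alternative for testing a fixed effect in a GLMM is a likelihood-ratio test between nested models. The sketch below shows the idea on simulated trial-level data, assuming lme4 is installed; the data frame, variable names and effect sizes are hypothetical and no Prediction effect is built into the simulation.

```r
library(lme4)

set.seed(1)

# Hypothetical toy data: 20 subjects x 3 prediction sets x 30 binary trials
toy <- expand.grid(Subject = factor(1:20),
                   Prediction = c("P1", "P2", "P3"),
                   Trial = 1:30)

# Subject-specific random intercepts on the logit scale
subj_eff <- rnorm(20, mean = 0, sd = 0.5)
toy$value <- rbinom(nrow(toy), size = 1,
                    prob = plogis(0.2 + subj_eff[as.integer(toy$Subject)]))

# Full model with the fixed effect of interest, and a null model without it
full <- glmer(value ~ Prediction + (1 | Subject), data = toy, family = binomial)
null <- glmer(value ~ 1 + (1 | Subject), data = toy, family = binomial)

# Likelihood-ratio test: compares the two fits with a chi-square statistic
lrt <- anova(null, full)
print(lrt)
```

The resulting table includes a Chisq statistic and a `Pr(>Chisq)` column, which gives a direct test of whether adding Prediction improves the fit.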