First let me outline the location of all files involved in the analysis.
I won’t be analyzing the demographic data, I’ve included just a few discriptive statistics. Additoinally, smoking status will be used to subset analyses later on.
library(Hmisc)
demographicsAll = read.csv("C:/Users/londeree.4/Dropbox/Lab-Londeree/osu-certs-pilot/experiment-surveys/Demographics/demographics.csv", stringsAsFactors = FALSE)
demographics = demographicsAll[c(-4,-7,-13,-21,-23,-26,-31,-33,-34,-38,-42,-45),]
describe(demographics)
## demographics
##
## 13 Variables 34 Observations
## ---------------------------------------------------------------------------
## StudyID
## n missing distinct
## 34 0 34
##
## lowest : CERT001 CERT002 CERT003 CERT005 CERT006
## highest: CERT040 CERT041 CERT043 CERT044 CERT046
## ---------------------------------------------------------------------------
## Age
## n missing distinct Info Mean Gmd
## 34 0 3 0.866 17.21 0.836
##
## Value 16 17 18
## Frequency 7 13 14
## Proportion 0.206 0.382 0.412
## ---------------------------------------------------------------------------
## Race
## n missing distinct
## 34 0 7
##
## African American (5, 0.147), Asian (2, 0.059), Caucasian (23, 0.676),
## Don't Know (1, 0.029), Half black, half Asian (1, 0.029), Middle eastern,
## Arab (1, 0.029), Pacific Islander (1, 0.029)
## ---------------------------------------------------------------------------
## Ethnicity
## n missing distinct
## 34 0 2
##
## Value Non-hispanic Other
## Frequency 33 1
## Proportion 0.971 0.029
## ---------------------------------------------------------------------------
## Sex
## n missing distinct
## 34 0 2
##
## Value F M
## Frequency 18 16
## Proportion 0.529 0.471
## ---------------------------------------------------------------------------
## Height
## n missing distinct Info Mean Gmd .05 .10
## 34 0 15 0.989 66.28 4.279 61.30 63.00
## .25 .50 .75 .90 .95
## 63.25 65.50 69.00 71.70 73.00
##
## Value 59.5 60.0 62.0 63.0 64.0 65.0 66.0 67.0 68.0 69.0
## Frequency 1 1 1 6 4 4 3 3 1 3
## Proportion 0.029 0.029 0.029 0.176 0.118 0.118 0.088 0.088 0.029 0.088
##
## Value 70.0 71.0 72.0 73.0 74.0
## Frequency 1 2 1 2 1
## Proportion 0.029 0.059 0.029 0.059 0.029
## ---------------------------------------------------------------------------
## Weight
## n missing distinct Info Mean Gmd .05 .10
## 34 0 21 0.997 143.9 28.42 109.0 115.0
## .25 .50 .75 .90 .95
## 130.0 140.0 158.5 170.0 180.0
##
## lowest : 105.0 111.2 115.0 116.0 125.0, highest: 160.0 165.0 170.0 180.0 250.0
## ---------------------------------------------------------------------------
## BMI
## n missing distinct Info Mean Gmd .05 .10
## 34 0 31 1 23.01 4.152 18.75 18.98
## .25 .50 .75 .90 .95
## 20.36 21.95 24.75 27.45 30.89
##
## lowest : 18.59788 18.82662 18.96608 19.00411 19.19555
## highest: 26.36250 27.43572 27.45444 30.89355 32.97992
## ---------------------------------------------------------------------------
## Edu
## n missing distinct
## 34 0 5
##
## Value 10th Grade 11th Grade 12th Grade
## Frequency 2 7 9
## Proportion 0.059 0.206 0.265
##
## Value Currently in College Graduated HS
## Frequency 9 7
## Proportion 0.265 0.206
## ---------------------------------------------------------------------------
## EduNum
## n missing distinct Info Mean Gmd
## 34 0 5 0.946 2.412 1.43
##
## Value 0 1 2 3 4
## Frequency 2 7 9 7 9
## Proportion 0.059 0.206 0.265 0.206 0.265
## ---------------------------------------------------------------------------
## EcigUse
## n missing distinct value
## 34 0 1 No
##
## Value No
## Frequency 34
## Proportion 1
## ---------------------------------------------------------------------------
## CigUse
## n missing distinct value
## 34 0 1 No
##
## Value No
## Frequency 34
## Proportion 1
## ---------------------------------------------------------------------------
## AnyUse
## n missing distinct value
## 34 0 1 No
##
## Value No
## Frequency 34
## Proportion 1
## ---------------------------------------------------------------------------
First Let’s load in the data we need. The following datasets are raw from the eyelink software. This data is in “long” format- each row is an individual trial from a subject, each column is a varible of interest. This leads to a total of 1288 rows with 125 variables.
rawDataAD is a set of data on the ads of interest for each subject in this dataset across both versions of the task. rawDataIMAGE is a set of data on the entire scene for each subject. We won’t need to use this much - it will just be to correct a few measures.
#load specific CSV file of raw data
rawDataAD = read.csv("C:/Users/londeree.4/Dropbox/Lab-Londeree/osu-certs-pilot/experiment-eyetracking/mlm/AOI Reports/ViewingPeriodAOIReportAD.csv", stringsAsFactors = FALSE)
rawDataIMAGE = read.csv("C:/Users/londeree.4/Dropbox/Lab-Londeree/osu-certs-pilot/experiment-eyetracking/mlm/AOI Reports/ViewingPeriodAOIReportIMAGE.csv")
Now, let’s clean this up. We don’t need all 125 variables for this analysis. While we are at it, let’s also correct for individuals who did not follow the task instructions. Anything that we would like to put into our model that is a factor, we should also specify
#create a corrected dwell time percent variable.
rawDataAD$IA_DWELL_TIME_PERCORRECTED <- (rawDataAD$IA_DWELL_TIME / rawDataIMAGE$IA_DWELL_TIME)
#load in some packages to tidy the data
require(dplyr)
require(tidyr)
#Select columns of interest and save to a new data frame
DataAD <- data.frame(rawDataAD[c('SUBID', 'counterbalanceversion', 'imagetype', 'scene', 'ad', 'IA_AREA', 'IA_DWELL_TIME_PERCORRECTED', 'IA_FIRST_FIXATION_DURATION', 'IA_FIRST_FIXATION_INDEX', 'IA_FIXATION_COUNT', 'IA_FIRST_FIXATION_TIME', 'IA_FIRST_RUN_DWELL_TIME')])
#Specify variables that should be factors
DataAD$facimagetype <- as.factor(DataAD$imagetype)
DataAD$ad<-as.factor(DataAD$ad)
DataAD$scene<-as.factor(DataAD$scene)
DataAD$counterbalanceversion<-as.factor(DataAD$counterbalanceversion)
DataAD$numimagetype <- as.numeric(factor(DataAD$imagetype))
#Remove individuals that smoke
demographics = demographicsAll[c(-4,-7,-13,-21,-23,-26,-31,-33,-34,-38,-42,-45),]
DataAD <- DataAD[-which(DataAD$SUBID=='4' | DataAD$SUBID=='7'| DataAD$SUBID=='13'| DataAD$SUBID=='21'| DataAD$SUBID=='23'| DataAD$SUBID=='26'| DataAD$SUBID=='31'| DataAD$SUBID=='33'| DataAD$SUBID=='34'| DataAD$SUBID=='38'| DataAD$SUBID=='42'| DataAD$SUBID=='45'), ]
#Examine the new dataset
str(DataAD)
## 'data.frame': 952 obs. of 14 variables:
## $ SUBID : int 1 1 1 1 1 1 1 1 1 1 ...
## $ counterbalanceversion : Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ...
## $ imagetype : chr "TAT" "FAT" "TAT" "TAT" ...
## $ scene : Factor w/ 28 levels "1","2","3","4",..: 10 11 12 13 14 15 16 17 18 19 ...
## $ ad : Factor w/ 28 levels "1","2","3","4",..: 10 11 12 13 14 15 16 17 18 19 ...
## $ IA_AREA : num 10450 26702 25564 4158 4020 ...
## $ IA_DWELL_TIME_PERCORRECTED: num 0.0439 0.0973 0.1003 0 0 ...
## $ IA_FIRST_FIXATION_DURATION: int 237 158 175 NA NA 162 NA 57 236 213 ...
## $ IA_FIRST_FIXATION_INDEX : int 25 6 2 NA NA 2 NA 21 1 3 ...
## $ IA_FIXATION_COUNT : int 1 3 3 0 0 4 0 3 2 7 ...
## $ IA_FIRST_FIXATION_TIME : int 7526 3108 2265 NA NA 2279 NA 7363 2036 2467 ...
## $ IA_FIRST_RUN_DWELL_TIME : int 237 158 546 NA NA 162 NA 610 236 1444 ...
## $ facimagetype : Factor w/ 2 levels "FAT","TAT": 2 1 2 2 1 1 2 1 2 1 ...
## $ numimagetype : num 2 1 2 2 1 1 2 1 2 1 ...
summary(DataAD)
## SUBID counterbalanceversion imagetype scene
## Min. : 1.00 1:476 Length:952 1 : 34
## 1st Qu.:11.00 2:476 Class :character 2 : 34
## Median :21.00 Mode :character 3 : 34
## Mean :22.47 4 : 34
## 3rd Qu.:35.00 5 : 34
## Max. :46.00 6 : 34
## (Other):748
## ad IA_AREA IA_DWELL_TIME_PERCORRECTED
## 1 : 34 Min. : 4020 Min. :0.00000
## 2 : 34 1st Qu.: 8967 1st Qu.:0.03364
## 3 : 34 Median :11519 Median :0.10904
## 4 : 34 Mean :17542 Mean :0.14576
## 5 : 34 3rd Qu.:22314 3rd Qu.:0.20700
## 6 : 34 Max. :78542 Max. :1.00000
## (Other):748
## IA_FIRST_FIXATION_DURATION IA_FIRST_FIXATION_INDEX IA_FIXATION_COUNT
## Min. : 20.0 Min. : 1.000 Min. : 0.000
## 1st Qu.: 166.8 1st Qu.: 3.000 1st Qu.: 1.000
## Median : 210.0 Median : 7.000 Median : 2.000
## Mean : 237.2 Mean : 8.698 Mean : 3.063
## 3rd Qu.: 260.2 3rd Qu.:13.250 3rd Qu.: 4.000
## Max. :2700.0 Max. :27.000 Max. :22.000
## NA's :204 NA's :204
## IA_FIRST_FIXATION_TIME IA_FIRST_RUN_DWELL_TIME facimagetype numimagetype
## Min. :2031 Min. : 33.0 FAT:476 Min. :1.0
## 1st Qu.:2480 1st Qu.: 237.0 TAT:476 1st Qu.:1.0
## Median :3643 Median : 440.0 Median :1.5
## Mean :3999 Mean : 635.9 Mean :1.5
## 3rd Qu.:5244 3rd Qu.: 818.2 3rd Qu.:2.0
## Max. :7999 Max. :5948.0 Max. :2.0
## NA's :204 NA's :204
Let’s look at a TTest analysis of these differences
library(ggplot2)
t.test(DataAD$IA_DWELL_TIME_PERCORRECTED~DataAD$numimagetype)
##
## Welch Two Sample t-test
##
## data: DataAD$IA_DWELL_TIME_PERCORRECTED by DataAD$numimagetype
## t = 5.3604, df = 892.65, p-value = 1.058e-07
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.03260812 0.07027830
## sample estimates:
## mean in group 1 mean in group 2
## 0.1714851 0.1200419
qplot(factor(numimagetype),IA_DWELL_TIME_PERCORRECTED, data= DataAD, geom = c("boxplot"))
t.test(DataAD$IA_FIXATION_COUNT~DataAD$numimagetype)
##
## Welch Two Sample t-test
##
## data: DataAD$IA_FIXATION_COUNT by DataAD$numimagetype
## t = 4.676, df = 884.72, p-value = 3.381e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.5412615 1.3242847
## sample estimates:
## mean in group 1 mean in group 2
## 3.529412 2.596639
qplot(factor(numimagetype),IA_FIXATION_COUNT, data= DataAD, geom = c("boxplot"))
t.test(DataAD$IA_FIRST_FIXATION_TIME~DataAD$numimagetype)
##
## Welch Two Sample t-test
##
## data: DataAD$IA_FIRST_FIXATION_TIME by DataAD$numimagetype
## t = 0.98326, df = 744.61, p-value = 0.3258
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -117.0966 352.0920
## sample estimates:
## mean in group 1 mean in group 2
## 4055.702 3938.204
qplot(factor(numimagetype),IA_FIRST_FIXATION_TIME, data= DataAD, geom = c("boxplot"))
## Warning: Removed 204 rows containing non-finite values (stat_boxplot).
OK, so it looks like we have a significant effect of both % dwell time and fixation count. First fixation time is non-significant but it looks like it’s in the right direction.
Note that these the first fixation time has 284 NA values for trials that were never fixated on. Consider adding this into the model?
Let’s notice the relationship between fixation count and dwell time
cor.test(DataAD$IA_DWELL_TIME_PERCORRECTED,DataAD$IA_FIXATION_COUNT)
##
## Pearson's product-moment correlation
##
## data: DataAD$IA_DWELL_TIME_PERCORRECTED and DataAD$IA_FIXATION_COUNT
## t = 63.302, df = 950, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8861721 0.9106049
## sample estimates:
## cor
## 0.8990864
r <- round(cor(DataAD$IA_DWELL_TIME_PERCORRECTED,DataAD$IA_FIXATION_COUNT), 4)
r<- paste('r =',r)
p <- round(cor.test(DataAD$IA_DWELL_TIME_PERCORRECTED,DataAD$IA_FIXATION_COUNT)$p.value, 4)
p <- paste('p =',p)
qplot(IA_DWELL_TIME_PERCORRECTED, IA_FIXATION_COUNT, data = DataAD, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .5,16,label= r)+annotate('text', .5,18,label= p)
They are very correlated! This is unsurprising but may be relevant to inlcuding one over the other.
First, let’s quickly load in the lme4 package and a few others.
pbkrtest is Parametric Bootstrap and Kenward Roger Based Methods for Mixed Model Comparison.
lmerTest performs different types of tests on lmer objects. Here I use it to display the P values for the tests we use.
arm provides functions to accompany A. Gelman and J. Hill, Data Analysis Using Regression and Multilevel/Hierarchical Models, Cambridge University Press, 2007
AICcmodavg gives more functions to implement model selection and multimodel inference based on Akaike’s information criterion.
lattice and ggplot2 are graphing functions.
library(lme4)
library(multilevel)
library(pbkrtest)
library(lmerTest)
library(ggplot2)
library(arm)
library(AICcmodavg)
library(lattice)
Why should we even use MLM for this data set?
# Used to strip down figures to make them simpler
fig <- theme_bw() + theme(panel.grid.minor=element_blank(), panel.grid.major=element_blank(), panel.background=element_blank()) +
theme(strip.background=element_blank(), strip.text.y = element_text()) + theme(legend.background=element_blank()) +
theme(legend.key=element_blank()) + theme(panel.border = element_rect(colour="black", fill=NA))
####Dwell Time Percent######
# Make the followoing three plots to explore the data
DTplot <- ggplot(aes(IA_DWELL_TIME_PERCORRECTED,imagetype),data=DataAD) + stat_sum(aes(size = ..n.., group = 1)) +
scale_size_area(max_size=5)
# Plot 1 - All Data
DTplot + geom_point() + xlab("Dwell Time %") + ylab("Image Type") + labs(title="All Data") + fig
# Plot 1 - All Data
xyplot(IA_DWELL_TIME_PERCORRECTED~facimagetype,data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Dwell Time")
# Plot 2 - By Counterbalance Version
xyplot(IA_DWELL_TIME_PERCORRECTED~facimagetype|as.factor(counterbalanceversion),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Dwell Time")
# Plot 3 - By Subject
xyplot(IA_DWELL_TIME_PERCORRECTED~facimagetype|as.factor(SUBID),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Dwell Time")
# Plot 4 - By Scene
xyplot(IA_DWELL_TIME_PERCORRECTED~facimagetype|as.factor(scene),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Dwell Time")
# Plot 5 - By Ad - UHOH!
xyplot(IA_DWELL_TIME_PERCORRECTED~facimagetype|as.factor(ad),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Dwell Time")
####Fixation Count#####
# Make the followoing three plots to explore the data
FCplot <- ggplot(aes(IA_FIXATION_COUNT,imagetype),data=DataAD) + stat_sum(aes(size = ..n.., group = 1)) +
scale_size_area(max_size=5)
# Plot 1 - All Data
FCplot + geom_point() + xlab("Fixation Count") + ylab("Image Type") + labs(title="All Data") + fig
# Plot 1 - All Data
xyplot(IA_FIXATION_COUNT~facimagetype,data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Fixation Count")
# Plot 2 - By Counterbalance Version
xyplot(IA_FIXATION_COUNT~facimagetype|as.factor(counterbalanceversion),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Fixation Count")
# Plot 3 - By Subject
xyplot(IA_FIXATION_COUNT~facimagetype|as.factor(SUBID),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Fixation Count")
# Plot 4 - By Scene
xyplot(IA_FIXATION_COUNT~facimagetype|as.factor(scene),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Fixation Count")
####Time to First Fixation#####
# Make the followoing three plots to explore the data
FTplot <- ggplot(aes(IA_FIRST_FIXATION_TIME,imagetype),data=DataAD) + stat_sum(aes(size = ..n.., group = 1)) +
scale_size_area(max_size=5)
# Plot 1 - All Data
FTplot + geom_point() + xlab("Time to First Fixation") + ylab("Image Type") + labs(title="All Data") + fig
## Warning: Removed 204 rows containing non-finite values (stat_sum).
## Warning: Removed 204 rows containing missing values (geom_point).
# Plot 1 - All Data
xyplot(IA_FIRST_FIXATION_TIME~facimagetype,data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Time to First Fixation")
# Plot 2 - By Counterbalance Version
xyplot(IA_FIRST_FIXATION_TIME~facimagetype|as.factor(counterbalanceversion),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Time to First Fixation")
# Plot 3 - By Subject
xyplot(IA_FIRST_FIXATION_TIME~facimagetype|as.factor(SUBID),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Time to First Fixation")
# Plot 4 - By Scene
xyplot(IA_FIRST_FIXATION_TIME~facimagetype|as.factor(scene),data=DataAD[1:1582,], type=c("p","g","r"),col="dark blue",col.line="black", xlab="Image Type", ylab="Time to First Fixation")
Because multilevel modeling involves predicting variance at different levels, it is important to begin analyses by determining the levels at which significant variation exists.
We first examine the group-level properties of the outcome variable to estimate the ICC(1). Second, we determine whether the variance of the intercept (t00) is significantly larger than zero. ###Calculating ICC(1)
Null.Model1<-lme(IA_DWELL_TIME_PERCORRECTED~1,random=~1|scene,data=DataAD,
control=list(opt="optim"))
VarCorr(Null.Model1)
## scene = pdLogChol(1)
## Variance StdDev
## (Intercept) 0.006387105 0.07991937
## Residual 0.016393633 0.12803762
Null.Model2<-lme(IA_FIXATION_COUNT~1,random=~1|scene,data=DataAD,
control=list(opt="optim"))
VarCorr(Null.Model2)
## scene = pdLogChol(1)
## Variance StdDev
## (Intercept) 3.458059 1.859586
## Residual 6.340400 2.518015
#Null.Model3<-lme(IA_FIRST_FIXATION_TIME~1,random=~1|scene,data=DataAD,control=list(opt="optim"), na.omit)
#VarCorr(Null.Model3)
The estimate of t00 (between-group or Intercept variance) is 0.006, and the estimate of sigma^2 (within-group or residual variancel) is 0.0164.
Now from this information we can calculate ICC by hand
DTICC <- 0.006544322/(0.006544322+0.016431308)
FCICC <- 3.424397/ (3.424397+6.247567)
The ICC estimate (t00/(t00+sigma^2)) is .2848
To verify that the ICC results from the mixed-effects models are similar to those from an ANOVA model and the ICC1 function we can perform an ANOVA analysis on the same data.
tmod1<-aov(IA_DWELL_TIME_PERCORRECTED~as.factor(scene),data=DataAD)
ICC1(tmod1)
## [1] 0.2803731
tmod2<-aov(IA_FIXATION_COUNT~as.factor(scene),data=DataAD)
ICC1(tmod2)
## [1] 0.3529187
These values are indeed close to the one in our model.
We would likely be interested in knowing whether the intercept variance (i.e.,t00) estimate of .0065 is significantly different from zero. In mixed-effects models, we perform this test by comparing -2 log likelihood values between (1) a model with a random intercept, and (2) a model without a random intercept.
Null.gls1<-gls(IA_DWELL_TIME_PERCORRECTED~1,data=DataAD,
control=list(opt="optim"))
logLik(Null.gls1)*-2
## 'log Lik.' -900.1477 (df=2)
logLik(Null.Model1)*-2
## 'log Lik.' -1132.024 (df=3)
-1209.638--1548.265
## [1] 338.627
anova(Null.gls1, Null.Model1)
## Model df AIC BIC logLik Test L.Ratio
## Null.gls1 1 2 -896.1477 -886.4327 450.0739
## Null.Model1 2 3 -1126.0242 -1111.4516 566.0121 1 vs 2 231.8765
## p-value
## Null.gls1
## Null.Model1 <.0001
Null.gls2<-gls(IA_FIXATION_COUNT~1,data=DataAD,
control=list(opt="optim"))
logLik(Null.gls2)*-2
## 'log Lik.' 4864.358 (df=2)
logLik(Null.Model2)*-2
## 'log Lik.' 4542.383 (df=3)
-1209.638--1548.265
## [1] 338.627
anova(Null.gls2, Null.Model2)
## Model df AIC BIC logLik Test L.Ratio p-value
## Null.gls2 1 2 4868.358 4878.073 -2432.179
## Null.Model2 2 3 4548.383 4562.955 -2271.191 1 vs 2 321.9747 <.0001
The -2 log likelihood value for the gls model without the random intercept is -1209.638. The -2 log likelihood value for the model with the random intercept is -1548.265 The difference of 338.627 is significant on a Chi-Squared distribution with one degree of freedom (one model estimated a variance term associated with a random intercept, the other did not, and this results in the one df difference). These results indicate significant intercept variation.
Now that we have examined within group variance, let’s try and model various intercept and slope models to capture between-group variance
# List of all Potential models-->
# Note: you can chose to not code ones that do not make sense.
# Linear model with no random effects
DTM0<-lm(IA_DWELL_TIME_PERCORRECTED~imagetype,data=DataAD)
#What about just using counterbalance to combine across ad and scene?
DTM1<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|counterbalanceversion) + (1|SUBID), data=DataAD, REML=FALSE)
DTM2<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1+imagetype|counterbalanceversion) + (1|SUBID), data=DataAD, REML=FALSE)
DTM3<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|counterbalanceversion) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
DTM4<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1+imagetype|counterbalanceversion) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts
DTM5<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|scene) + (1|ad) + (1|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes (This is no good!)
# No Scene, varying intercepts only
DTM6<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|ad) + (1|SUBID), data=DataAD, REML=FALSE)
# No Ad, varying intercepts only
DTM7<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|scene) + (1|SUBID), data=DataAD, REML=FALSE)
# No Subjects, varying intercepts only
DTM8<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|scene) + (1|ad), data=DataAD, REML=FALSE)
# No Ad, varying intercepts and slopes
DTM9<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1+imagetype|scene) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes only varying by scene
DTM10<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1+imagetype|scene) + (1|ad) + (1|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes only varying by subject
DTM11<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1|scene) + (1|ad) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes only varying by scene and subject
DTM12<-lmer(IA_DWELL_TIME_PERCORRECTED~imagetype + (1+imagetype|scene) + (1|ad) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# ii) Compare models using AICc values
# Compute AICc values for each model
AICc<-c(AICc(DTM0), AICc(DTM1), AICc(DTM2), AICc(DTM3), AICc(DTM4), AICc(DTM5), AICc(DTM6), AICc(DTM7), AICc(DTM8), AICc(DTM9), AICc(DTM10), AICc(DTM11), AICc(DTM12))
# Put values into one table for easy comparision
Model<-c("M0", "M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8", "M9", "M10", "M11", "M12")
AICtable<-data.frame(DTModel=Model, DTAICc=AICc)
AICtable
## DTModel DTAICc
## 1 M0 -931.3028
## 2 M1 -928.5669
## 3 M2 -924.5117
## 4 M3 -927.6010
## 5 M4 -923.5286
## 6 M5 -1224.4389
## 7 M6 -1214.7732
## 8 M7 -1175.9760
## 9 M8 -1215.8943
## 10 M9 -1213.7084
## 11 M10 -1221.1809
## 12 M11 -1229.0776
## 13 M12 -1225.9118
#Let's do the same for number of fixations
# Linear model with no random effects
FCM0<-lm(IA_FIXATION_COUNT~imagetype,data=DataAD)
#What about just using counterbalance to combine across ad and scene?
FCM1<-lmer(IA_FIXATION_COUNT~imagetype + (1|counterbalanceversion) + (1|SUBID), data=DataAD, REML=FALSE)
FCM2<-lmer(IA_FIXATION_COUNT~imagetype + (1+imagetype|counterbalanceversion) + (1|SUBID), data=DataAD, REML=FALSE)
FCM3<-lmer(IA_FIXATION_COUNT~imagetype + (1|counterbalanceversion) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
FCM4<-lmer(IA_FIXATION_COUNT~imagetype + (1+imagetype|counterbalanceversion) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts
FCM5<-lmer(IA_FIXATION_COUNT~imagetype + (1|scene) + (1|ad) + (1|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes (This is no good!)
# No Scene, varying intercepts only
FCM6<-lmer(IA_FIXATION_COUNT~imagetype + (1|ad) + (1|SUBID), data=DataAD, REML=FALSE)
# No Ad, varying intercepts only
FCM7<-lmer(IA_FIXATION_COUNT~imagetype + (1|scene) + (1|SUBID), data=DataAD, REML=FALSE)
# No Subjects, varying intercepts only
FCM8<-lmer(IA_FIXATION_COUNT~imagetype + (1|scene) + (1|ad), data=DataAD, REML=FALSE)
# No Ad, varying intercepts and slopes
FCM9<-lmer(IA_FIXATION_COUNT~imagetype + (1+imagetype|scene) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes only varying by scene
FCM10<-lmer(IA_FIXATION_COUNT~imagetype + (1+imagetype|scene) + (1|ad) + (1|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes only varying by subject
FCM11<-lmer(IA_FIXATION_COUNT~imagetype + (1|scene) + (1|ad) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# Full model with varying intercepts and slopes only varying by scene and subject
FCM12<-lmer(IA_FIXATION_COUNT~imagetype + (1+imagetype|scene) + (1|ad) + (1+imagetype|SUBID), data=DataAD, REML=FALSE)
# ii) Compare models using AICc values
# Compute AICc values for each model
AICc<-c(AICc(FCM0), AICc(FCM1), AICc(FCM2), AICc(FCM3), AICc(FCM4), AICc(FCM5), AICc(FCM6), AICc(FCM7), AICc(FCM8), AICc(FCM9), AICc(FCM10), AICc(FCM11), AICc(FCM12))
# Put values into one table for easy comparision
Model<-c("M0", "M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8", "M9", "M10", "M11", "M12")
AICtable<-data.frame(FCModel=Model, FCAICc=AICc)
AICtable
## FCModel FCAICc
## 1 M0 4845.969
## 2 M1 4842.447
## 3 M2 4846.502
## 4 M3 4840.985
## 5 M4 4845.057
## 6 M5 4439.742
## 7 M6 4449.303
## 8 M7 4490.172
## 9 M8 4471.456
## 10 M9 4449.185
## 11 M10 4443.174
## 12 M11 4429.869
## 13 M12 4433.264
After more thought, counterbalance version may be the best fit for our model rather than modeling ad and scene type separately. DTM1 and DTM3 are really close to the lowest AICc value so these have the best model fit. FCM1 and FCM3 are also good fits for fixation count. Let’s look at them in more detail!
Note when you compare models they must be fit by Maximum Likelihood (ML) and not by Restricted Maximum Likelihood (REML). You may also want to compare BIC values for your models
Wonderful! Now we have found some well-fitting models lets take a peek at independence plot model fitted values vs residuals values:
E1 <- resid(DTM1)
F1<-fitted(DTM1)
plot(x = F1,
y = E1,
xlab = "Fitted Values DTM1",
ylab = "Normalized residuals DTM1")
abline(h = 0, lty = 2)
E1 <- resid(DTM3)
F1<-fitted(DTM3)
plot(x = F1,
y = E1,
xlab = "Fitted Values DTM3",
ylab = "Normalized residuals DTM3")
abline(h = 0, lty = 2)
E1 <- resid(FCM1)
F1<-fitted(FCM1 )
plot(x = F1,
y = E1,
xlab = "Fitted Values FCM1",
ylab = "Normalized residuals FCM1")
abline(h = 0, lty = 2)
E1 <- resid(FCM3)
F1<-fitted(FCM3)
plot(x = F1,
y = E1,
xlab = "Fitted Values FCM3",
ylab = "Normalized residuals FCM3")
abline(h = 0, lty = 2)
Notice that the Fixation count data is a bit striated, this is due to the fact that all values must be intergers.
Also notice that all graphs have heteroskedasticity issues due to a floor effect in the data.
To check the assumption of homogeneity let’s plot residuals vs each covariate in the model.
plot(x = DataAD$facimagetype,
y = E1,
xlab = "Image Type",
ylab = "Normalized residuals")
abline(h = 0, lty = 2)
# Counterbalance
boxplot(E1 ~ counterbalanceversion,
ylab = "Normalized residuals",
data = DataAD, xlab = "Scene")
abline(h = 0, lty = 2)
#Subject
boxplot(E1 ~ SUBID,
ylab = "Normalized residuals",
data = DataAD, xlab = "SUBID")
abline(h = 0, lty = 2)
These don’t look so bad, even though the floor effect is there.
It is also important to check the distribution of residual error. Normally distributed residuals indicate that your model predictions are not biased high or low:
hist(E1)
Perhaps we can do something to combat this, but perhaps not. Regardless lets look at the results of each model.
summary(DTM1)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula:
## IA_DWELL_TIME_PERCORRECTED ~ imagetype + (1 | counterbalanceversion) +
## (1 | SUBID)
## Data: DataAD
##
## AIC BIC logLik deviance df.resid
## -928.6 -904.3 469.3 -938.6 947
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.2765 -0.7462 -0.2096 0.4117 5.9403
##
## Random effects:
## Groups Name Variance Std.Dev.
## SUBID (Intercept) 0.0002396 0.01548
## counterbalanceversion (Intercept) 0.0000000 0.00000
## Residual 0.0216342 0.14709
## Number of obs: 952, groups: SUBID, 34; counterbalanceversion, 2
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 0.171485 0.007246 103.500000 23.668 < 2e-16 ***
## imagetypeTAT -0.051443 0.009534 918.000000 -5.396 8.69e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## imagetypTAT -0.658
summary(DTM3)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula:
## IA_DWELL_TIME_PERCORRECTED ~ imagetype + (1 | counterbalanceversion) +
## (1 + imagetype | SUBID)
## Data: DataAD
##
## AIC BIC logLik deviance df.resid
## -927.7 -893.7 470.9 -941.7 945
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.5334 -0.7786 -0.1982 0.4071 6.0198
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SUBID (Intercept) 0.0008998 0.03000
## imagetypeTAT 0.0005565 0.02359 -1.00
## counterbalanceversion (Intercept) 0.0000000 0.00000
## Residual 0.0214034 0.14630
## Number of obs: 952, groups: SUBID, 34; counterbalanceversion, 2
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 0.171485 0.008452 34.720000 20.29 < 2e-16 ***
## imagetypeTAT -0.051443 0.010310 71.190000 -4.99 4.13e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## imagetypTAT -0.755
summary(FCM1)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula: IA_FIXATION_COUNT ~ imagetype + (1 | counterbalanceversion) +
## (1 | SUBID)
## Data: DataAD
##
## AIC BIC logLik deviance df.resid
## 4842.4 4866.7 -2416.2 4832.4 947
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.4376 -0.7123 -0.1766 0.3907 5.9620
##
## Random effects:
## Groups Name Variance Std.Dev.
## SUBID (Intercept) 0.2786 0.5278
## counterbalanceversion (Intercept) 0.0000 0.0000
## Residual 9.1722 3.0286
## Number of obs: 952, groups: SUBID, 34; counterbalanceversion, 2
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.5294 0.1657 79.8000 21.297 < 2e-16 ***
## imagetypeTAT -0.9328 0.1963 918.0000 -4.751 2.34e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## imagetypTAT -0.592
summary(FCM3)
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
## approximations to degrees of freedom [lmerMod]
## Formula: IA_FIXATION_COUNT ~ imagetype + (1 | counterbalanceversion) +
## (1 + imagetype | SUBID)
## Data: DataAD
##
## AIC BIC logLik deviance df.resid
## 4840.9 4874.9 -2413.4 4826.9 945
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.7059 -0.7274 -0.1833 0.4119 5.9424
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## SUBID (Intercept) 0.6858 0.8281
## imagetypeTAT 0.2814 0.5305 -1.00
## counterbalanceversion (Intercept) 0.0000 0.0000
## Residual 9.0636 3.0106
## Number of obs: 952, groups: SUBID, 34; counterbalanceversion, 2
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 3.5294 0.1980 35.3200 17.823 < 2e-16 ***
## imagetypeTAT -0.9328 0.2153 80.9000 -4.332 4.21e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## imagetypTAT -0.750
All models are significant so a determination of which ACC fit makes more sense should be used to determine final model results. My sugestion would be to allow both models to vary by slope and intercept by both predictors (model 3) or have just the intercepts vary in both predictors(model 1). This way we can say that theoretically the variance in subject and counterbalance is the same across models where they should not differ.
One last thing- let’s make a DF of the means per subject to use in the combined analysis later.
DTaggdata <-aggregate(DataAD$IA_DWELL_TIME_PERCORRECTED, by=list(DataAD$numimagetype,DataAD$SUBID),
FUN=mean, na.rm=TRUE)
print(DTaggdata)
## Group.1 Group.2 x
## 1 1 1 0.12198413
## 2 2 1 0.08197171
## 3 1 2 0.14883077
## 4 2 2 0.08159989
## 5 1 3 0.23939642
## 6 2 3 0.10896748
## 7 1 5 0.18869858
## 8 2 5 0.14042786
## 9 1 6 0.10761006
## 10 2 6 0.09502446
## 11 1 8 0.20729742
## 12 2 8 0.12972673
## 13 1 9 0.14530253
## 14 2 9 0.19879541
## 15 1 10 0.21340089
## 16 2 10 0.18125713
## 17 1 11 0.14501741
## 18 2 11 0.10994640
## 19 1 12 0.23323606
## 20 2 12 0.11866725
## 21 1 14 0.17786531
## 22 2 14 0.11246920
## 23 1 15 0.31860680
## 24 2 15 0.11034648
## 25 1 16 0.16710946
## 26 2 16 0.14625539
## 27 1 17 0.21378201
## 28 2 17 0.10650205
## 29 1 18 0.12229575
## 30 2 18 0.10382075
## 31 1 19 0.15531267
## 32 2 19 0.11787954
## 33 1 20 0.10471750
## 34 2 20 0.10465575
## 35 1 22 0.23731551
## 36 2 22 0.18468002
## 37 1 24 0.16395596
## 38 2 24 0.13318788
## 39 1 25 0.16294718
## 40 2 25 0.11093522
## 41 1 27 0.13076341
## 42 2 27 0.08674490
## 43 1 28 0.23887032
## 44 2 28 0.11523520
## 45 1 29 0.19827336
## 46 2 29 0.12779866
## 47 1 30 0.15625231
## 48 2 30 0.10715036
## 49 1 32 0.10785662
## 50 2 32 0.11302180
## 51 1 35 0.16522266
## 52 2 35 0.16992282
## 53 1 36 0.12040043
## 54 2 36 0.12768641
## 55 1 37 0.16047409
## 56 2 37 0.08227407
## 57 1 39 0.17133867
## 58 2 39 0.09762460
## 59 1 40 0.22958294
## 60 2 40 0.10418866
## 61 1 41 0.08074497
## 62 2 41 0.12385691
## 63 1 43 0.14298332
## 64 2 43 0.07471174
## 65 1 44 0.17889840
## 66 2 44 0.08204926
## 67 1 46 0.17414837
## 68 2 46 0.19204107
FCaggdata <-aggregate(DataAD$IA_FIXATION_COUNT, by=list(DataAD$numimagetype,DataAD$SUBID),
FUN=mean, na.rm=TRUE)
print(FCaggdata)
## Group.1 Group.2 x
## 1 1 1 2.642857
## 2 2 1 1.785714
## 3 1 2 2.642857
## 4 2 2 1.857143
## 5 1 3 5.142857
## 6 2 3 2.785714
## 7 1 5 4.357143
## 8 2 5 3.142857
## 9 1 6 2.357143
## 10 2 6 1.928571
## 11 1 8 4.428571
## 12 2 8 2.714286
## 13 1 9 2.142857
## 14 2 9 3.000000
## 15 1 10 4.428571
## 16 2 10 3.714286
## 17 1 11 3.142857
## 18 2 11 2.857143
## 19 1 12 5.000000
## 20 2 12 2.500000
## 21 1 14 3.571429
## 22 2 14 2.000000
## 23 1 15 6.714286
## 24 2 15 3.000000
## 25 1 16 3.428571
## 26 2 16 2.857143
## 27 1 17 4.357143
## 28 2 17 2.571429
## 29 1 18 2.428571
## 30 2 18 2.214286
## 31 1 19 3.642857
## 32 2 19 2.428571
## 33 1 20 2.571429
## 34 2 20 2.571429
## 35 1 22 5.357143
## 36 2 22 3.785714
## 37 1 24 3.357143
## 38 2 24 2.500000
## 39 1 25 4.357143
## 40 2 25 3.642857
## 41 1 27 2.428571
## 42 2 27 1.928571
## 43 1 28 4.857143
## 44 2 28 2.571429
## 45 1 29 4.928571
## 46 2 29 3.785714
## 47 1 30 3.357143
## 48 2 30 2.357143
## 49 1 32 2.285714
## 50 2 32 2.357143
## 51 1 35 3.000000
## 52 2 35 3.500000
## 53 1 36 2.642857
## 54 2 36 3.000000
## 55 1 37 2.714286
## 56 2 37 1.500000
## 57 1 39 3.357143
## 58 2 39 1.928571
## 59 1 40 4.214286
## 60 2 40 2.714286
## 61 1 41 1.500000
## 62 2 41 1.714286
## 63 1 43 2.714286
## 64 2 43 1.928571
## 65 1 44 3.500000
## 66 2 44 1.857143
## 67 1 46 2.428571
## 68 2 46 3.285714
library(dplyr)
library(tidyr)
DTAggDataFAT<-filter(DTaggdata, DTaggdata$Group.1 == 1)
DTAggDataTAT<-filter(DTaggdata, DTaggdata$Group.1 == 2)
FCAggDataFAT<-filter(FCaggdata, FCaggdata$Group.1 == 1)
FCAggDataTAT<-filter(FCaggdata, FCaggdata$Group.1 == 2)
allaggFAT <- full_join(DTAggDataFAT,FCAggDataFAT, by = "Group.2")
allaggTAT <- full_join(DTAggDataTAT,FCAggDataTAT, by = "Group.2")
allagg <- full_join(allaggFAT,allaggTAT, by = "Group.2")
library(data.table)
## Warning: package 'data.table' was built under R version 3.2.5
## -------------------------------------------------------------------------
## data.table + dplyr code now lives in dtplyr.
## Please library(dtplyr)!
## -------------------------------------------------------------------------
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:dplyr':
##
## between, first, last
names <- c("FAT", "SUBID",'DTFAT',"FAT","FCFAT","TAT","DTTAT","TAT","FCTAT")
setnames(allagg, names)
allagg <- subset(allagg, select = c("SUBID",'DTFAT',"FCFAT","DTTAT","FCTAT"))
Now lets take a peek at the behavioral ratings. These ratings were collected via qualtrics. There are 3 types of ratings per image type; willingness to consume product, liking of product, and liking of ad. Averages per subject were caluclated across each rating.
First let’s load in the data
## Warning: package 'memisc' was built under R version 3.2.5
##
## Attaching package: 'memisc'
## The following object is masked from 'package:Matrix':
##
## as.array
## The following objects are masked from 'package:dplyr':
##
## collect, query, recode, rename
## The following objects are masked from 'package:Hmisc':
##
## %nin%, html
## The following objects are masked from 'package:stats':
##
## contr.sum, contr.treatment, contrasts
## The following object is masked from 'package:base':
##
## as.array
Let’s also add a column to indicate smoking behavior at session.
hist(RatingData$LikeAd.m.FAT)
hist(RatingData$LikeFood.m.FAT)
hist(RatingData$Consume.m.FAT)
hist(RatingData$LikeAd.m.TAT)
hist(RatingData$LikeFood.m.TAT)
hist(RatingData$Consume.m.TAT)
hist(RatingData$LikeAd.m.FOOD)
hist(RatingData$LikeFood.m.FOOD)
hist(RatingData$Consume.m.FOOD)
These histograms are expected. For all tat features involving flavor or consumtion, ratings are quite low!
First let’s look at correlations between FAT ratings
cor.test(RatingData$LikeAd.m.FAT, RatingData$LikeFood.m.FAT)
##
## Pearson's product-moment correlation
##
## data: RatingData$LikeAd.m.FAT and RatingData$LikeFood.m.FAT
## t = 3.9078, df = 32, p-value = 0.0004535
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2849814 0.7603884
## sample estimates:
## cor
## 0.5683725
r <- round(cor(RatingData$LikeAd.m.FAT, RatingData$LikeFood.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$LikeAd.m.FAT, RatingData$LikeFood.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(LikeAd.m.FAT, LikeFood.m.FAT, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(RatingData$Consume.m.FAT, RatingData$LikeFood.m.FAT)
##
## Pearson's product-moment correlation
##
## data: RatingData$Consume.m.FAT and RatingData$LikeFood.m.FAT
## t = -0.56654, df = 32, p-value = 0.575
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4235449 0.2468314
## sample estimates:
## cor
## -0.09965228
r <- round(cor(RatingData$Consume.m.FAT, RatingData$LikeFood.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$Consume.m.FAT, RatingData$LikeFood.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(Consume.m.FAT, LikeFood.m.FAT, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(RatingData$Consume.m.FAT, RatingData$LikeAd.m.FAT)
##
## Pearson's product-moment correlation
##
## data: RatingData$Consume.m.FAT and RatingData$LikeAd.m.FAT
## t = 0.26796, df = 32, p-value = 0.7904
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2955786 0.3794114
## sample estimates:
## cor
## 0.04731662
r <- round(cor(RatingData$Consume.m.FAT, RatingData$LikeAd.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$Consume.m.FAT, RatingData$LikeAd.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(Consume.m.FAT, LikeAd.m.FAT, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
It’s clear that any use (using either a cigarette or e-cig) is profoundly driving the use ratings.
Now let’s look at these same plots for food.
cor.test(RatingData$LikeAd.m.FOOD, RatingData$LikeFood.m.FOOD)
##
## Pearson's product-moment correlation
##
## data: RatingData$LikeAd.m.FOOD and RatingData$LikeFood.m.FOOD
## t = 6.1421, df = 32, p-value = 7.199e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5289764 0.8598484
## sample estimates:
## cor
## 0.7355633
r <- round(cor(RatingData$LikeAd.m.FOOD, RatingData$LikeFood.m.FOOD), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$LikeAd.m.FOOD, RatingData$LikeFood.m.FOOD)$p.value, 4)
p <- paste('p =',p)
qplot(LikeAd.m.FOOD, LikeFood.m.FOOD, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(RatingData$Consume.m.FOOD, RatingData$LikeFood.m.FOOD)
##
## Pearson's product-moment correlation
##
## data: RatingData$Consume.m.FOOD and RatingData$LikeFood.m.FOOD
## t = 6.6332, df = 32, p-value = 1.759e-07
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.5691685 0.8741316
## sample estimates:
## cor
## 0.7608843
r <- round(cor(RatingData$Consume.m.FOOD, RatingData$LikeFood.m.FOOD), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$Consume.m.FOOD, RatingData$LikeFood.m.FOOD)$p.value, 4)
p <- paste('p =',p)
qplot(Consume.m.FOOD, LikeFood.m.FOOD, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(RatingData$Consume.m.FOOD, RatingData$LikeAd.m.FOOD)
##
## Pearson's product-moment correlation
##
## data: RatingData$Consume.m.FOOD and RatingData$LikeAd.m.FOOD
## t = 4.4336, df = 32, p-value = 0.0001022
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3521665 0.7901965
## sample estimates:
## cor
## 0.6168689
r <- round(cor(RatingData$Consume.m.FOOD, RatingData$LikeAd.m.FOOD), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$Consume.m.FOOD, RatingData$LikeAd.m.FOOD)$p.value, 4)
p <- paste('p =',p)
qplot(Consume.m.FOOD, LikeAd.m.FOOD, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
Anyuse appears to be equally distributed, as it should be!
And now for TAT images
cor.test(RatingData$LikeAd.m.TAT, RatingData$LikeFood.m.TAT)
##
## Pearson's product-moment correlation
##
## data: RatingData$LikeAd.m.TAT and RatingData$LikeFood.m.TAT
## t = 2.8692, df = 32, p-value = 0.007231
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1348114 0.6856379
## sample estimates:
## cor
## 0.4523549
r <- round(cor(RatingData$LikeAd.m.TAT, RatingData$LikeFood.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$LikeAd.m.TAT, RatingData$LikeFood.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(LikeAd.m.TAT, LikeFood.m.TAT, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 4.5,1.6,label= r)+annotate('text', 4.5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(RatingData$Consume.m.TAT, RatingData$LikeFood.m.TAT)
##
## Pearson's product-moment correlation
##
## data: RatingData$Consume.m.TAT and RatingData$LikeFood.m.TAT
## t = 11.011, df = 32, p-value = 2.039e-12
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7885029 0.9437737
## sample estimates:
## cor
## 0.8894907
r <- round(cor(RatingData$Consume.m.TAT, RatingData$LikeFood.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$Consume.m.TAT, RatingData$LikeFood.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(Consume.m.TAT, LikeFood.m.TAT, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 1.5,1.7,label= r)+annotate('text', 1.5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(RatingData$Consume.m.TAT, RatingData$LikeAd.m.TAT)
##
## Pearson's product-moment correlation
##
## data: RatingData$Consume.m.TAT and RatingData$LikeAd.m.TAT
## t = 2.9807, df = 32, p-value = 0.005457
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1519498 0.6947996
## sample estimates:
## cor
## 0.4661621
r <- round(cor(RatingData$Consume.m.TAT, RatingData$LikeAd.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(RatingData$Consume.m.TAT, RatingData$LikeAd.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(Consume.m.TAT, LikeAd.m.TAT, data = RatingData, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 1.5,1.7,label= r)+annotate('text', 1.5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
Clearly some outliers in this data, specifically in the like food tat ratings. This only somewhat looks like it is driven by past use though.
library(foreign)
rawSurvey = read.spss("C:/Users/londeree.4/Dropbox/Lab-Londeree/osu-certs-pilot/experiment-surveys/Survey/SPSSSurveys.sav")
#Select columns of interest and save to a new data frame
SurveyAll <- data.frame(rawSurvey[c('SubID','everUse','cigIntent','ecigIntent', 'cigWill','ecigWill','FecigWill','cigProt', 'ecigProt','ecigAttitudes','aveSS')])
Survey <- SurveyAll[c(-4,-7,-13,-21,-23,-26,-31,-33,-34,-38,-42,-45),]
Now that everything is loaded in, let’s look at some descriptives.
describe(Survey)
## Survey
##
## 11 Variables 34 Observations
## ---------------------------------------------------------------------------
## SubID
## n missing distinct
## 34 0 34
##
## lowest : s01 s02 s03 s05 s06
## highest: s40 s41 s43 s44 s46
## ---------------------------------------------------------------------------
## everUse
## n missing distinct value
## 34 0 1 Not Used
##
## Value Not Used
## Frequency 34
## Proportion 1
## ---------------------------------------------------------------------------
## cigIntent
## n missing distinct Info Mean Gmd
## 34 0 3 0.242 1.088 0.1711
##
## Value 1.0 1.5 3.0
## Frequency 31 2 1
## Proportion 0.912 0.059 0.029
## ---------------------------------------------------------------------------
## ecigIntent
## n missing distinct Info Mean Gmd
## 34 0 4 0.716 1.25 0.3717
##
## Value 1.0 1.5 2.0 2.5
## Frequency 22 8 3 1
## Proportion 0.647 0.235 0.088 0.029
## ---------------------------------------------------------------------------
## cigWill
## n missing distinct Info Mean Gmd
## 34 0 4 0.64 1.206 0.3262
##
## Value 1.0 1.5 2.0 2.5
## Frequency 24 7 2 1
## Proportion 0.706 0.206 0.059 0.029
## ---------------------------------------------------------------------------
## ecigWill
## n missing distinct Info Mean Gmd
## 34 0 4 0.721 1.324 0.492
##
## Value 1.0 1.5 2.0 3.5
## Frequency 22 5 6 1
## Proportion 0.647 0.147 0.176 0.029
## ---------------------------------------------------------------------------
## FecigWill
## n missing distinct Info Mean Gmd
## 34 0 6 0.886 1.632 0.8102
##
## Value 1.0 1.5 2.0 2.5 3.0 4.0
## Frequency 16 5 6 4 2 1
## Proportion 0.471 0.147 0.176 0.118 0.059 0.029
## ---------------------------------------------------------------------------
## cigProt
## n missing distinct Info Mean Gmd .05 .10
## 34 0 25 0.997 3.088 1.45 1.465 1.630
## .25 .50 .75 .90 .95
## 2.200 2.900 3.775 4.630 5.170
##
## lowest : 1.1 1.4 1.5 1.6 1.7, highest: 4.0 4.9 5.1 5.3 8.4
## ---------------------------------------------------------------------------
## ecigProt
## n missing distinct Info Mean Gmd .05 .10
## 34 0 25 0.998 3.282 1.287 1.63 1.93
## .25 .50 .75 .90 .95
## 2.60 3.10 4.00 4.85 5.07
##
## lowest : 1.4 1.5 1.7 1.9 2.0, highest: 4.2 4.5 5.0 5.2 6.2
## ---------------------------------------------------------------------------
## ecigAttitudes
## n missing distinct Info Mean Gmd .05 .10
## 34 0 16 0.993 1.415 0.3207 1.033 1.050
## .25 .50 .75 .90 .95
## 1.200 1.400 1.600 1.735 1.852
##
## Value 1.00 1.05 1.15 1.20 1.25 1.30 1.35 1.40 1.45 1.50
## Frequency 2 3 1 5 1 2 2 3 2 3
## Proportion 0.059 0.088 0.029 0.147 0.029 0.059 0.059 0.088 0.059 0.088
##
## Value 1.60 1.70 1.75 1.80 1.95 2.15
## Frequency 2 4 1 1 1 1
## Proportion 0.059 0.118 0.029 0.029 0.029 0.029
## ---------------------------------------------------------------------------
## aveSS
## n missing distinct Info Mean Gmd .05 .10
## 34 0 15 0.993 1.996 0.7288 1.206 1.250
## .25 .50 .75 .90 .95
## 1.500 1.875 2.625 2.750 3.087
##
## Value 1.125 1.250 1.375 1.500 1.625 1.750 1.875 2.000 2.125 2.250
## Frequency 2 3 3 2 3 3 4 1 1 1
## Proportion 0.059 0.088 0.088 0.059 0.088 0.088 0.118 0.029 0.029 0.029
##
## Value 2.375 2.625 2.750 3.000 3.250
## Frequency 1 3 4 1 2
## Proportion 0.029 0.088 0.118 0.029 0.059
## ---------------------------------------------------------------------------
Let’s look and see how are measures are related and distributed, but not care too much about thier correlations.
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:arm':
##
## logit, rescale, sim
## The following object is masked from 'package:Hmisc':
##
## describe
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
pairs.panels(Survey[sapply(Survey, is.numeric)], stars =T)
First let’s load in the data we want specifically, and look at a messy pairwise comparison.
alldata <- bind_cols(allagg,RatingData,Survey)
#alldata = alldata[c(-4,-7,-13,-21,-23,-26,-31,-33,-34,-38,-42,-45),]
write.csv(alldata, file = "C:/Users/londeree.4/Dropbox/Lab-Londeree/osu-certs-pilot/dissemination/papers/eye-tracking/alldata_nonsmokers.csv")
pairs.panels(alldata[c(2:5,7:9,13:15,19:21)], stars =T)
#pairs.panels(alldata[c(2:5,27:33)], stars =T)
#pairs.panels(alldata[c(7:9,13:15,19:21,27:31)], stars =T)
These charts are too much, but they give us a sense of what we are working with. Let’s take a closer look at the fat dwell time and fixation count correlations with behavioral ratings. Let’s also note the variace of these ratings.
FAT like ad SD
sd(alldata$LikeAd.m.FAT )
## [1] 1.17223
FAT like food SD
sd( alldata$LikeFood.m.FAT)
## [1] 1.516121
FAT consume SD
sd(alldata$Consume.m.FAT)
## [1] 0.4164966
cor.test(alldata$DTFAT, alldata$LikeFood.m.FAT)
##
## Pearson's product-moment correlation
##
## data: alldata$DTFAT and alldata$LikeFood.m.FAT
## t = 1.3407, df = 32, p-value = 0.1895
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1166513 0.5276300
## sample estimates:
## cor
## 0.2306115
r <- round(cor(alldata$DTFAT, alldata$LikeFood.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$DTFAT, alldata$LikeFood.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(DTFAT, LikeFood.m.FAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .3,2,label= r)+annotate('text', .3,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$DTFAT, alldata$LikeAd.m.FAT)
##
## Pearson's product-moment correlation
##
## data: alldata$DTFAT and alldata$LikeAd.m.FAT
## t = 1.4524, df = 32, p-value = 0.1561
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.09769838 0.54132565
## sample estimates:
## cor
## 0.2486834
r <- round(cor(alldata$DTFAT, alldata$LikeAd.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$DTFAT, alldata$LikeAd.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(DTFAT, LikeAd.m.FAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .3,2,label= r)+annotate('text', .3,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$DTFAT, alldata$Consume.m.FAT)
##
## Pearson's product-moment correlation
##
## data: alldata$DTFAT and alldata$Consume.m.FAT
## t = 1.037, df = 32, p-value = 0.3075
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1681116 0.4886742
## sample estimates:
## cor
## 0.1803043
r <- round(cor(alldata$DTFAT, alldata$Consume.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$DTFAT, alldata$Consume.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(DTFAT, Consume.m.FAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .3,2,label= r)+annotate('text', .3,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$FCFAT, alldata$LikeFood.m.FAT)
##
## Pearson's product-moment correlation
##
## data: alldata$FCFAT and alldata$LikeFood.m.FAT
## t = 1.514, df = 32, p-value = 0.1398
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.08725136 0.54873260
## sample estimates:
## cor
## 0.2585428
r <- round(cor(alldata$DTFAT, alldata$LikeFood.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$FCFAT, alldata$LikeFood.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(FCFAT, LikeFood.m.FAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$FCFAT, alldata$LikeAd.m.FAT)
##
## Pearson's product-moment correlation
##
## data: alldata$FCFAT and alldata$LikeAd.m.FAT
## t = 1.3444, df = 32, p-value = 0.1883
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1160167 0.5280940
## sample estimates:
## cor
## 0.2312205
r <- round(cor(alldata$FCFAT, alldata$LikeAd.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$FCFAT, alldata$LikeAd.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(FCFAT, LikeAd.m.FAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,2,label= r)+annotate('text', 5,1.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$FCFAT, alldata$Consume.m.FAT)
##
## Pearson's product-moment correlation
##
## data: alldata$FCFAT and alldata$Consume.m.FAT
## t = 0.43441, df = 32, p-value = 0.6669
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.2685507 0.4042666
## sample estimates:
## cor
## 0.07656856
r <- round(cor(alldata$FCFAT, alldata$Consume.m.FAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$FCFAT, alldata$Consume.m.FAT)$p.value, 4)
p <- paste('p =',p)
qplot(FCFAT, Consume.m.FAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 5,4,label= r)+annotate('text', 5,4.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
From this we can see that the variance is good for all fat measures. Only the consume correlations seem to be particularly driven by users.
Now let’s take a look at the same for TAT, starting with and examination of variance.
The SD for TAT like ad
sd(alldata$LikeAd.m.TAT )
## [1] 1.002548
The SD for TAT like food
sd( alldata$LikeFood.m.TAT)
## [1] 0.07024935
The SD for TAT consume
sd(alldata$Consume.m.TAT)
## [1] 0.1507123
Now let’s look at the correlations and individually plot the relationships between eye-tracking measures and other measures.
cor.test(alldata$DTTAT, alldata$LikeFood.m.TAT)
##
## Pearson's product-moment correlation
##
## data: alldata$DTTAT and alldata$LikeFood.m.TAT
## t = 0.10867, df = 32, p-value = 0.9141
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3210438 0.3550666
## sample estimates:
## cor
## 0.01920712
r <- round(cor(alldata$DTTAT, alldata$LikeFood.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$DTTAT, alldata$LikeFood.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(DTTAT, LikeFood.m.TAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .15,1.5,label= r)+annotate('text', .15,1.3,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$DTTAT, alldata$LikeAd.m.TAT)
##
## Pearson's product-moment correlation
##
## data: alldata$DTTAT and alldata$LikeAd.m.TAT
## t = 2.0674, df = 32, p-value = 0.04687
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.005762012 0.610552609
## sample estimates:
## cor
## 0.3432589
r <- round(cor(alldata$DTTAT, alldata$LikeAd.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$DTTAT, alldata$LikeAd.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(DTTAT, LikeAd.m.TAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .15,1.5,label= r)+annotate('text', .15,1.3,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$DTTAT, alldata$Consume.m.TAT)
##
## Pearson's product-moment correlation
##
## data: alldata$DTTAT and alldata$Consume.m.TAT
## t = -0.050681, df = 32, p-value = 0.9599
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3460761 0.3302072
## sample estimates:
## cor
## -0.008958881
r <- round(cor(alldata$DTTAT, alldata$Consume.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$DTTAT, alldata$Consume.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(DTTAT, Consume.m.TAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', .15,1.5,label= r)+annotate('text', .15,1.3,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$FCTAT, alldata$LikeFood.m.TAT)
##
## Pearson's product-moment correlation
##
## data: alldata$FCTAT and alldata$LikeFood.m.TAT
## t = 0.071792, df = 32, p-value = 0.9432
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3268783 0.3493566
## sample estimates:
## cor
## 0.01269011
r <- round(cor(alldata$DTTAT, alldata$LikeFood.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$FCTAT, alldata$LikeFood.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(FCTAT, LikeFood.m.TAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 3,1.5,label= r)+annotate('text', 3,1.3,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$FCTAT, alldata$LikeAd.m.TAT)
##
## Pearson's product-moment correlation
##
## data: alldata$FCTAT and alldata$LikeAd.m.TAT
## t = 2.2028, df = 32, p-value = 0.03493
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.02815657 0.62441180
## sample estimates:
## cor
## 0.3628672
r <- round(cor(alldata$FCTAT, alldata$LikeAd.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$FCTAT, alldata$LikeAd.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(FCTAT, LikeAd.m.TAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text',3,4,label= r)+annotate('text', 3,3.5,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
cor.test(alldata$FCTAT, alldata$Consume.m.TAT)
##
## Pearson's product-moment correlation
##
## data: alldata$FCTAT and alldata$Consume.m.TAT
## t = -0.19365, df = 32, p-value = 0.8477
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3681191 0.3075111
## sample estimates:
## cor
## -0.03421238
r <- round(cor(alldata$FCTAT, alldata$Consume.m.TAT), 4)
r<- paste('r =',r)
p <- round(cor.test(alldata$FCTAT, alldata$Consume.m.TAT)$p.value, 4)
p <- paste('p =',p)
qplot(FCTAT, Consume.m.TAT, data = alldata, main= "Rating Correlation") + geom_smooth(method = 'lm')+ annotate('text', 3,1.5,label= r)+annotate('text', 3,1.3,label= p)+geom_point(aes(colour= factor(demographics$AnyUse)))
From this we can see that the variance is quite low for tat like food and consume. Still the correlation between like ad is trending for dwell time and significant for fixation count.
t.test(alldata$LikeAd.m.FAT,alldata$LikeAd.m.TAT, paired = T)
##
## Paired t-test
##
## data: alldata$LikeAd.m.FAT and alldata$LikeAd.m.TAT
## t = 5.2361, df = 33, p-value = 9.194e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.7243403 1.6449407
## sample estimates:
## mean of the differences
## 1.184641