PSY 3750 — Well-Being Improvement

Read the data in.

library(xlsx)

## Loading required package: xlsxjars
## Loading required package: rJava

library(Hmisc)

## Loading required package: survival
## Loading required package: splines
## Loading required package: Formula
## Hmisc library by Frank E Harrell Jr
## 
## Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview')
## to see overall documentation.
## 
## 
## Attaching package: 'Hmisc'
## 
## The following object is masked from 'package:survival':
## 
##     untangle.specials
## 
## The following object is masked from 'package:base':
## 
##     format.pval, round.POSIXt, trunc.POSIXt, units

library(car)

## Warning: package 'car' was built under R version 3.0.2

## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:Hmisc':
## 
##     recode

library(psych)

## 
## Attaching package: 'psych'
## 
## The following object is masked from 'package:car':
## 
##     logit
## 
## The following object is masked from 'package:Hmisc':
## 
##     describe

## options(java.parameters = "-Xmx4000m")
setwd("~/Dropbox/service/friends/SasakiP")

wbd.raw  <- read.xlsx("PSY3750_pre_post.xls", sheetIndex=1, startRow=2, header=TRUE)
names(wbd.raw)

## [1] "ID"     "NA."    "Daily"  "Wkly"   "Pre"    "Post"   "Pre.1"  "Post.1"

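# Keep only the rows whose ID is all digits, and only the condition and score columns.
wbd <- subset(wbd.raw, subset=(grepl("^[0-9]+$", ID)),
              select=c(Daily, Pre, Post, Pre.1, Post.1))
names(wbd) <- c("daily", "gq6.pre", "gq6.post", "swls.pre", "swls.post")
# Rows with an entry in the Daily column are coded "daily"; all others are "weekly".
wbd$daily <- factor(ifelse(!is.na(wbd$daily), "daily", "weekly"))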

There are a couple of ways to handle this. I like computing the pre-post differences and going from there. I could also run a 2 \(\times\) 2 mixed ANOVA (daily vs. weekly by pre vs. post) and get the same results. With the differences, the pre-post effect shows up as the intercept; without them, it would show up as a main effect of time. Let’s see.
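The equivalence between the two approaches can be checked on made-up numbers. The sketch below uses simulated scores (not the class data) and, as a simplifying assumption, ignores the daily/weekly factor. It compares the intercept test on the difference scores with a paired t test on post versus pre.

```r
# Simulated pre/post scores; these are NOT the class data.
set.seed(3750)
n    <- 22
pre  <- rnorm(n, mean = 35, sd = 6)
post <- pre + rnorm(n, mean = 2, sd = 3)

# Approach 1: compute the differences and test the intercept.
d      <- post - pre
fit    <- lm(d ~ 1)
t.diff <- coef(summary(fit))["(Intercept)", "t value"]

# Approach 2: paired t test on the raw scores (the pre-post effect).
t.paired <- unname(t.test(post, pre, paired = TRUE)$statistic)

# The two t statistics agree up to floating-point error.
all.equal(t.diff, t.paired)
```

Both statistics are the mean difference divided by its standard error, which is why they coincide.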

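# Convert the post scores to numeric, going through character in case they were read as factors.
wbd$gq6.post <- as.numeric(as.character(wbd$gq6.post))
wbd$swls.post <- as.numeric(as.character(wbd$swls.post))
# Difference scores: positive values mean improvement from pre to post.
wbd$gq6.diff <- wbd$gq6.post - wbd$gq6.pre
wbd$swls.diff <- wbd$swls.post - wbd$swls.pre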

Exploratory Data Analysis

Okay, first looking at the subject-by-subject diffs. They look good, but it does not look like much is going on with daily versus weekly.

plot of chunk unnamed-chunk-3

Check the means and standard deviations.

describe(wbd)

##           var  n  mean   sd median trimmed  mad min max range  skew
## daily*      1 22  1.50 0.51    1.5    1.50 0.74   1   2     1  0.00
## gq6.pre     2 22 34.95 6.00   36.0   35.94 3.71  13  41    28 -2.17
## gq6.post    3 22 37.09 5.62   39.0   38.11 2.97  16  42    26 -2.42
## swls.pre    4 22 24.91 7.08   26.5   25.39 4.45   8  35    27 -0.74
## swls.post   5 22 27.00 5.32   27.5   27.72 3.71  11  35    24 -1.25
## gq6.diff    6 22  2.14 3.04    2.0    2.06 2.22  -3   9    12  0.30
## swls.diff   7 22  2.09 3.78    2.0    1.89 2.97  -4  11    15  0.47
##           kurtosis   se
## daily*       -2.09 0.11
## gq6.pre       5.44 1.28
## gq6.post      6.28 1.20
## swls.pre     -0.24 1.51
## swls.post     1.69 1.13
## gq6.diff     -0.36 0.65
## swls.diff    -0.36 0.81

summary(gq6.diff ~ daily, wbd, fun=smean.sd)

## gq6.diff    N=22
## 
## +-------+------+--+-----+-----+
## |       |      |N |Mean |SD   |
## +-------+------+--+-----+-----+
## |daily  |daily |11|2.364|3.668|
## |       |weekly|11|1.909|2.427|
## +-------+------+--+-----+-----+
## |Overall|      |22|2.136|3.044|
## +-------+------+--+-----+-----+

summary(swls.diff ~ daily, wbd, fun=smean.sd)

## swls.diff    N=22
## 
## +-------+------+--+-----+-----+
## |       |      |N |Mean |SD   |
## +-------+------+--+-----+-----+
## |daily  |daily |11|1.727|3.636|
## |       |weekly|11|2.455|4.059|
## +-------+------+--+-----+-----+
## |Overall|      |22|2.091|3.778|
## +-------+------+--+-----+-----+

Just in case, look at the pairwise scatter plots.

plot of chunk unnamed-chunk-5

Take a look at the scatter plot between the diffs using the two measures.

plot of chunk unnamed-chunk-6

I am curious about the correlation between the measures. Looking at the plots above, the two measures seem to agree much more closely in the daily group than in the weekly group. I did not double-check, but the large correlation in the daily group may be driven largely by a single data point. Since I don’t really know what the measures are, I am not sure why this would happen even if it is a real effect. With so few students in the class, it is hard to know what to make of the pattern.

with(wbd, cor.test(gq6.diff, swls.diff))

## 
##  Pearson's product-moment correlation
## 
## data:  gq6.diff and swls.diff
## t = 1.265, df = 20, p-value = 0.2205
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1689  0.6223
## sample estimates:
##    cor 
## 0.2721

with(subset(wbd, daily=="daily"), cor.test(gq6.diff, swls.diff))

## 
##  Pearson's product-moment correlation
## 
## data:  gq6.diff and swls.diff
## t = 1.928, df = 9, p-value = 0.08599
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.08775  0.86119
## sample estimates:
##    cor 
## 0.5406

with(subset(wbd, daily=="weekly"), cor.test(gq6.diff, swls.diff))

## 
##  Pearson's product-moment correlation
## 
## data:  gq6.diff and swls.diff
## t = -0.1386, df = 9, p-value = 0.8928
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.6286  0.5695
## sample estimates:
##      cor 
## -0.04614
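One way to check whether a single data point drives a correlation is to drop each observation in turn and recompute it. The sketch below does this on made-up difference scores (not the class data); `loo.cor` is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: correlation with each observation dropped in turn.
loo.cor <- function(x, y) {
  stopifnot(length(x) == length(y))
  vapply(seq_along(x), function(i) cor(x[-i], y[-i]), numeric(1))
}

# Made-up difference scores with one extreme point in the last position.
x <- c(-1,  0, 1, 2, 0,  1, -2, 3, 1,  0,  9)
y <- c( 2, -1, 0, 1, 3, -2,  1, 0, 2, -1, 11)

r.all <- cor(x, y)       # correlation using every point
r.loo <- loo.cor(x, y)   # correlation with point i removed

# A large gap between r.all and one element of r.loo flags an influential point.
round(r.all - r.loo, 2)
```

Applied to the daily group, the same helper would show whether the correlation of 0.54 survives the removal of any one student.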

Repeated Measures ANOVA

I don’t really know what to do with the two DVs. You could combine them by taking the mean of the two diffs. You can also check for a main effect of measure and an interaction: do the two DVs seem to be getting at something different? A main effect of “subscale” would mean that one measure is influenced more by the manipulation, and an interaction would mean that one is influenced more in one condition than the other. Of course, a daily vs. weekly main effect would mean that it matters how often you do the manipulation.
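As a minimal sketch of the first option, the two difference scores can be averaged into a single composite. The numbers below are made up (not the class data), and `comp.diff` is a name chosen for this illustration. Averaging raw differences also assumes the two scales are comparable in spread; if they are not, standardizing each difference first may be preferable.

```r
# Made-up difference scores for six students; NOT the class data.
toy <- data.frame(
  daily     = factor(rep(c("daily", "weekly"), each = 3)),
  gq6.diff  = c(2, 5, -1, 3,  0, 1),
  swls.diff = c(1, 4,  0, 6, -2, 2)
)

# Composite change score: the mean of the two difference scores.
toy$comp.diff <- rowMeans(toy[, c("gq6.diff", "swls.diff")])

# Group means of the composite.
comp.means <- tapply(toy$comp.diff, toy$daily, mean)
comp.means
```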

wbd.lm <- lm(as.matrix(wbd[,c("gq6.diff", "swls.diff")]) ~ daily, data=wbd)
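# Within-subject design; an explicit factor() keeps this working when strings are not auto-converted.
idata <- data.frame(subscale=factor(c("gq6", "swls")))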

wbd.Anova.ii <- Anova(wbd.lm, idata=idata, idesign=~subscale, type="II")
summary(wbd.Anova.ii, multivariate=FALSE)

## 
## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
## 
##                   SS num Df Error SS den Df     F Pr(>F)   
## (Intercept)    196.6      1      313     20 12.57  0.002 **
## daily            0.2      1      313     20  0.01  0.910   
## subscale         0.0      1      178     20  0.00  0.960   
## daily:subscale   3.8      1      178     20  0.43  0.518   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The intercept effect indicates that the manipulation worked: post was higher than pre. There was no evidence that the two measures showed different amounts of improvement, no evidence that it mattered whether the intervention was done daily or weekly, and no evidence that one measure was affected more than the other by daily versus weekly interventions. Because the \(N\) is so low, failure to reject the null is at least as likely to be due to a Type II error as to the null being true.
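To put a rough number on the Type II error concern, `power.t.test` from base R gives the power of a two-sample t test with 11 students per group. The effect size used here (half a standard deviation) is an assumption chosen for illustration, not an estimate from these data.

```r
# Power of a two-sample t test with 11 students per group.
# The effect size (delta = 0.5 with sd = 1) is an assumed value for illustration.
pw <- power.t.test(n = 11, delta = 0.5, sd = 1, sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided")
pw$power

# Per-group n needed for 80% power at the same assumed effect size.
need <- power.t.test(power = 0.80, delta = 0.5, sd = 1, sig.level = 0.05)
ceiling(need$n)
```

Under that assumption the power is well below one half, so a non-significant daily versus weekly comparison says little about whether the schedules differ.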

I am not really happy with the next analysis, which uses the raw pre and post scores instead of the differences. It shows the same things as the analysis above, but it throws in a lot of information about the raw values of the scores that I cannot conceive of as being relevant. In this case, only the effects that involve time examine the differences between pre and post. Those values of \(F\) and the corresponding \(p\)-values match the ones from the analysis of the differences.

pre.post <- 
  as.matrix(apply(wbd[,c("gq6.pre", "gq6.post", "swls.pre", "swls.post")], 2, as.numeric))
wbd.lm <- lm(pre.post ~ daily, data=wbd)
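# One row per response column, in order: gq6.pre, gq6.post, swls.pre, swls.post.
idata <- data.frame(subscale=factor(rep(c("gq6", "swls"), each=2)),
                    time=factor(rep(c("pre", "post"), times=2)))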

wbd.Anova.ii <- Anova(wbd.lm, idata=idata, idesign=~subscale*time, type="II")
summary(wbd.Anova.ii, multivariate=FALSE)

## 
## Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
## 
##                        SS num Df Error SS den Df      F Pr(>F)    
## (Intercept)         84506      1     2456     20 688.24 <2e-16 ***
## daily                   0      1     2456     20   0.00  0.992    
## subscale             2230      1      360     20 124.02  5e-10 ***
## daily:subscale          6      1      360     20   0.33  0.570    
## time                   98      1      156     20  12.57  0.002 ** 
## daily:time              0      1      156     20   0.01  0.910    
## subscale:time           0      1       89     20   0.00  0.960    
## daily:subscale:time     2      1       89     20   0.43  0.518    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

plot of chunk unnamed-chunk-10

This is the plot I think one would care about.

plot of chunk unnamed-chunk-11

Univariate ANOVAs

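I would have stopped after the last analysis, but just in case the original intent was to look at each variable separately, I do that here. First GQ-6. With R’s default treatment coding, the intercept is the mean change in the daily group (the reference level), and it indicates an improvement due to the manipulation.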

summary(wbd.gq6.lm <- lm(gq6.diff ~ daily, data=wbd))

## 
## Call:
## lm(formula = gq6.diff ~ daily, data = wbd)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.364 -1.773  0.091  1.091  6.636 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)    2.364      0.938    2.52     0.02 *
## dailyweekly   -0.455      1.326   -0.34     0.74  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.11 on 20 degrees of freedom
## Multiple R-squared:  0.00584,    Adjusted R-squared:  -0.0439 
## F-statistic: 0.117 on 1 and 20 DF,  p-value: 0.735

Then the Satisfaction with Life Scale. Here the intercept, which is again the mean change in the daily group only, does not differ significantly from zero. Based on the repeated measures work above, I would say that there is not really a good statistical reason to look at the scales separately. That does not mean that there was not a good theoretical reason.

summary(wbd.swls.lm <- lm(swls.diff ~ daily, data=wbd))

## 
## Call:
## lm(formula = swls.diff ~ daily, data = wbd)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.727 -2.273 -0.455  1.477  8.545 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)    1.727      1.162    1.49     0.15
## dailyweekly    0.727      1.643    0.44     0.66
## 
## Residual standard error: 3.85 on 20 degrees of freedom
## Multiple R-squared:  0.0097, Adjusted R-squared:  -0.0398 
## F-statistic: 0.196 on 1 and 20 DF,  p-value: 0.663
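The intercepts in the two univariate models above depend on how `daily` is coded. The sketch below, on made-up scores (not the class data), contrasts the default treatment coding, where the intercept is the mean of the reference group, with sum-to-zero coding, where the intercept is the mean of the group means.

```r
# Made-up difference scores, 4 per group; NOT the class data.
toy <- data.frame(
  daily = factor(rep(c("daily", "weekly"), each = 4)),
  diff  = c(1, 3, 2, 4, 0, 5, 2, 5)
)

# Default treatment contrasts: the intercept is the mean of the reference
# level ("daily"), not the overall mean.
fit.trt <- lm(diff ~ daily, data = toy)

# Sum-to-zero contrasts: the intercept is the unweighted mean of the two
# group means, which equals the overall mean when the groups are balanced.
fit.sum <- lm(diff ~ daily, data = toy, contrasts = list(daily = "contr.sum"))

coef(fit.trt)["(Intercept)"]  # mean of the daily group
coef(fit.sum)["(Intercept)"]  # mean of all eight scores
```

With sum-to-zero coding, the intercept test asks whether the average change across both groups differs from zero, which is closer to the question of whether the manipulation worked overall.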