submitted by: Navona
due date: 2019-11-26
last ran: 2019-11-26
website: http://rpubs.com/navona/PSY2002_assignment04
Build a path model, using the variables Fluid, Verbal, and Social Intelligence to predict Love for Basketball.
a. How do you define the model?
First.Model <- 'Luv4Baskt ~ FluidInt + VerbInt + SocInt'
b. Run the model using sem()
. Draw the appropriate path model (outcome, predictors, predictor covariance, and outcome residual error) and label all paths with their standardized weights.
#use semPaths package for plotting
semPaths(q1.model,
what='std', #plot the standardize scores
rotation=2, #to look like class example
nCharNodes=0, #plot full variable name
sizeMan=12, #font of manifest variables
edge.color='black', #edge colour
edge.label.cex= 1, #edge text size
nDigits = 3, #number of floats
fade=FALSE, #edges aren't gradient colour
style='lisrel', #remove residual variances for predictors
curve=2, #increase angle of covariance curve
filetype='jpg') #write out as an image
#read sem image back in
sem_image <- readJPEG("qgraph.jpg") #read the image back in
#feed sem image into ggplot, and annotate
qplot(1:10, 1:10, geom="blank") + #define graph size
background_image(sem_image) + #set sem image as background
annotate("text", x = 9.4, y = 6, label = "error") + #add error
annotate("text", x = 7, y = 9, size = 7, parse = TRUE, label = as.character(expression(paste(chi^2, "(0, N=200) = 0.000, ", italic("p"), "<0.000")))) + #add statistics
theme(axis.text = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank())
Note: Square nodes indicate manifest variables. Directed edges indicate linear regression parameters (thicker lines indicate larger path coefficients). Dashed bidirectional edges indicate covariances, and assumes that the variables are exogenous. Residual variance (error) of the predicted variable is indicated with thick solid arrow.
c. Describe and interpret the regression analysis as you would for publication. Comment on which paths are significant and include the proportion of variance in Luv4Baskt
that is predicted by its predictors.
Our path model tested if fluid, verbal, and/or social intelligence predicted love for basketball. We found that FludInt
was not a significant indicator of love for basketball (p=.067), but VerbInt
(p=.004) and SocInt
(p=<.001) were. Specifically, FludInt
accounted for 1.2% of the variance, where as VerbInt
and SocInt
accounted for 4% and 10.3%, respectively.
We also examined the global model fit with chi-square (\(X^2\)), which measures of deviance between the model-implied covariance matrix, and the observed covariance matrix. The \(X^2\) statistic was highly significant (p<.001), indicating poor fit.
a. Construct the same model as for Question 1, but this time, constrain the co-variances between the MVs to be 0. What is the chi-square and p value for this model? Is this model a good fit for this data?
#define the model
Second.Model <- 'Luv4Baskt ~ FluidInt + VerbInt + SocInt
FluidInt ~~ 0*VerbInt
FluidInt ~~ 0*SocInt
VerbInt ~~ 0*SocInt'
#run the model
q2.model = sem(Second.Model, data = AD)
#summarize
summary(q2.model)
When we constrained the co-variances between the predictors, all three paths were significant (fluid intelligence p = .036; verbal p = .001, social p < .001). However, the model fit remains poor, \(X^2\)(3, N=200) = 107.362, p<.000.
Likewise, the \(X^2/df\) ratio (Joreskog, 1969), which provides a measure of badness-of-fit, is very large (35.787); a ratio value of <1 is a typical heuristic of good fit.
b. Look at the Modification Indices for this model. What is expected to reduce the chi-square value the most, and how much is the chi-square value predicted to change by?
modification_q2 <- modindices(q2.model)
modification_q2[modification_q2$op == "~~",] #sort to include only indices for covariance
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 4 FluidInt ~~ VerbInt 20.584 8.317 8.317 0.321 0.321
## 5 FluidInt ~~ SocInt 45.010 24.393 24.393 0.474 0.474
## 6 VerbInt ~~ SocInt 46.992 13.945 13.945 0.485 0.485
Modication indices provide a rough estimate of how well the \(X^2\) test statistic of a model would improve, if a particular parameter(s) were unconstrained, i.e., free to covary with others. If parameter pairs with a high index are freed, model fit may improve.
We see that freeing the variance between VerbInt
and SocInt
is expected to reduce \(X^2\) the most. Specifically, the \(X^2\) value is expected to reduce by an index of 46.992 (mi
), and it is expected that the standardized expected parameter change (sepc.all
) will take a value of .485.
c. How would you define a new model with the variance unconstrained using the information from question 2(b)? Run the new model using sem()
and give the new chi-square and p values.
#define new model
Second.Model.Mod <- 'Luv4Baskt ~ FluidInt + VerbInt + SocInt
VerbInt ~~ SocInt
FluidInt ~~ 0*VerbInt
FluidInt ~~ 0*SocInt'
#run new model
q2.model.mod = sem(Second.Model.Mod, data = AD)
#see test statistic
show(q2.model.mod)
To run a new model with variance unconstrained between VerbInt
and SocInt
, but leave the varaince contrained between the other two combinations of manifest variables, we code:
Second.Model.Mod <- 'Luv4Baskt ~ FluidInt + VerbInt + SocInt
VerbInt ~~ SocInt
FluidInt ~~ 0*VerbInt
FluidInt ~~ 0*SocInt'
The test statistic for the modified model, with freed covariance between VerbInt
and SocInt
, is \(X^2\)(2, N=200) = 53.796, p<.000. Thus, our model remains a poor fit to our data.
Now we want to build a model with two unobserved (latent) variables, each of which underlies 3 of the observed variables. Generate a model where Fluid, Verbal and Social Intelligence are indicators for Factor 1 (“Intelligence”), while Raptors Dedication (RapDed
), Psychology Today (PsyToday
) and Stock Market (StkMrkt
) are indicators for Factor 2 (“Long-term life satisfaction (LTLS)”). We also think that Intelligence will predict LTLS, so define a regression between LTLS and Intelligence in the model.
a. In a couple of sentences, explain what we are trying to test with this model (what is our hypothesis)?
Path analyses allow for a separate ‘measurement’ and ‘structural’ hypothesis. Our measurement hypothesis is that (i) fluid, verbal, and social intelligence are indicators of global intelligence, and that (ii) dedication to the Raptors, reading ‘Psychology Today’, and investing in the stock market are indicators of long-term life satisfacton. Our structural hypothesis is that long-term life satisfaction has a positive effect on global intelligence.
b. How do you define this new model?
Third.Model <- 'Intelligence =~ FluidInt + VerbInt + SocInt
LTLS =~ RapDed + PsyToday + StkMrkt
Intelligence ~ LTLS'
c. Run the model using sem(). What are the standardized coefficients for our MVs, and the beta value between our LVs?
The standardized coefficients for the manifest variables are as follows:
#make table of manifest coefficients
q3.model_pars[1:6, c(1:3, 11)]
## lhs op rhs std.all
## 1 Intelligence =~ FluidInt 0.557
## 2 Intelligence =~ VerbInt 0.605
## 3 Intelligence =~ SocInt 0.819
## 4 LTLS =~ RapDed 0.878
## 5 LTLS =~ PsyToday 0.822
## 6 LTLS =~ StkMrkt 0.880
The beta value between the latent variables (Intelligence
and LTLS
) is β=.651:
#beta value between latent variables
q3.model_pars[7, c(1:3, 11)]
## lhs op rhs std.all
## 7 Intelligence ~ LTLS 0.651
d. How good is the model at explaining our observed data? Give the chi-square and p-value.
\(X^2\)(8, N=200)=17.538, p=.025. Thus, the model fit remains poor (but, it is improving).
e. Look at the modification indices and discuss what changes you might make to the model to improve the fit.
Click to show table of modification indices:
#print indices
(modification_q3 <- modindices(q3.model))
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 16 Intelligence =~ RapDed 1.713 0.102 0.386 0.102 0.102
## 17 Intelligence =~ PsyToday 0.041 -0.022 -0.084 -0.016 -0.016
## 18 Intelligence =~ StkMrkt 1.257 -0.204 -0.774 -0.087 -0.087
## 19 LTLS =~ FluidInt 0.816 -0.211 -0.702 -0.103 -0.103
## 20 LTLS =~ VerbInt 1.708 0.179 0.598 0.157 0.157
## 21 LTLS =~ SocInt 0.203 -0.173 -0.577 -0.076 -0.076
## 22 FluidInt ~~ VerbInt 0.203 -0.722 -0.722 -0.042 -0.042
## 23 FluidInt ~~ SocInt 1.708 5.393 5.393 0.220 0.220
## 24 FluidInt ~~ RapDed 3.006 1.640 1.640 0.159 0.159
## 25 FluidInt ~~ PsyToday 2.281 2.100 2.100 0.127 0.127
## 26 FluidInt ~~ StkMrkt 12.891 -7.929 -7.929 -0.332 -0.332
## 27 VerbInt ~~ SocInt 0.816 -2.338 -2.338 -0.178 -0.178
## 28 VerbInt ~~ RapDed 0.182 0.221 0.221 0.040 0.040
## 29 VerbInt ~~ PsyToday 2.247 -1.139 -1.139 -0.128 -0.128
## 30 VerbInt ~~ StkMrkt 2.659 1.972 1.972 0.154 0.154
## 31 SocInt ~~ RapDed 0.032 -0.170 -0.170 -0.022 -0.022
## 32 SocInt ~~ PsyToday 0.009 -0.130 -0.130 -0.010 -0.010
## 33 SocInt ~~ StkMrkt 0.007 0.192 0.192 0.010 0.010
## 34 RapDed ~~ PsyToday 1.257 -1.236 -1.236 -0.232 -0.232
## 35 RapDed ~~ StkMrkt 0.041 -0.459 -0.459 -0.060 -0.060
## 36 PsyToday ~~ StkMrkt 1.713 3.397 3.397 0.274 0.274
Most modification indices (mi
) are relatively low, suggesting that our model wouldn’t change much by freeing covariance between pairs of variables. However, one pair has an mi
value above 10:
subset(modification_q3, mi > 10) #sort to examine `mi` values over 10
## lhs op rhs mi epc sepc.lv sepc.all sepc.nox
## 26 FluidInt ~~ StkMrkt 12.891 -7.929 -7.929 -0.332 -0.332
When we sort our table to mi
values > 10, we see that we may be able to improve the fit by freeing the covariance between FluidInt
and StkMrkt
.
Now let’s test whether a model holds similarly for two different groups. For this question, we are going to test whether a given model holds for both West coast and East coast Canadians. Generate a model with a single unobserved variable (Intelligence) predicting the three observed intelligence measures and Love for Basketball. Free the variance for the first MV, and constrain the variance in the LV ‘Intelligence’ to 1.
a. How do you define this model?
Fourth.Model <- 'Intelligence =~ NA*FluidInt + VerbInt + SocInt + Luv4Baskt
Intelligence ~~ 1*Intelligence'
b. Run the model using sem()
. Then, run the model with loadings and intercepts equal between groups. Calculate an ANOVA between the original model and the group-equal model. Is there a difference between groups?
#run the model
q4.model <- sem(Fourth.Model, data = AD, group = "Group")
#run a new model with loading and intercepts equal between groups
q4.model.eq <- sem(Fourth.Model, data = AD,group = "Group",
group.equal= c("loadings","intercepts"))
#calculate an ANOVA
anova(q4.model,q4.model.eq)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## q4.model 4 4252.8 4331.9 3.0732
## q4.model.eq 11 4240.7 4296.8 5.0531 1.9798 7 0.9609
The test statistic for this difference model is \(X^2_{diff}\)(7, N=200)=1.9798, p=.9609.
Thus, the full equality model suggests that there is not a difference between groups, i.e., there is no difference between the three predictors of Intelligence, between West coast and East coast Canadians.
c. Next, we have reason to think that Westerners and Easterners differ in the influence of Intelligence on Verbal and Social scores, but the two groups have the same relationship between Love for Basketball and Fluid Intelligence. Run a new model with the appropriate freed and fixed parameters between groups. Compare this partial equality model to the earlier full equality model.
#equality constraint over only some parameters
frees <- c("Intelligence =~ VerbInt", "Intelligence =~ SocInt")
#run model
q4.model.partial <- sem(Fourth.Model, data = AD,
group = "Group",
group.equal= c("loadings","intercepts"),
group.partial = frees)
#calculate ANOVA to compare between full equality model and partial fit model
anova(q4.model.eq, q4.model.partial)
## Chi-Squared Difference Test
##
## Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
## q4.model.partial 9 4244.5 4307.2 4.8350
## q4.model.eq 11 4240.7 4296.8 5.0531 0.21806 2 0.8967
The test statistic for this difference model is \(X^2_{diff}\)(2, N=200)=.21806, p=.8967.
The described partial equality model suggests that there is not a difference between groups, i.e., there is no difference between the three predictors of Intelligence, between West coast and East coast Canadians. This is the same result as the full equality model above.
d. In a few sentences, summarize and interpret your results for this model. Discuss what MVs contribute most significantly, and whether they differ between groups.
subset(q4_parEst, group == 1)
## lhs op rhs block group label est se z
## 1 Intelligence =~ FluidInt 1 1 .p1. 3.884 0.507 7.666
## 2 Intelligence =~ VerbInt 1 1 2.387 0.421 5.670
## 3 Intelligence =~ SocInt 1 1 6.278 0.805 7.795
## 4 Intelligence =~ Luv4Baskt 1 1 .p4. 0.609 0.074 8.255
## 5 Intelligence ~~ Intelligence 1 1 1.000 0.000 NA
## 6 FluidInt ~~ FluidInt 1 1 32.031 5.234 6.120
## 7 VerbInt ~~ VerbInt 1 1 10.702 1.828 5.853
## 8 SocInt ~~ SocInt 1 1 20.910 6.797 3.076
## 9 Luv4Baskt ~~ Luv4Baskt 1 1 0.622 0.107 5.823
## 10 FluidInt ~1 1 1 .p10. 28.732 0.582 49.400
## 11 VerbInt ~1 1 1 .p11. 24.649 0.332 74.227
## 12 SocInt ~1 1 1 .p12. 14.430 0.736 19.614
## 13 Luv4Baskt ~1 1 1 .p13. 0.015 0.087 0.168
## 14 Intelligence ~1 1 1 0.000 0.000 NA
## pvalue ci.lower ci.upper std.lv std.all std.nox
## 1 0.000 2.891 4.877 3.884 0.566 0.566
## 2 0.000 1.562 3.212 2.387 0.589 0.589
## 3 0.000 4.699 7.856 6.278 0.808 0.808
## 4 0.000 0.464 0.753 0.609 0.611 0.611
## 5 NA 1.000 1.000 1.000 1.000 1.000
## 6 0.000 21.774 42.289 32.031 0.680 0.680
## 7 0.000 7.118 14.286 10.702 0.653 0.653
## 8 0.002 7.589 34.232 20.910 0.347 0.347
## 9 0.000 0.412 0.831 0.622 0.627 0.627
## 10 0.000 27.592 29.872 28.732 4.186 4.186
## 11 0.000 23.998 25.299 24.649 6.086 6.086
## 12 0.000 12.988 15.872 14.430 1.858 1.858
## 13 0.866 -0.156 0.185 0.015 0.015 0.015
## 14 NA 0.000 0.000 0.000 0.000 0.000
subset(q4_parEst, group == 2)
## lhs op rhs block group label est se z
## 15 Intelligence =~ FluidInt 2 2 .p1. 3.884 0.507 7.666
## 16 Intelligence =~ VerbInt 2 2 2.224 0.365 6.099
## 17 Intelligence =~ SocInt 2 2 5.860 0.748 7.835
## 18 Intelligence =~ Luv4Baskt 2 2 .p4. 0.609 0.074 8.255
## 19 Intelligence ~~ Intelligence 2 2 1.000 0.000 NA
## 20 FluidInt ~~ FluidInt 2 2 30.419 4.998 6.087
## 21 VerbInt ~~ VerbInt 2 2 7.680 1.368 5.612
## 22 SocInt ~~ SocInt 2 2 19.524 5.848 3.338
## 23 Luv4Baskt ~~ Luv4Baskt 2 2 0.626 0.107 5.875
## 24 FluidInt ~1 2 2 .p10. 28.732 0.582 49.400
## 25 VerbInt ~1 2 2 .p11. 24.649 0.332 74.227
## 26 SocInt ~1 2 2 .p12. 14.430 0.736 19.614
## 27 Luv4Baskt ~1 2 2 .p13. 0.015 0.087 0.168
## 28 Intelligence ~1 2 2 -0.047 0.164 -0.285
## pvalue ci.lower ci.upper std.lv std.all std.nox
## 15 0.000 2.891 4.877 3.884 0.576 0.576
## 16 0.000 1.509 2.939 2.224 0.626 0.626
## 17 0.000 4.394 7.326 5.860 0.798 0.798
## 18 0.000 0.464 0.753 0.609 0.610 0.610
## 19 NA 1.000 1.000 1.000 1.000 1.000
## 20 0.000 20.624 40.214 30.419 0.668 0.668
## 21 0.000 4.998 10.362 7.680 0.608 0.608
## 22 0.001 8.061 30.986 19.524 0.362 0.362
## 23 0.000 0.417 0.835 0.626 0.628 0.628
## 24 0.000 27.592 29.872 28.732 4.259 4.259
## 25 0.000 23.998 25.299 24.649 6.937 6.937
## 26 0.000 12.988 15.872 14.430 1.966 1.966
## 27 0.866 -0.156 0.185 0.015 0.015 0.015
## 28 0.776 -0.369 0.275 -0.047 -0.047 -0.047
The partial equality model provides a good model fit. Across both groups, our test statistic is \(X^2\)(9, N=200)=4.835, p=.848.
The \(X^2\) test statistic for group 1 (Western Canadians) is 0.964. The order of the manifest variables’ contribution is SocInt
(β=0.808), followed by Luv4Baskt
(β=0.611), VerbInt
(β=0.589), and FluidInt
(β=0.566).
The \(X^2\) test statistic for group 2 (Eastern Canadians) is 3.871. A similar though not identical pattern exists regarding the order of the manifest variables’ contribution: Socint
(β=0.798), VerbInt
(β=0.626), Luv4Baskt
(β=0.610), FluidInt
(β=0.576).