Psychometrics HW #5

McDonald’s Correlated Uniqueness & MTMM

This homework assignment will be focused on running McDonald’s Correlated Uniqueness model to derive validity estimates (i.e., convergent, discriminant) & to examine the presence of common method bias. An example is also provided on how to conduct a more traditional Multi-Trait Multi-Method analysis using the ‘multicon’ package in R.

Packages Needed for this Assignment:

library(lavaan)
library(semTools)
library(multicon)
library(psych)
library(semPlot)

Part I

The file “HW6amod.csv” contains continuous propensity response data for 1000 individuals for two constructs.

Items 1-5 load only onto Factor 1 and Items 6-10 load only onto Factor 2. (6pts)

data.p1<- read.csv("HW6amod.csv", header=F)

1. Perform a confirmatory factor analysis of the two independent clusters, allowing the latent variables to correlate.

Remember the default in lavaan allows the factors to correlate
However, make sure you tell lavaan if you want to identify the model some other way than its default (the marker variable method). Below, std.lv=TRUE fixes each factor variance to 1 instead, so every loading is freely estimated

Run the model

CFA.model<- 'F1 =~ V1 + V2 + V3 + V4 + V5
             F2 =~ V6 + V7 + V8 + V9 + V10'

cfa.q1<- cfa(CFA.model, data=data.p1, std.lv=TRUE)

Extract the results

summary(cfa.q1, standardized=T, fit.measures=T, rsquare=T)

## lavaan 0.6-3 ended normally after 21 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         21
## 
##   Number of observations                          1000
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                      97.619
##   Degrees of freedom                                34
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic             6075.203
##   Degrees of freedom                                45
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.989
##   Tucker-Lewis Index (TLI)                       0.986
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)             -11807.027
##   Loglikelihood unrestricted model (H1)     -11758.218
## 
##   Number of free parameters                         21
##   Akaike (AIC)                               23656.054
##   Bayesian (BIC)                             23759.117
##   Sample-size adjusted Bayesian (BIC)        23692.420
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.043
##   90 Percent Confidence Interval          0.033  0.053
##   P-value RMSEA <= 0.05                          0.857
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.026
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   F1 =~                                                                 
##     V1                0.872    0.029   30.440    0.000    0.872    0.815
##     V2                0.845    0.028   30.336    0.000    0.845    0.813
##     V3                0.804    0.030   26.722    0.000    0.804    0.745
##     V4                0.701    0.029   23.798    0.000    0.701    0.685
##     V5                0.887    0.027   32.888    0.000    0.887    0.857
##   F2 =~                                                                 
##     V6                0.856    0.028   30.423    0.000    0.856    0.814
##     V7                0.899    0.030   30.369    0.000    0.899    0.814
##     V8                0.838    0.030   28.292    0.000    0.838    0.775
##     V9                0.709    0.031   22.526    0.000    0.709    0.656
##     V10               0.914    0.028   32.996    0.000    0.914    0.858
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   F1 ~~                                                                 
##     F2                0.760    0.017   44.149    0.000    0.760    0.760
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .V1                0.384    0.022   17.770    0.000    0.384    0.335
##    .V2                0.366    0.021   17.833    0.000    0.366    0.338
##    .V3                0.517    0.027   19.489    0.000    0.517    0.444
##    .V4                0.556    0.027   20.344    0.000    0.556    0.531
##    .V5                0.284    0.018   15.938    0.000    0.284    0.265
##    .V6                0.372    0.021   17.865    0.000    0.372    0.337
##    .V7                0.413    0.023   17.897    0.000    0.413    0.338
##    .V8                0.466    0.025   18.938    0.000    0.466    0.399
##    .V9                0.664    0.032   20.663    0.000    0.664    0.569
##    .V10               0.299    0.019   15.960    0.000    0.299    0.263
##     F1                1.000                               1.000    1.000
##     F2                1.000                               1.000    1.000
## 
## R-Square:
##                    Estimate
##     V1                0.665
##     V2                0.662
##     V3                0.556
##     V4                0.469
##     V5                0.735
##     V6                0.663
##     V7                0.662
##     V8                0.601
##     V9                0.431
##     V10               0.737

Report the Chi-square value, df, its significance, and the CFI, and RMSEA.

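Chi-square = 97.619
DF = 34
p < .001 (the chi-square test is significant)
CFI = 0.989
RMSEA = .043 (90% CI = .033 to .053; p-value for the test of close fit, RMSEA <= .05, = .857)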
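If you're curious where these indices come from, the sketch below recomputes the chi-square p-value, CFI, TLI, and RMSEA from the values printed in the summary. It uses the usual textbook formulas with N (rather than N - 1) in the RMSEA denominator, which is an assumption about what lavaan does under ML here. The variable names are made up for illustration.

```r
# Values copied from the lavaan summary above
chisq_m <- 97.619;   df_m <- 34   # user model
chisq_b <- 6075.203; df_b <- 45   # baseline (independence) model
n <- 1000                         # number of observations

# p-value of the model chi-square test
p_val <- pchisq(chisq_m, df = df_m, lower.tail = FALSE)

# CFI: 1 minus the ratio of model to baseline noncentrality
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)

# TLI: compares the chi-square/df ratios of the two models
tli <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)

# RMSEA: noncentrality per degree of freedom, per observation (N assumed)
rmsea <- sqrt(max(chisq_m - df_m, 0) / (df_m * n))

round(c(p = p_val, CFI = cfi, TLI = tli, RMSEA = rmsea), 3)
```

The rounded results should agree with the CFI (.989), TLI (.986), and RMSEA (.043) in the summary.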

2. Report the factor loading and item uniqueness information obtained from the CFA.

Remember, we can easily extract both of these using the ‘inspect’ command

Loadings (standardized)

inspect(cfa.q1, what="std")$lambda

##        F1    F2
## V1  0.815 0.000
## V2  0.813 0.000
## V3  0.745 0.000
## V4  0.685 0.000
## V5  0.857 0.000
## V6  0.000 0.814
## V7  0.000 0.814
## V8  0.000 0.775
## V9  0.000 0.656
## V10 0.000 0.858

Item uniqueness (error variance of the indicators)

inspect(cfa.q1, what="std")$theta

##     V1    V2    V3    V4    V5    V6    V7    V8    V9    V10  
## V1  0.335                                                      
## V2  0.000 0.338                                                
## V3  0.000 0.000 0.444                                          
## V4  0.000 0.000 0.000 0.531                                    
## V5  0.000 0.000 0.000 0.000 0.265                              
## V6  0.000 0.000 0.000 0.000 0.000 0.337                        
## V7  0.000 0.000 0.000 0.000 0.000 0.000 0.338                  
## V8  0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.399            
## V9  0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.569      
## V10 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.263
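Because each item loads on only one factor, the standardized uniqueness is just 1 minus the squared standardized loading, and the squared loading is the R-square. The sketch below checks this with the values typed in from the two matrices above (so expect small rounding differences in the third decimal). The object names are made up for illustration.

```r
# Standardized loadings copied from the lambda matrix above
lambda_std <- c(V1 = 0.815, V2 = 0.813, V3 = 0.745, V4 = 0.685, V5 = 0.857,
                V6 = 0.814, V7 = 0.814, V8 = 0.775, V9 = 0.656, V10 = 0.858)

# Standardized uniquenesses copied from the diagonal of theta
theta_std <- c(V1 = 0.335, V2 = 0.338, V3 = 0.444, V4 = 0.531, V5 = 0.265,
               V6 = 0.337, V7 = 0.338, V8 = 0.399, V9 = 0.569, V10 = 0.263)

communality <- lambda_std^2     # variance explained by the factor (R-square)
uniqueness  <- 1 - communality  # variance left over for the residual

# Side-by-side comparison with the reported uniquenesses
round(cbind(communality, uniqueness, reported = theta_std), 3)
```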

3. Calculate and report McDonald’s Omega for each factor. Also determine the internal consistency of the items for each factor. How do these alpha coefficients compare to the Omega values?

The package ‘semTools’ works with lavaan and is handy for a variety of functions, including extracting McDonald’s Omega. Simply use the command ‘reliability’ to do so.

You’ll also notice that this command gives you Cronbach’s Alpha as well! No need to run additional code like we previously did using the psych package. It also gives you ‘omega2’ and ‘omega3’, which use different estimates of the total variance in the denominator; according to the semTools documentation, ‘omega3’ is the version sometimes referred to as hierarchical omega.

reliability(cfa.q1)

##               F1        F2     total
## alpha  0.8874650 0.8870474 0.9193022
## omega  0.8890688 0.8892868 0.9338467
## omega2 0.8890688 0.8892868 0.9338467
## omega3 0.8894135 0.8907433 0.9315403
## avevar 0.6173671 0.6180819 0.6177336

F1 & F2 have nearly identical Alpha (~.887) and Omega (~.889) coefficients
Omega is slightly higher than Alpha for both factors, but not by much.

## | Internal Consistency   | F1            | F2     |
## |------------------------|:-------------:|:------:|
## | Alpha                  | .887          | .887   |
## | Omega                  | .889          | .889   |
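To see what Omega is doing under the hood, here is a minimal sketch that rebuilds it from the unstandardized loadings and residual variances in the CFA summary. It assumes the simple congeneric formula (no cross-loadings, no correlated residuals, factor variance fixed to 1), which fits this model. The helper ‘omega_by_hand’ is a made-up name, not a semTools function.

```r
# Unstandardized loadings and residual variances copied from the CFA summary
lambda <- list(F1 = c(0.872, 0.845, 0.804, 0.701, 0.887),
               F2 = c(0.856, 0.899, 0.838, 0.709, 0.914))
theta  <- list(F1 = c(0.384, 0.366, 0.517, 0.556, 0.284),
               F2 = c(0.372, 0.413, 0.466, 0.664, 0.299))

# omega = (sum of loadings)^2 / [(sum of loadings)^2 + sum of uniquenesses]
# (the factor variance drops out because it was fixed to 1 with std.lv=TRUE)
omega_by_hand <- function(l, th) sum(l)^2 / (sum(l)^2 + sum(th))

omega <- mapply(omega_by_hand, lambda, theta)
round(omega, 3)
```

Both values should land within rounding error of the ‘omega’ row reported by semTools.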

4. Construct and present a factorial validity table of convergent and discriminant validity coefficients using McDonald’s Omega. Do these validities appear appropriately high or low?

Convergent validity here is the square root of the Omega coefficient for each respective factor
Discriminant validity here is the square root of the Omega coefficient multiplied by the correlation of the two factors (0.760)

# factor 1 (square root of omega)
sqrt(0.8890688) 
## [1] 0.9429044

# factor 2 (square root of omega)
sqrt(0.8892868) 
## [1] 0.94302

Now take the above values and multiply them by the correlation of the factors

#F1 
(0.9429044 * 0.76)
## [1] 0.7166073

#F2
(0.94302 * 0.76)
## [1] 0.7166952

When the numbers are rounded, F1 & F2 end up with essentially the same convergent validity (~.943) and the same discriminant validity (~.717)!

Although convergent validity is high for both factors (~.943), discriminant validity is quite poor (~.717). Ideally, we want the discriminant validity coefficient to be much smaller than the convergent one.

## | Validity coefficient       | F1            | F2     |
## |----------------------------|:-------------:|:------:|
## | Y1 (convergent)            | 0.943         | 0.943  |
## | Y2 (discriminant)          | 0.717         | 0.717  |

5. Calculate and present the item information. For each factor, which item is the most informative?

For F1 – item V5 has the highest factor loading (0.887 unstandardized; .857 standardized), the lowest standardized uniqueness (.265), and the highest R² value (.735)
For F2 – item V10 has the highest factor loading (.914 unstandardized; .858 standardized), the lowest standardized uniqueness (.263), and the highest R² value (.737)
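One common way to put a single number on item information in the linear factor model is the ratio of the squared loading to the uniqueness. Assuming that is the definition intended here, the sketch below computes it from the unstandardized estimates in the CFA summary and picks out the top item per factor. The object names are made up for illustration.

```r
# Unstandardized loadings and residual variances copied from the CFA summary
lambda <- c(V1 = 0.872, V2 = 0.845, V3 = 0.804, V4 = 0.701, V5 = 0.887,
            V6 = 0.856, V7 = 0.899, V8 = 0.838, V9 = 0.709, V10 = 0.914)
theta  <- c(V1 = 0.384, V2 = 0.366, V3 = 0.517, V4 = 0.556, V5 = 0.284,
            V6 = 0.372, V7 = 0.413, V8 = 0.466, V9 = 0.664, V10 = 0.299)

# Item information (assumed definition): squared loading over uniqueness
info <- lambda^2 / theta
round(info, 3)

# Which item carries the most information within each factor?
factor_of <- rep(c("F1", "F2"), each = 5)
best <- tapply(info, factor_of, function(x) names(x)[which.max(x)])
best
```

With these estimates the ratio points to the same items as the loadings and R² values do (V5 for F1, V10 for F2).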

Part II

Three different traits (anxiety, depression, aggression) were measured by three different methods (Questionnaire, Interview, Observation).

Download the file “HW6b.csv” & perform a multi-trait multi-method analysis to examine construct validity based on the correlated uniqueness CFA approach. (4pts)

data.p2<-read.csv("HW6bb.csv", header=T) #read in with headers (the first row holds the variable names)

Correlated Uniqueness CFA approach.

For this question, let’s first define each of our three factors. Remember, each trait will be defined by itself from the three different methods used.

Remember, we also want our residuals to correlate BETWEEN traits but WITHIN the same type of method used (e.g., correlate the residuals of the interview items for anxiety, aggression, and depression)
In lavaan, to correlate residuals, simply list the manifest variable of interest and then add ‘~~’ going to whatever other manifest variable you would like it to have correlated residuals with
- For example, if you want the residual variance of ‘Vbl1’ to correlate with ‘Vbl2’ and ‘Vbl3’, you can separate ‘Vbl2’ and ‘Vbl3’ with a ‘+’ to make it easier
- So, in the model syntax it would look like: ‘Vbl1 ~~ Vbl2 + Vbl3’

MTMM.mod<- 'anxiety =~ anx_q + anx_i + anx_obs  #each trait factor is measured by its 3 method indicators
            depression =~ dep_q + dep_i + dep_obs
            aggression =~ agg_q + agg_i + agg_obs


#correlate residuals, respectively for each method
anx_q ~~ dep_q + agg_q     #allow anx/depress/aggress QUESTIONNAIRE indicators to have cor residuals
dep_q ~~ agg_q

dep_i ~~ anx_i + agg_i    #now do same for interview method for the 3 traits
anx_i ~~ agg_i

dep_obs ~~ anx_obs + agg_obs   #and finally the same for the Observation method
anx_obs ~~ agg_obs
'
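Before running the model, it can help to count parameters so you know what degrees of freedom to expect. The sketch below does the bookkeeping for this correlated uniqueness model, assuming the factor variances are fixed to 1 (as with std.lv=TRUE), so all nine loadings are free. The variable names are made up for illustration.

```r
p <- 9                               # observed variables (3 traits x 3 methods)
n_moments <- p * (p + 1) / 2         # unique variances and covariances

n_loadings   <- 9                    # all loadings free (factor variances fixed to 1)
n_resid_var  <- 9                    # one residual variance per indicator
n_factor_cov <- choose(3, 2)         # correlations among the 3 trait factors
n_resid_cov  <- 3 * choose(3, 2)     # 3 residual covariances within each of 3 methods

n_free   <- n_loadings + n_resid_var + n_factor_cov + n_resid_cov
df_model <- n_moments - n_free

c(moments = n_moments, free = n_free, df = df_model)
```

This gives 30 free parameters and 15 degrees of freedom, which is what lavaan reports for the fitted model.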

Running the actual model (equivalent to typical CFA set-up - setting variance of factors to 1 and allowing them to correlate with each other)

MTMM.unique<- cfa(MTMM.mod, data=data.p2, std.lv=TRUE)
summary(MTMM.unique, fit.measures=T, standardized=T, rsquare=T)

## lavaan 0.6-3 ended normally after 26 iterations
## 
##   Optimization method                           NLMINB
##   Number of free parameters                         30
## 
##   Number of observations                           378
## 
##   Estimator                                         ML
##   Model Fit Test Statistic                      59.797
##   Degrees of freedom                                15
##   P-value (Chi-square)                           0.000
## 
## Model test baseline model:
## 
##   Minimum Function Test Statistic              984.348
##   Degrees of freedom                                36
##   P-value                                        0.000
## 
## User model versus baseline model:
## 
##   Comparative Fit Index (CFI)                    0.953
##   Tucker-Lewis Index (TLI)                       0.887
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -4400.850
##   Loglikelihood unrestricted model (H1)      -4370.951
## 
##   Number of free parameters                         30
##   Akaike (AIC)                                8861.699
##   Bayesian (BIC)                              8979.746
##   Sample-size adjusted Bayesian (BIC)         8884.563
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.089
##   90 Percent Confidence Interval          0.066  0.113
##   P-value RMSEA <= 0.05                          0.003
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.055
## 
## Parameter Estimates:
## 
##   Information                                 Expected
##   Information saturated (h1) model          Structured
##   Standard Errors                             Standard
## 
## Latent Variables:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   anxiety =~                                                            
##     anx_q             0.672    0.049   13.761    0.000    0.672    0.721
##     anx_i             0.678    0.044   15.273    0.000    0.678    0.810
##     anx_obs           0.592    0.063    9.429    0.000    0.592    0.508
##   depression =~                                                         
##     dep_q             0.744    0.050   14.908    0.000    0.744    0.757
##     dep_i             0.715    0.047   15.311    0.000    0.715    0.782
##     dep_obs           0.571    0.053   10.796    0.000    0.571    0.568
##   aggression =~                                                         
##     agg_q             0.655    0.058   11.380    0.000    0.655    0.625
##     agg_i             0.748    0.069   10.866    0.000    0.748    0.594
##     agg_obs           0.833    0.057   14.544    0.000    0.833    0.815
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .anx_q ~~                                                              
##    .dep_q             0.025    0.031    0.793    0.428    0.025    0.060
##    .agg_q             0.007    0.034    0.209    0.835    0.007    0.013
##  .dep_q ~~                                                              
##    .agg_q             0.003    0.035    0.089    0.929    0.003    0.006
##  .anx_i ~~                                                              
##    .dep_i            -0.032    0.026   -1.238    0.216   -0.032   -0.115
##  .dep_i ~~                                                              
##    .agg_i             0.087    0.040    2.197    0.028    0.087    0.151
##  .anx_i ~~                                                              
##    .agg_i            -0.006    0.036   -0.163    0.871   -0.006   -0.012
##  .anx_obs ~~                                                            
##    .dep_obs           0.082    0.047    1.738    0.082    0.082    0.099
##  .dep_obs ~~                                                            
##    .agg_obs          -0.029    0.036   -0.801    0.423   -0.029   -0.060
##  .anx_obs ~~                                                            
##    .agg_obs          -0.040    0.043   -0.916    0.360   -0.040   -0.067
##   anxiety ~~                                                            
##     depression        0.675    0.047   14.218    0.000    0.675    0.675
##     aggression        0.430    0.060    7.179    0.000    0.430    0.430
##   depression ~~                                                         
##     aggression        0.486    0.057    8.540    0.000    0.486    0.486
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .anx_q             0.416    0.047    8.939    0.000    0.416    0.480
##    .anx_i             0.241    0.040    5.993    0.000    0.241    0.344
##    .anx_obs           1.009    0.081   12.467    0.000    1.009    0.742
##    .dep_q             0.412    0.049    8.415    0.000    0.412    0.426
##    .dep_i             0.325    0.044    7.452    0.000    0.325    0.389
##    .dep_obs           0.683    0.057   12.056    0.000    0.683    0.677
##    .agg_q             0.671    0.064   10.533    0.000    0.671    0.610
##    .agg_i             1.025    0.092   11.113    0.000    1.025    0.647
##    .agg_obs           0.350    0.068    5.150    0.000    0.350    0.335
##     anxiety           1.000                               1.000    1.000
##     depression        1.000                               1.000    1.000
##     aggression        1.000                               1.000    1.000
## 
## R-Square:
##                    Estimate
##     anx_q             0.520
##     anx_i             0.656
##     anx_obs           0.258
##     dep_q             0.574
##     dep_i             0.611
##     dep_obs           0.323
##     agg_q             0.390
##     agg_i             0.353
##     agg_obs           0.665

inspect(MTMM.unique, what="std")$theta

##         anx_q  anx_i  anx_bs dep_q  dep_i  dep_bs agg_q  agg_i  agg_bs
## anx_q    0.480                                                        
## anx_i    0.000  0.344                                                 
## anx_obs  0.000  0.000  0.742                                          
## dep_q    0.060  0.000  0.000  0.426                                   
## dep_i    0.000 -0.115  0.000  0.000  0.389                            
## dep_obs  0.000  0.000  0.099  0.000  0.000  0.677                     
## agg_q    0.013  0.000  0.000  0.006  0.000  0.000  0.610              
## agg_i    0.000 -0.012  0.000  0.000  0.151  0.000  0.000  0.647       
## agg_obs  0.000  0.000 -0.067  0.000  0.000 -0.060  0.000  0.000  0.335

Plotting the model:

semPaths(MTMM.unique, what="std")

Now Let’s Revisit the HW Questions….

1. Create a table like the one shown in lab slide.

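The standardized loadings and uniquenesses from the above analysis have been put into table format below (the correlated uniquenesses are the off-diagonal entries of the theta matrix above)

## | Indicator | Method        | Trait factor | Loading | Uniqueness |
## |-----------|---------------|--------------|:-------:|:----------:|
## | anx_q     | Questionnaire | anxiety      | .721    | .480       |
## | anx_i     | Interview     | anxiety      | .810    | .344       |
## | anx_obs   | Observation   | anxiety      | .508    | .742       |
## | dep_q     | Questionnaire | depression   | .757    | .426       |
## | dep_i     | Interview     | depression   | .782    | .389       |
## | dep_obs   | Observation   | depression   | .568    | .677       |
## | agg_q     | Questionnaire | aggression   | .625    | .610       |
## | agg_i     | Interview     | aggression   | .594    | .647       |
## | agg_obs   | Observation   | aggression   | .815    | .335       |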

2. Is there evidence of convergent validity? Explain.

Trait factor loadings that are large and statistically significant indicate good convergent validity.

In this case, the standardized loadings for anxiety, depression, and aggression range from ~.51 to ~.82 and are all statistically significant, which is indicative of reasonably good convergent validity. The weakest indicators are the observation ratings of anxiety (.508) and depression (.568). Put simply, the three methods used to measure each construct appear to cluster together well, and appear to be reasonable measurements of the underlying construct.

3. Is there evidence of discriminant validity? Explain.

Small correlations among the different trait factors indicate good discriminant validity.

Depression and aggression correlate at .486
Anxiety and depression correlate at .675
Anxiety and aggression correlate at .430
Overall, discriminant validity is not great for any of these traits, but is clearly (and unsurprisingly) the poorest for anxiety and depression
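A rough way to put numbers on this is a Wald-type 95% interval around each factor correlation, built from the estimates and standard errors in the summary above. This is only a sketch: the normal-approximation interval is a simplification, and checking whether the interval stays well below 1 is one possible criterion, not the only one.

```r
# Factor correlations and standard errors copied from the summary above
est <- c(anx_dep = 0.675, anx_agg = 0.430, dep_agg = 0.486)
se  <- c(anx_dep = 0.047, anx_agg = 0.060, dep_agg = 0.057)

# Wald-type 95% confidence limits (normal approximation)
crit <- qnorm(0.975)
ci <- cbind(estimate = est,
            lower = est - crit * se,
            upper = est + crit * se)
round(ci, 3)
```

None of the intervals come close to 1, so the traits are distinguishable even though they are clearly related.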

4. Is there evidence of “method effects”? Explain.

To investigate if common method effects are present, we want to look at our residual correlations.

For the questionnaire method - residual correlations are non-significant and quite small (.006 to .060), suggesting no method effects are present
- The same is true for the Observation method (residual correlations from -.067 to .099, all non-significant)
The interview ratings, however, are a bit trickier…the residuals for depression and aggression are significantly correlated (.151, p = .028), which is some evidence for a method effect, though it is still small overall
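To see how the standardized residual correlation relates to the raw estimates, the sketch below rebuilds the .151 for the depression and aggression interview ratings from the residual covariance and the two residual variances, and recovers the two-sided p-value from the reported z-value. All inputs are typed in from the summary above, so small rounding differences are expected.

```r
# Interview-method estimates copied from the summary above
cov_dep_agg <- 0.087   # residual covariance, dep_i with agg_i
var_dep_i   <- 0.325   # residual variance of dep_i
var_agg_i   <- 1.025   # residual variance of agg_i

# Residual correlation = covariance / product of residual standard deviations
r_resid <- cov_dep_agg / sqrt(var_dep_i * var_agg_i)

# Two-sided p-value from the reported z-value
z_val <- 2.197
p_val <- 2 * pnorm(-abs(z_val))

round(c(r = r_resid, p = p_val), 3)
```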

Let’s compare this with results from the ‘MTMM’ command in the multicon package

This command requires the columns in the data to be ordered in sets such that the first set is the first trait rated by each method, followed by the second trait rated by each method.

#group items by trait 
methQanx<- data.p2[(1)]
methIanx<- data.p2[(4)]
methOanx<- data.p2[(7)]

methQdep<- data.p2[(2)]
methIdep<- data.p2[(5)]
methOdep<- data.p2[(8)]

methQagg<- data.p2[(3)]
methIagg<- data.p2[(6)]
methOagg<- data.p2[(9)]

#Merge the subsets back into one dataframe using the 'cbind.data.frame' command
mtmm.revise<-cbind.data.frame(methQanx, methIanx, methOanx, methQdep, 
                              methIdep, methOdep, methQagg, methIagg, methOagg)
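Selecting by position only works if the columns of the file are ordered questionnaire, then interview, then observation, with anxiety, depression, and aggression within each method. A more defensive alternative is to reorder by name. The sketch below shows the idea on a toy data frame (random numbers standing in for data.p2), assuming the column names are the ones used in the model syntax above.

```r
# Toy stand-in for data.p2: random numbers, only the column names matter
set.seed(1)
method_order <- c("anx_q", "dep_q", "agg_q", "anx_i", "dep_i", "agg_i",
                  "anx_obs", "dep_obs", "agg_obs")
toy <- as.data.frame(matrix(rnorm(9 * 5), ncol = 9,
                            dimnames = list(NULL, method_order)))

# Desired order: each trait in turn, rated by questionnaire, interview, observation
traits  <- c("anx", "dep", "agg")
methods <- c("q", "i", "obs")
trait_order <- as.vector(outer(methods, traits,
                               function(m, t) paste(t, m, sep = "_")))

# Reorder the columns by name instead of by position
toy_by_trait <- toy[, trait_order]
names(toy_by_trait)
```

Swapping ‘toy’ for the real data frame would give the same layout as mtmm.revise, but with the original column names kept.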

Arguments for ‘MTMM’

‘x’ = A data.frame organized such that each column represents the ratings for each Trait-Method combination. The columns must be ordered in sets such that the first set is the first trait rated by each method, followed by the second trait rated by each method
‘traits’ = An integer indicating the total number of different traits rated
‘methods’ = An integer indicating the total number of methods used

MTMM(mtmm.revise, 3, 3) #this command is in multicon package

##         SameTrait SameMethod  DiffDiff
## Results 0.4582781  0.2748662 0.2750668

Interpreting the output from MTMM analysis

SameTrait = The average correlation for the Same Traits rated by Different Methods (convergent validity)
SameMethod = The average correlation for the Same Methods used to rate the Different Traits (method bias)
DiffDiff = The average correlation for the Different Traits rated by Different Methods (discriminant validity)
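To make these three averages concrete, here is a minimal sketch that computes them by hand from a correlation matrix. It uses simulated data with made-up column names and plain averages of the raw correlations; multicon may average differently (for example, on a transformed scale), so treat this as an illustration of the logic rather than a reimplementation of ‘MTMM’.

```r
# Simulated ratings: 3 traits x 3 methods, columns grouped by trait
set.seed(123)
traits  <- c("anx", "dep", "agg")
methods <- c("q", "i", "obs")
vars <- as.vector(outer(methods, traits, function(m, t) paste(t, m, sep = "_")))
toy <- as.data.frame(matrix(rnorm(200 * 9), ncol = 9,
                            dimnames = list(NULL, vars)))

R <- cor(toy)

# Label every column with its trait and its method
trait_of  <- rep(traits, each = 3)
method_of <- rep(methods, times = 3)

# Logical masks over the correlation matrix (lower triangle only)
same_trait  <- outer(trait_of, trait_of, "==")
same_method <- outer(method_of, method_of, "==")
lower <- lower.tri(R)

mtmm_by_hand <- c(
  SameTrait  = mean(R[lower & same_trait & !same_method]),   # convergent
  SameMethod = mean(R[lower & !same_trait & same_method]),   # method bias
  DiffDiff   = mean(R[lower & !same_trait & !same_method])   # discriminant
)
round(mtmm_by_hand, 3)
```

With 3 traits and 3 methods there are 9 same-trait pairs, 9 same-method pairs, and 18 different-trait different-method pairs, which together cover all 36 off-diagonal correlations.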

Our Results From the MTMM Command Suggest…

Same Trait is .46, suggesting that convergent validity is decent across the different traits
Same Method is .27, which is essentially the same as DiffDiff; sharing a method barely raises the correlation between different traits, so there is little evidence of method bias
DiffDiff is .28, which is well below Same Trait (.46) and implies that there is decent discriminant validity between different traits

Conclusions

Results from McDonald’s Correlated Uniqueness approach and from the ‘MTMM’ command yielded some similarities, though more information can be extracted using McDonald’s method. Overall, it seems as though the model has good convergent validity and fair discriminant validity, though the results from the MTMM command indicated better discriminant validity than the results produced from McDonald’s method. The latter suggests that anxiety, depression, and aggression are still pretty correlated (factor correlations of .43 to .68), and ideally we would want them to have a smaller association. However, method bias was small in both analyses, suggesting that only a small portion of variance is likely an artifact of the type of method used.