Psychometrics week 6

Exercise 1

For the wave 1987, put data for the items of your primary test as well as a background variable you suspect to be related to test in a data frame. Be sure, though, that the categories of the background variable are ordinal. Start with the original data. When in doubt, use the data from the SPSS system file TOTAAL.SAV using function read.spss from the package foreign. Before making the the following exercises, check wether there are indeed miss- ing data in your data frame (is.na).

require(foreign)

set.seed(1)

original.data <- read.spss("TOTAAL.SAV")
load(file = "SI.RData")

neuroticism <- original.data[Neuroticism87]
age <- original.data[Age87]

data <- data.frame(age, neuroticism)
data <- as.data.frame(lapply(data, c))

# Check for na:
sapply(data, function(x) mean(is.na(x)))

##     M37    S227    S237    S238    S249    S259    S268    S278 
## 0.00000 0.00507 0.00507 0.00507 0.00507 0.00507 0.00507 0.00507

Exercise 2

Use cross-validation to determine optimal parameters for k Near- est Neighbors Imputation. Use the function cv.kNNImpute. Compute the imputed datamatrix with kNNImpute.

require(imputation)

cv.kNNImpute(as.matrix(data), k = 300)

## $k
## [1] 299
## 
## $rmse
## [1] 6.388
## 
## $k.full
##   [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
##  [18]  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34
##  [35]  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51
##  [52]  52  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68
##  [69]  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85
##  [86]  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100 101 102
## [103] 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119
## [120] 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136
## [137] 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153
## [154] 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170
## [171] 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187
## [188] 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204
## [205] 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221
## [222] 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238
## [239] 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255
## [256] 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272
## [273] 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289
## [290] 290 291 292 293 294 295 296 297 298 299 300
## 
## $rmse.full
##   [1] 6.567 6.485 6.458 6.439 6.429 6.425 6.421 6.418 6.414 6.409 6.406
##  [12] 6.404 6.404 6.402 6.401 6.400 6.400 6.399 6.399 6.399 6.399 6.397
##  [23] 6.397 6.397 6.397 6.397 6.397 6.396 6.396 6.395 6.395 6.395 6.394
##  [34] 6.394 6.394 6.394 6.394 6.393 6.393 6.393 6.393 6.393 6.393 6.393
##  [45] 6.393 6.393 6.393 6.393 6.392 6.392 6.392 6.392 6.392 6.392 6.392
##  [56] 6.392 6.392 6.392 6.392 6.392 6.392 6.392 6.392 6.392 6.392 6.392
##  [67] 6.392 6.392 6.392 6.392 6.392 6.392 6.391 6.391 6.391 6.391 6.392
##  [78] 6.392 6.392 6.392 6.392 6.391 6.391 6.391 6.391 6.391 6.391 6.391
##  [89] 6.391 6.391 6.391 6.391 6.391 6.391 6.391 6.391 6.391 6.391 6.391
## [100] 6.390 6.391 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390
## [111] 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390
## [122] 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390 6.390
## [133] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [144] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [155] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [166] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [177] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [188] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [199] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [210] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [221] 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389 6.389
## [232] 6.389 6.389 6.389 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388
## [243] 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388
## [254] 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388
## [265] 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388
## [276] 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388
## [287] 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388 6.388
## [298] 6.388 6.388 6.388
## 
## $nrmse.full
##   [1] 1.240 1.151 1.134 1.110 1.099 1.087 1.079 1.074 1.064 1.059 1.055
##  [12] 1.054 1.052 1.049 1.047 1.047 1.046 1.045 1.046 1.048 1.048 1.046
##  [23] 1.046 1.046 1.046 1.046 1.045 1.044 1.043 1.043 1.044 1.043 1.043
##  [34] 1.042 1.042 1.042 1.041 1.041 1.042 1.042 1.042 1.043 1.043 1.043
##  [45] 1.042 1.043 1.043 1.042 1.042 1.042 1.042 1.042 1.042 1.042 1.042
##  [56] 1.041 1.041 1.040 1.040 1.040 1.040 1.040 1.040 1.040 1.039 1.040
##  [67] 1.039 1.040 1.039 1.039 1.040 1.040 1.040 1.040 1.040 1.040 1.040
##  [78] 1.040 1.040 1.040 1.040 1.040 1.040 1.040 1.040 1.039 1.040 1.040
##  [89] 1.039 1.039 1.040 1.040 1.040 1.040 1.040 1.040 1.039 1.040 1.040
## [100] 1.039 1.040 1.039 1.039 1.039 1.040 1.039 1.039 1.039 1.039 1.039
## [111] 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039
## [122] 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039 1.039
## [133] 1.039 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038
## [144] 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038 1.038
## [155] 1.038 1.038 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037
## [166] 1.037 1.037 1.037 1.037 1.037 1.038 1.037 1.037 1.037 1.037 1.037
## [177] 1.037 1.037 1.037 1.037 1.037 1.037 1.038 1.038 1.037 1.037 1.037
## [188] 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037
## [199] 1.038 1.038 1.038 1.038 1.038 1.037 1.038 1.038 1.038 1.038 1.037
## [210] 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037
## [221] 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.037 1.036 1.036
## [232] 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036
## [243] 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036
## [254] 1.036 1.036 1.036 1.036 1.037 1.037 1.037 1.036 1.036 1.036 1.036
## [265] 1.037 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036
## [276] 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036
## [287] 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036 1.036
## [298] 1.036 1.036 1.036

result <- kNNImpute(as.matrix(data), k = 191)

## [1] "Computing distance matrix..."
## [1] "Distance matrix complete"
## [1] "Imputing row 82"
## [1] "Imputing row 92"
## [1] "Imputing row 142"
## [1] "Imputing row 234"
## [1] "Imputing row 344"
## [1] "Imputing row 375"
## [1] "Imputing row 385"
## [1] "Imputing row 528"
## [1] "Imputing row 531"

imputed <- result$x

Exercise 3

Do the same for imputation using singular value decomposition with cv.SVDImpute and SVDImpute.

cv.SVDImpute(as.matrix(data))

## $k
## [1] 2
## 
## $rmse
## [1] 1.904
## 
## $k.full
## [1] 1 2 3 4
## 
## $rmse.full
## [1] 2.825 1.904 1.988 2.133
## 
## $nrmse.full
## [1] 1.0036 0.8237 0.8649 0.9558

result.svd <- SVDImpute(as.matrix(data), k = 2)

## [1] "Running iteration 1"
## [1] "Running iteration 2"
## [1] "Running iteration 3"
## [1] "Running iteration 4"
## [1] "Running iteration 5"
## [1] "Running iteration 6"
## [1] "Running iteration 7"
## [1] "Running iteration 8"
## [1] "Running iteration 9"
## [1] "Running iteration 10"

imputed.svd <- result.svd$x

Exercise 4

Compare the covariance matrices for the kNN- and SVD-imputed data. Interpret the differences.

cov(imputed)

##           M37   S227   S237  S238    S249    S259   S268    S278
## M37  10.90630 0.2467 0.2217 0.103 -0.1574 0.04039 0.1269 0.07895
## S227  0.24673 4.0995 1.6700 1.845  1.3634 1.59811 1.6824 1.61581
## S237  0.22169 1.6700 3.5897 1.778  1.3332 1.46136 1.3452 1.20798
## S238  0.10298 1.8454 1.7783 3.798  1.7008 1.41416 2.6172 2.29460
## S249 -0.15735 1.3634 1.3332 1.701  3.1349 1.21478 1.5423 1.33469
## S259  0.04039 1.5981 1.4614 1.414  1.2148 3.64021 1.2106 1.13992
## S268  0.12692 1.6824 1.3452 2.617  1.5423 1.21063 3.5740 2.23259
## S278  0.07895 1.6158 1.2080 2.295  1.3347 1.13992 2.2326 3.32887

image(cov(imputed))

plot of chunk unnamed-chunk-6

cov(imputed.svd)

##           M37   S227   S237   S238    S249    S259   S268    S278
## M37  10.90630 0.2479 0.2249 0.1004 -0.1586 0.04344 0.1242 0.07659
## S227  0.24788 4.0995 1.6700 1.8453  1.3634 1.59809 1.6824 1.61573
## S237  0.22488 1.6700 3.5897 1.7781  1.3331 1.46140 1.3451 1.20786
## S238  0.10039 1.8453 1.7781 3.7984  1.7008 1.41399 2.6171 2.29458
## S249 -0.15859 1.3634 1.3331 1.7008  3.1348 1.21475 1.5422 1.33466
## S259  0.04344 1.5981 1.4614 1.4140  1.2148 3.64018 1.2105 1.13983
## S268  0.12415 1.6824 1.3451 2.6171  1.5422 1.21054 3.5739 2.23255
## S278  0.07659 1.6157 1.2079 2.2946  1.3347 1.13983 2.2325 3.32883

image(cov(imputed.svd))

plot of chunk unnamed-chunk-7

cov(imputed) - cov(imputed.svd)

##            M37       S227       S237      S238      S249       S259
## M37   0.000000 -1.151e-03 -3.195e-03 2.587e-03 1.235e-03 -3.047e-03
## S227 -0.001151  9.325e-05  2.303e-06 1.107e-04 5.338e-05  1.442e-05
## S237 -0.003195  2.303e-06 -4.493e-05 2.128e-04 7.733e-05 -4.315e-05
## S238  0.002587  1.107e-04  2.128e-04 5.002e-05 9.625e-06  1.617e-04
## S249  0.001235  5.338e-05  7.733e-05 9.625e-06 9.004e-05  2.454e-05
## S259 -0.003047  1.442e-05 -4.315e-05 1.617e-04 2.454e-05  2.814e-05
## S268  0.002769  8.539e-05  1.278e-04 4.283e-05 2.353e-05  8.949e-05
## S278  0.002358  8.343e-05  1.223e-04 2.555e-05 3.092e-05  8.491e-05
##           S268      S278
## M37  2.769e-03 2.358e-03
## S227 8.539e-05 8.343e-05
## S237 1.278e-04 1.223e-04
## S238 4.283e-05 2.555e-05
## S249 2.353e-05 3.092e-05
## S259 8.949e-05 8.491e-05
## S268 1.022e-04 4.254e-05
## S278 4.254e-05 3.798e-05

image(cov(imputed) - cov(imputed.svd))

plot of chunk unnamed-chunk-8

We see that the covariance is very different between age and (some of) the other variables. This is expected: all of the rows with any NA values have NA in all columns except for gender. That means that everything is filled in similar to existing data, so with similar inter-variable correlations.

The difference between the methods is which other “data points” they choose those variables from; specifically, what ages those points have exactly. So the covariance will mostly be different between the age and the filled in group.

Exercise 5

Estimate the covariance matrix from the raw data matrix using full-information maximum-likelihood.

require(OpenMx)

selVars <- c(Age87, Neuroticism87)
testData <- data

ssModel <- mxModel("Estimation of Gaussian sufficient statistics ", mxMatrix(type = "Full", 
    nrow = 1, ncol = length(selVars), free = TRUE, values = rep(0, length(selVars)), 
    name = "expMean"), mxMatrix(type = "Lower", nrow = length(selVars), ncol = length(selVars), 
    free = TRUE, values = 0.5, name = "Chol"), mxAlgebra(expression = Chol %*% 
    t(Chol), name = "expCov", ), mxData(observed = testData, type = "raw"), 
    mxFIMLObjective(covariance = "expCov", means = "expMean", dimnames = selVars))
(covarianceMatrix <- mxRun(ssModel)$expCov@result)

## Running Estimation of Gaussian sufficient statistics

##         [,1]   [,2]   [,3]   [,4]    [,5]   [,6]   [,7]   [,8]
## [1,] 10.9001 0.2479 0.2226 0.1033 -0.1592 0.0416 0.1264 0.0784
## [2,]  0.2479 4.1180 1.6774 1.8537  1.3695 1.6053 1.6900 1.6231
## [3,]  0.2226 1.6774 3.6058 1.7863  1.3392 1.4679 1.3512 1.2134
## [4,]  0.1033 1.8537 1.7863 3.8155  1.7084 1.4205 2.6289 2.3049
## [5,] -0.1592 1.3695 1.3392 1.7084  3.1490 1.2203 1.5492 1.3407
## [6,]  0.0416 1.6053 1.4679 1.4205  1.2203 3.6566 1.2161 1.1450
## [7,]  0.1264 1.6900 1.3512 2.6289  1.5492 1.2161 3.5900 2.2426
## [8,]  0.0784 1.6231 1.2134 2.3049  1.3407 1.1450 2.2426 3.3439

image(covarianceMatrix)

plot of chunk unnamed-chunk-10

Comparisons:

cov(imputed) - covarianceMatrix

##             M37      S227       S237       S238      S249      S259
## M37   0.0061725 -0.001176 -0.0009546 -0.0002851  0.001828 -0.001213
## S227 -0.0011759 -0.018446 -0.0074772 -0.0082812 -0.006107 -0.007159
## S237 -0.0009546 -0.007477 -0.0160999 -0.0079520 -0.005956 -0.006540
## S238 -0.0002851 -0.008281 -0.0079520 -0.0170458 -0.007678 -0.006299
## S249  0.0018281 -0.006107 -0.0059560 -0.0076778 -0.014105 -0.005476
## S259 -0.0012131 -0.007159 -0.0065399 -0.0062989 -0.005476 -0.016363
## S268  0.0005400 -0.007556 -0.0060356 -0.0117300 -0.006954 -0.005421
## S278  0.0005441 -0.007254 -0.0054171 -0.0103052 -0.005999 -0.005109
##           S268       S278
## M37   0.000540  0.0005441
## S227 -0.007556 -0.0072535
## S237 -0.006036 -0.0054171
## S238 -0.011730 -0.0103052
## S249 -0.006954 -0.0059989
## S259 -0.005421 -0.0051092
## S268 -0.016034 -0.0100200
## S278 -0.010020 -0.0149985

image(cov(imputed) - covarianceMatrix)

plot of chunk unnamed-chunk-11

cov(imputed.svd) - covarianceMatrix

##             M37       S227      S237      S238       S249      S259
## M37   6.173e-03 -2.482e-05  0.002240 -0.002873  0.0005931  0.001834
## S227 -2.482e-05 -1.854e-02 -0.007479 -0.008392 -0.0061606 -0.007174
## S237  2.240e-03 -7.479e-03 -0.016055 -0.008165 -0.0060334 -0.006497
## S238 -2.873e-03 -8.392e-03 -0.008165 -0.017096 -0.0076875 -0.006461
## S249  5.931e-04 -6.161e-03 -0.006033 -0.007687 -0.0141945 -0.005501
## S259  1.834e-03 -7.174e-03 -0.006497 -0.006461 -0.0055010 -0.016391
## S268 -2.229e-03 -7.642e-03 -0.006163 -0.011773 -0.0069779 -0.005511
## S278 -1.814e-03 -7.337e-03 -0.005539 -0.010331 -0.0060299 -0.005194
##           S268      S278
## M37  -0.002229 -0.001814
## S227 -0.007642 -0.007337
## S237 -0.006163 -0.005539
## S238 -0.011773 -0.010331
## S249 -0.006978 -0.006030
## S259 -0.005511 -0.005194
## S268 -0.016136 -0.010063
## S278 -0.010063 -0.015037

image(cov(imputed.svd) - covarianceMatrix)

plot of chunk unnamed-chunk-12

Since this method does not take a linear combination of other rows to fill our unknown rows, the within-variable covariances (i.e. everything but age) are now also changed in a variety of ways.

Exercise 6: Polychoric correlations

Compute both the pmc correlations and the polychoric correla- tions. Use package polycor. The function hetcor computes the appropriate correlation depending on the class of the variable. [..] Where do differences occur?

require(polycor)

## Loading required package: polycor Loading required package: mvtnorm
## Loading required package: sfsmisc

new.data <- na.omit(data)
xo <- lapply(new.data, ordered)
pcor <- hetcor(xo, ML = FALSE, std.err = TRUE, bins = 4, pd = TRUE)
pmc <- cor(as.matrix(new.data))

require(psych)

## Loading required package: psych
## 
## Attaching package: 'psych'
## 
## The following object is masked from 'package:polycor':
## 
## polyserial
## 
## The following object is masked from 'package:OpenMx':
## 
## tr

cor.plot(pcor)

plot of chunk unnamed-chunk-13

cor.plot(pmc)

plot of chunk unnamed-chunk-13

cor.plot(pmc - pcor$correlations)

plot of chunk unnamed-chunk-13

Differences occur everywhere but in the correlations with age.

Exercise 7

Plot the pmc against the polychoric correlations and discuss the result.

plot(c(pmc), c(pcor$correlations))

plot of chunk unnamed-chunk-14

We see that the polychoric correlations are generally slightly higher. This is likely because the underlying structure is captured with that method, so the analysis is more “precise”, getting closer to the true correlations.

Exercise 8

Make a 3D plot of the pmc correlations, say y, against the means of the variables, say x 1 and x 2 . Is there an interpretable pattern?

y <- c(pmc)
means <- lapply(new.data, mean)
x1 <- means[rep(rownames(pcor$correlations), times = 8, each = 1)]
x2 <- means[rep(rownames(pcor$correlations), times = 1, each = 8)]

scatterplot3d(x1, x2, y, pch = 20, type = "h")

## Error: could not find function "scatterplot3d"

There's a pattern: when both means are similar (i.e. high or low), the correlation is positive, and when the means are dissimilar, the correlation is close to 0. But not negative! That is likely because the variables attempt to measure the same latent variable.

Exercise 9

Make a 3D plot of the polychoric correlations, say y, against the means of the variables, say x 1 and x 2 . Is there an interpretable pattern?

require(scatterplot3d)

## Loading required package: scatterplot3d

y <- c(pcor$correlations)
means <- lapply(new.data, mean)
x1 <- means[rep(rownames(pcor$correlations), times = 8, each = 1)]
x2 <- means[rep(rownames(pcor$correlations), times = 1, each = 8)]

scatterplot3d(x1, x2, y, pch = 20, type = "h")

plot of chunk unnamed-chunk-16

The pattern is the same as in exercise 8.

Exercise 10

Fit a sem model assuming that the items are indicators of a single factor and the factor regresses on the background variable. Use both pmcs en polychorics. Compare goodness of fit and parameterestimates. Is there a difference? If yes, what?

PMC

require(semPlot)

## Loading required package: semPlot This is semPlot 0.3.3 semPlot is BETA
## software! Please report any bugs.

newModel2 <- mxModel("Model ", 
                    type="RAM",
                   mxData(observed=pmc, type="cor", numObs=1766),
                   manifestVars = c(Age87, Neuroticism87),
                   latentVars = "Factor",
                   # residual variances
                   mxPath(from=c(Age87, Neuroticism87),
                         arrows=2,
                         free=TRUE,
                         values=c(1,1,1,1,1,1,1,1),
                         labels=c("e0","e1","e2","e3","e4","e5","e6","e7")
                   ),
                                     # latent variance
                  mxPath(from="Factor",
                         arrows=2,
                         free=FALSE,
                         values=1,
                         labels ="varFactor"
                  ),
                  # factor loadings
                  mxPath(from="Factor",
                         to=Neuroticism87,
                         arrows=1,
                         free=c(TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE),
                         values=c(1,1,1,1,1,1,1),
                         labels =c("l1","l2","l3","l4","l5","l6","l7")
                  ),

                  # factor loadings
                  mxPath(to="Factor",
                         from=Age87,
                         arrows=1,
                         free=c(TRUE),
                         values=c(1),
                         labels =c("l0")
                  )
          )
fit2 <- mxRun(newModel2)

## Running Model

semPaths(newModel2)

plot of chunk unnamed-chunk-17

summary(fit2)

## data:
## $`Model .data`
## $`Model .data`$cor
##            M37   S227    S237    S238     S249     S259    S268    S278
## M37   1.000000 0.0370 0.03551 0.01602 -0.02715 0.006596 0.02021 0.01299
## S227  0.036995 1.0000 0.43532 0.46766  0.38032 0.413687 0.43954 0.43739
## S237  0.035506 0.4353 1.00000 0.48159  0.39742 0.404259 0.37556 0.34945
## S238  0.016016 0.4677 0.48159 1.00000  0.49287 0.380294 0.71032 0.64529
## S249 -0.027152 0.3803 0.39742 0.49287  1.00000 0.359605 0.46076 0.41316
## S259  0.006596 0.4137 0.40426 0.38029  0.35961 1.000000 0.33564 0.32746
## S268  0.020206 0.4395 0.37556 0.71032  0.46076 0.335639 1.00000 0.64727
## S278  0.012995 0.4374 0.34945 0.64529  0.41316 0.327460 0.64727 1.00000
## 
## 
## free parameters:
##    name matrix    row    col Estimate Std.Error Std.Estimate  Std.SE
## 1    l0      A Factor    M37  0.02096   0.02531      0.02096 0.02531
## 2    l1      A   S227 Factor  0.58823   0.02298      0.58836 0.02298
## 3    l2      A   S237 Factor  0.55319   0.02328      0.55331 0.02329
## 4    l3      A   S238 Factor  0.85374   0.02003      0.85393 0.02004
## 5    l4      A   S249 Factor  0.59139   0.02289      0.59152 0.02289
## 6    l5      A   S259 Factor  0.48365   0.02381      0.48375 0.02382
## 7    l6      A   S268 Factor  0.80680   0.02059      0.80698 0.02059
## 8    l7      A   S278 Factor  0.75088   0.02125      0.75104 0.02125
## 9    e0      S    M37    M37  1.00000   0.03366      1.00000 0.03366
## 10   e1      S   S227   S227  0.65383   0.02382      0.65383 0.02382
## 11   e2      S   S237   S237  0.69385   0.02494      0.69385 0.02494
## 12   e3      S   S238   S238  0.27081   0.01429      0.27081 0.01429
## 13   e4      S   S249   S249  0.65011   0.02362      0.65011 0.02362
## 14   e5      S   S259   S259  0.76598   0.02697      0.76598 0.02697
## 15   e6      S   S268   S268  0.34879   0.01573      0.34879 0.01573
## 16   e7      S   S278   S278  0.43593   0.01780      0.43593 0.01780
##    lbound ubound
## 1               
## 2               
## 3               
## 4               
## 5               
## 6               
## 7               
## 8               
## 9               
## 10              
## 11              
## 12              
## 13              
## 14              
## 15              
## 16              
## 
## observed statistics:  28 
## estimated parameters:  16 
## degrees of freedom:  12 
## -2 log likelihood:  9675 
## saturated -2 log likelihood:  9352 
## number of observations:  1766 
## chi-square:  323.6 
## p:  5.167e-62 
## Information Criteria: 
##      df Penalty Parameters Penalty Sample-Size Adjusted
## AIC:         NA                 NA                   NA
## BIC:         NA                 NA                   NA
## CFI: 0.936 
## TLI: 0.9103 
## RMSEA:  0.09271 
## timestamp: 2013-10-14 21:29:17 
## frontend time: 0.03783 secs 
## backend time: 0.003153 secs 
## independent submodels time: 1.86e-05 secs 
## wall clock time: 0.041 secs 
## cpu time: 0.041 secs 
## openmx version number: 1.3.2-2301

Polychoric

require(semPlot)
newModel2 <- mxModel("Model ", 
                    type="RAM",
                   mxData(observed=pcor$correlations, type="cor", numObs=1766),
                   manifestVars = c(Age87, Neuroticism87),
                   latentVars = "Factor",
                   # residual variances
                   mxPath(from=c(Age87, Neuroticism87),
                         arrows=2,
                         free=TRUE,
                         values=c(1,1,1,1,1,1,1,1),
                         labels=c("e0","e1","e2","e3","e4","e5","e6","e7")
                   ),
                                     # latent variance
                  mxPath(from="Factor",
                         arrows=2,
                         free=FALSE,
                         values=1,
                         labels ="varFactor"
                  ),
                  # factor loadings
                  mxPath(from="Factor",
                         to=Neuroticism87,
                         arrows=1,
                         free=c(TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE),
                         values=c(1,1,1,1,1,1,1),
                         labels =c("l1","l2","l3","l4","l5","l6","l7")
                  ),

                  # factor loadings
                  mxPath(to="Factor",
                         from=Age87,
                         arrows=1,
                         free=c(TRUE),
                         values=c(1),
                         labels =c("l0")
                  )
          )
fit2 <- mxRun(newModel2)

## Running Model

# semPaths(newModel2)
summary(fit2)

## data:
## $`Model .data`
## $`Model .data`$cor
##           M37    S227    S237    S238     S249    S259    S268    S278
## M37   1.00000 0.04584 0.04312 0.02168 -0.02928 0.01932 0.01764 0.01945
## S227  0.04584 1.00000 0.47827 0.50868  0.42439 0.46035 0.46959 0.48045
## S237  0.04312 0.47827 1.00000 0.52727  0.43426 0.43935 0.40501 0.38553
## S238  0.02168 0.50868 0.52727 1.00000  0.53631 0.41918 0.74295 0.69461
## S249 -0.02928 0.42439 0.43426 0.53631  1.00000 0.39673 0.49899 0.46397
## S259  0.01932 0.46035 0.43935 0.41918  0.39673 1.00000 0.36481 0.36981
## S268  0.01764 0.46959 0.40501 0.74295  0.49899 0.36481 1.00000 0.68830
## S278  0.01945 0.48045 0.38553 0.69461  0.46397 0.36981 0.68830 1.00000
## 
## 
## free parameters:
##    name matrix    row    col Estimate Std.Error Std.Estimate  Std.SE
## 1    l0      A Factor    M37  0.02555   0.02507      0.02554 0.02506
## 2    l1      A   S227 Factor  0.61786   0.02248      0.61806 0.02249
## 3    l2      A   S237 Factor  0.58349   0.02279      0.58368 0.02279
## 4    l3      A   S238 Factor  0.87863   0.01944      0.87891 0.01944
## 5    l4      A   S249 Factor  0.62555   0.02235      0.62575 0.02236
## 6    l5      A   S259 Factor  0.51380   0.02335      0.51397 0.02336
## 7    l6      A   S268 Factor  0.82249   0.02015      0.82276 0.02015
## 8    l7      A   S278 Factor  0.78546   0.02061      0.78572 0.02062
## 9    e0      S    M37    M37  1.00000   0.03366      1.00000 0.03366
## 10   e1      S   S227   S227  0.61800   0.02248      0.61800 0.02248
## 11   e2      S   S237   S237  0.65931   0.02367      0.65931 0.02367
## 12   e3      S   S238   S238  0.22751   0.01234      0.22751 0.01234
## 13   e4      S   S249   S249  0.60843   0.02210      0.60843 0.02210
## 14   e5      S   S259   S259  0.73584   0.02589      0.73584 0.02589
## 15   e6      S   S268   S268  0.32306   0.01424      0.32306 0.01424
## 16   e7      S   S278   S278  0.38265   0.01576      0.38265 0.01576
##    lbound ubound
## 1               
## 2               
## 3               
## 4               
## 5               
## 6               
## 7               
## 8               
## 9               
## 10              
## 11              
## 12              
## 13              
## 14              
## 15              
## 16              
## 
## observed statistics:  28 
## estimated parameters:  16 
## degrees of freedom:  12 
## -2 log likelihood:  8919 
## saturated -2 log likelihood:  8513 
## number of observations:  1766 
## chi-square:  406 
## p:  2.03e-79 
## Information Criteria: 
##      df Penalty Parameters Penalty Sample-Size Adjusted
## AIC:         NA                 NA                   NA
## BIC:         NA                 NA                   NA
## CFI: 0.9308 
## TLI: 0.9031 
## RMSEA:  0.1045 
## timestamp: 2013-10-14 21:29:18 
## frontend time: 0.03482 secs 
## backend time: 0.003179 secs 
## independent submodels time: 2.146e-05 secs 
## wall clock time: 0.03802 secs 
## cpu time: 0.03802 secs 
## openmx version number: 1.3.2-2301

Comparison

We see that the goodness of fit is better for the polychoric model than the pmc model. This is expected, for the same reason we observed that the polychoric correlations are higher than the pmc correlations: they better fit the underlying data type.

The parameter estimates are slightly different. Focusing on the one for age, we see that for the PMC model it is 0.02096, and for the polychoric model it is 0.02555 (both have similar SE's). Same reason as above: better fit, so stronger coefficients. Neither of the regression coefficients is very significant, so age is not a good predictor for neuroticism.