Need to set up mplus as an engine for knitr. I am hacking my way through this:
Code chunks in my Rmd
You might not want the Stata line
knitr::opts_chunk$set(engine.path = list(
stata = "/Applications/Stata/StataMP.app/Contents/MacOS/stata-mp" ,
mplus = "/Applications/Mplus/mplus"
))
knitr::knit_engines$set(mplus = function(options) {
code <- paste(options$code, collapse = "\n")
fileConn<-file("formplus.inp")
writeLines(code, fileConn)
close(fileConn)
out <- system2("/Applications/Mplus/mplus", "formplus.inp")
fileConnOutput <- file("formplus.out")
mplusOutput <- readLines(fileConnOutput)
knitr::engine_output(options, code, mplusOutput)
})
MplusAutomation to prepare dataFirst get some data
#install.packages("palmerpenguins")
library(palmerpenguins)
df <- penguins
str(df)
## tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
## $ species : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ island : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
## $ bill_length_mm : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
## $ bill_depth_mm : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
## $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
## $ body_mass_g : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
## $ sex : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
## $ year : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
df <- df[,c("body_mass_g","flipper_length_mm")]
str(df)
## tibble [344 × 2] (S3: tbl_df/tbl/data.frame)
## $ body_mass_g : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
## $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
Now load up MplusAutomation package
#install.packages("MplusAutomation")
library(MplusAutomation)
## Version: 0.8
## We work hard to write this free software. Please help us get credit by citing:
##
## Hallquist, M. N. & Wiley, J. F. (2018). MplusAutomation: An R Package for Facilitating Large-Scale Latent Variable Analyses in Mplus. Structural Equation Modeling, 25, 621-638. doi: 10.1080/10705511.2017.1402334.
##
## -- see citation("MplusAutomation").
MplusAutomation::prepareMplusData(df,"pp.dat",interactive="FALSE")
## The file(s)
## 'pp.dat'
## currently exist(s) and will be overwritten
## TITLE: Your title goes here
## DATA: FILE = "pp.dat";
## VARIABLE:
## NAMES = body_mass_g flipper_length_mm;
## MISSING=.;
Here’s a picture of what the next chunk of my rmd file looks like
Figure 1. Isn’t it amazing
Now specify and run a simple Mplus model in mplus/Rmarkdown
About variable names. The prepared pp.dat file is just data, no variable names just like Mplus wants it. Programmer has to keep track of the order, and that is why it is so nice that MplusAutomation::prepareMplusData produces that shell of an Mplus input file. But importantly, Mplus will accept data with >8 characters but will only report the first 8 in output. It is therefore better to use short variable names when working with Mplus. I am just going to use different names in the code below that are short. Mplus will also make these all uppercase in output.
TITLE: Palmer penguins body mass and flipper length
DATA: FILE = pp.dat;
VARIABLE: NAMES = weight flipper ;
MISSING = . ;
MODEL: weight on flipper ;
## Mplus VERSION 8.5 (Mac)
## MUTHEN & MUTHEN
## 02/26/2021 11:15 AM
##
## INPUT INSTRUCTIONS
##
## TITLE: Palmer penguins body mass and flipper length
## DATA: FILE = pp.dat;
## VARIABLE: NAMES = weight flipper ;
## MISSING = . ;
## MODEL: weight on flipper ;
##
##
##
## *** WARNING
## Data set contains cases with missing on all variables.
## These cases were not included in the analysis.
## Number of cases with missing on all variables: 2
## 1 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS
##
##
##
## Palmer penguins body mass and flipper length
##
## SUMMARY OF ANALYSIS
##
## Number of groups 1
## Number of observations 342
##
## Number of dependent variables 1
## Number of independent variables 1
## Number of continuous latent variables 0
##
## Observed dependent variables
##
## Continuous
## WEIGHT
##
## Observed independent variables
## FLIPPER
##
##
## Estimator ML
## Information matrix OBSERVED
## Maximum number of iterations 1000
## Convergence criterion 0.500D-04
## Maximum number of steepest descent iterations 20
## Maximum number of iterations for H1 2000
## Convergence criterion for H1 0.100D-03
##
## Input data file(s)
## pp.dat
##
## Input data format FREE
##
##
## SUMMARY OF DATA
##
## Number of missing data patterns 1
##
##
## COVARIANCE COVERAGE OF DATA
##
## Minimum covariance coverage value 0.100
##
##
## PROPORTION OF DATA PRESENT
##
##
## Covariance Coverage
## WEIGHT FLIPPER
## ________ ________
## WEIGHT 1.000
## FLIPPER 1.000 1.000
##
##
##
## UNIVARIATE SAMPLE STATISTICS
##
##
## UNIVARIATE HIGHER-ORDER MOMENT DESCRIPTIVE STATISTICS
##
## Variable/ Mean/ Skewness/ Minimum/ % with Percentiles
## Sample Size Variance Kurtosis Maximum Min/Max 20%/60% 40%/80% Median
##
## WEIGHT 4201.754 0.468 2700.000 0.29% 3450.000 3800.000 4050.000
## 342.000 641250.577 -0.726 6300.000 0.29% 4300.000 4950.000
## FLIPPER 200.915 0.344 172.000 0.29% 188.000 194.000 197.000
## 342.000 197.154 -0.987 231.000 0.29% 203.000 215.000
##
##
## THE MODEL ESTIMATION TERMINATED NORMALLY
##
##
##
## MODEL FIT INFORMATION
##
## Number of Free Parameters 3
##
## Loglikelihood
##
## H0 Value -2528.427
## H1 Value -2528.427
##
## Information Criteria
##
## Akaike (AIC) 5062.855
## Bayesian (BIC) 5074.359
## Sample-Size Adjusted BIC 5064.843
## (n* = (n + 2) / 24)
##
## Chi-Square Test of Model Fit
##
## Value 0.000
## Degrees of Freedom 0
## P-Value 0.0000
##
## RMSEA (Root Mean Square Error Of Approximation)
##
## Estimate 0.000
## 90 Percent C.I. 0.000 0.000
## Probability RMSEA <= .05 0.000
##
## CFI/TLI
##
## CFI 1.000
## TLI 1.000
##
## Chi-Square Test of Model Fit for the Baseline Model
##
## Value 486.641
## Degrees of Freedom 1
## P-Value 0.0000
##
## SRMR (Standardized Root Mean Square Residual)
##
## Value 0.000
##
##
##
## MODEL RESULTS
##
## Two-Tailed
## Estimate S.E. Est./S.E. P-Value
##
## WEIGHT ON
## FLIPPER 49.685 1.514 32.818 0.000
##
## Intercepts
## WEIGHT -5780.780 304.917 -18.959 0.000
##
## Residual Variances
## WEIGHT ********* 11818.094 13.077 0.000
##
##
## QUALITY OF NUMERICAL RESULTS
##
## Condition Number for the Information Matrix 0.579E-03
## (ratio of smallest to largest eigenvalue)
##
##
## Beginning Time: 11:15:48
## Ending Time: 11:15:48
## Elapsed Time: 00:00:00
##
##
##
## MUTHEN & MUTHEN
## 3463 Stoner Ave.
## Los Angeles, CA 90066
##
## Tel: (310) 391-9971
## Fax: (310) 391-8971
## Web: www.StatModel.com
## Support: Support@StatModel.com
##
## Copyright (c) 1998-2020 Muthen & Muthen
Check model in R
model <- glm(body_mass_g ~ flipper_length_mm , data=df)
summary(model)
##
## Call:
## glm(formula = body_mass_g ~ flipper_length_mm, data = df)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1058.80 -259.27 -26.88 247.33 1288.69
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5780.831 305.815 -18.90 <2e-16 ***
## flipper_length_mm 49.686 1.518 32.72 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 155455.3)
##
## Null deviance: 219307697 on 341 degrees of freedom
## Residual deviance: 52854796 on 340 degrees of freedom
## (2 observations deleted due to missingness)
## AIC: 5062.9
##
## Number of Fisher Scoring iterations: 2
Kind of interesting they are not exactly the same.
Now I use the MplusAutomation::readModels to get useful information from the output back to R
penguinsResults <- MplusAutomation::readModels(target="formplus.out")
## Reading model: formplus.out
summary(penguinsResults)
## Palmer penguins body mass and flipper lengthEstimated using ML
## Number of obs: 342, number of (free) parameters: 3
##
## Model: Chi2(df = 0) = 0, p = 0
## Baseline model: Chi2(df = 1) = 486.641, p = 0
##
## Fit Indices:
##
## CFI = 1, TLI = 1, SRMR = 0
## RMSEA = 0, 90% CI [0, 0], p < .05 = 0
## AIC = 5062.855, BIC = 5074.359
print(penguinsResults$parameters)
## $unstandardized
## paramHeader param est se est_se pval
## 1 WEIGHT.ON FLIPPER 49.685 1.514 32.818 0
## 2 Intercepts WEIGHT -5780.78 304.917 -18.959 0
## 3 Residual.Variances WEIGHT ********* 11818.094 13.077 0