Overview
- Link to a Jamovi file that runs the analyses for the data. It should execute on your computer “as is.” When you open it, you will need to enable the modules, because it uses the lavaan syntax module, which calls R for some of the more specialized models presented in Chapter 3.
You’ll need to click “download” to get the file, because some browsers (like Chrome) seem to treat it as a zipped file.
Galton.csv.omv file showing analyses
- Jamovi GUI (point-and-click) step-by-step for each task, for doing this on your own
- Reproducible R code (from ChatGPT) using Jamovi’s R engine (jmv) and lavaan/semPlot.
This code also illustrates how you can run jamovi modules from within R.
Data expected:
Galton.csv with variables named childHeight and midparentHeight.
If you don’t see a menu mentioned below, click the Modules (⊕) button at the top-right of Jamovi, open jamovi library, and install the needed module (e.g., SEM (lavaan), optionally scatr for enhanced scatterplots).
childHeight and midparentHeight.
Option A (built-in via Correlation Matrix):
1. Go to Regression → Correlation Matrix.
2. Move childHeight and midparentHeight into Variables.
3. Under Options → Plots, check Scatter plot and Densities. This produces a scatter matrix with density plots on the diagonal, a convenient way to show “marginal distributions.”
4. (Optional) Under Statistics, add N, p-values, Confidence intervals.
Option B (plugin for a single scatter with marginals):
1. Install scatr (if not already): Modules (⊕) → jamovi library → scatr → Install.
2. Go to Exploration → Scatterplot (from the scatr module).
3. Set X = midparentHeight, Y = childHeight.
4. Enable options such as Smoother, Marginal densities (or Marginal histograms), and Show correlation if available.
Install/enable the module: Modules (⊕) → jamovi library → SEM (lavaan) → Install (if not present).
Path analysis via SEM (lavaan) module:
1. Go to SEM → Structural Equation Model (SEM) (module names may vary slightly by version).
2. In the Model or Syntax panel, enter the model line: childHeight ~ midparentHeight
3. Under Options/Estimates, check Standardized solution, R-squared, Fit measures.
4. Under Plots (if available), enable Path diagram and choose whether to show Standardized or Unstandardized edge labels.
5. Click Run (if required). The Model Results table and Path diagram will appear.
6. To switch between standardized and unstandardized coefficients in the diagram, toggle the corresponding option and rerun.
These mirror the Jamovi GUI analyses using jmv and lavaan/semPlot.
library(jmv) # jamovi's R engine
library(ggplot2)
library(ggExtra) # for marginal distributions on scatterplots
library(dplyr)
library(tidyr)
library(lavaan) # jamovi SEM module is lavaan-based
library(semPlot) # path diagram
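The library() calls above stop with an error if any package is not installed. A minimal sketch of a pre-flight check, using only base R, is shown here; the variable names (pkgs, is_installed, missing_pkgs) are illustrative and not part of the original code.

```r
# Packages loaded in the setup code above
pkgs <- c("jmv", "ggplot2", "ggExtra", "dplyr", "tidyr", "lavaan", "semPlot")

# requireNamespace() returns FALSE (instead of erroring) when a package is absent
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  message("Install with: install.packages(c(",
          paste0('"', missing_pkgs, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```

Running this once before knitting makes a missing dependency show up as a readable message rather than a failed chunk.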
The file path below assumes Galton.csv sits in the same folder as the R Markdown file. Change it to your local path when running on your machine.
# Treat the Rmd's folder as the working directory when knitting
if (!is.null(knitr::current_input())) {
  knitr::opts_knit$set(root.dir = dirname(knitr::current_input()))
}
data_path <- "Galton.csv" # put the CSV alongside this Rmd
stopifnot(file.exists(data_path))
dat <- read.csv(data_path, header = TRUE, check.names = TRUE, stringsAsFactors = FALSE)
# Expected variable names in camelCase:
vars_needed <- c("childHeight", "midparentHeight")
stopifnot(all(vars_needed %in% names(dat)))
# Keep only what we need
df <- dat[, vars_needed, drop = FALSE]
df[] <- lapply(df, function(x) as.numeric(x))
# Self-checks
cat("Data rows: ", nrow(df), "\n")
## Data rows: 934
cat("Column names: ", paste(names(df), collapse = ", "), "\n")
## Column names: childHeight, midparentHeight
cat("Any NA in childHeight? ", any(is.na(df$childHeight)), "\n")
## Any NA in childHeight? FALSE
cat("Any NA in midparentHeight? ", any(is.na(df$midparentHeight)), "\n")
## Any NA in midparentHeight? FALSE
str(df)
## 'data.frame': 934 obs. of 2 variables:
## $ childHeight : num 73.2 69.2 69 69 73.5 72.5 65.5 65.5 71 68 ...
## $ midparentHeight: num 75.4 75.4 75.4 75.4 73.7 ...
summary(df)
## childHeight midparentHeight
## Min. :56.00 Min. :64.40
## 1st Qu.:64.00 1st Qu.:68.14
## Median :66.50 Median :69.25
## Mean :66.75 Mean :69.21
## 3rd Qu.:69.70 3rd Qu.:70.14
## Max. :79.00 Max. :75.43
p_scatter <- ggplot(df, aes(x = midparentHeight, y = childHeight)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", se = TRUE) +
  labs(x = "Mid-Parent Height", y = "Child Height",
       title = "Child vs Mid-Parent Height") +
  theme_minimal()
# Add text annotation for r
r_val <- cor(df$midparentHeight, df$childHeight, use = "pairwise.complete.obs")
p_scatter <- p_scatter + annotate("text",
  x = Inf, y = -Inf, hjust = 1.05, vjust = -0.8,
  label = paste0("r = ", round(r_val, 3)))
ggMarginal(p_scatter, type = "density", margins = "both")
Scatterplot of childHeight vs midparentHeight with marginal densities.
(jmv::descriptives in R)
desc <- jmv::descriptives(
  data = df,
  vars = c("childHeight", "midparentHeight"),
  n = TRUE, mean = TRUE, median = TRUE, sd = TRUE, se = TRUE,
  min = TRUE, max = TRUE, skew = TRUE, kurt = TRUE,
  pcEqGr = TRUE, # logical switch for equal-group cut points; pass TRUE, not the number of groups
  hist = TRUE, dens = TRUE, box = TRUE, qq = TRUE
)
desc
##
## DESCRIPTIVES
##
## Descriptives
## ─────────────────────────────────────────────────────────
## childHeight midparentHeight
## ─────────────────────────────────────────────────────────
## N 934 934
## Missing 0 0
## Mean 66.74593 69.20677
## Std. error mean 0.1171167 0.05897536
## Median 66.50000 69.24800
## Standard deviation 3.579251 1.802370
## Minimum 56.00000 64.40000
## Maximum 79.00000 75.43000
## Skewness 0.08486597 0.1147869
## Std. error skewness 0.08002143 0.08002143
## Kurtosis -0.4919851 0.5788043
## Std. error kurtosis 0.1598731 0.1598731
## 25th percentile 64.00000 68.14000
## 50th percentile 66.50000 69.24800
## 75th percentile 69.70000 70.14000
## ─────────────────────────────────────────────────────────
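The standard-error rows in the table above can be checked by hand. The sketch below applies the usual textbook formulas for the standard error of the mean, skewness, and kurtosis to the childHeight column. That jmv uses exactly these formulas is an assumption, although the results agree with the tabled values.

```r
# Values copied from the Descriptives table above (childHeight)
n    <- 934
sd_y <- 3.579251

# Standard error of the mean
se_mean <- sd_y / sqrt(n)

# Standard error of skewness (depends only on n)
se_skew <- sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

# Standard error of kurtosis (depends only on n)
se_kurt <- 2 * se_skew * sqrt((n^2 - 1) / ((n - 3) * (n + 5)))

round(c(se_mean = se_mean, se_skew = se_skew, se_kurt = se_kurt), 4)
```

Because the skewness and kurtosis standard errors depend only on N, both variables show the same values in those two rows.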
df_long <- df %>%
  pivot_longer(cols = everything(), names_to = "variable", values_to = "value")
ggplot(df_long, aes(sample = value)) +
  stat_qq() +
  stat_qq_line() +
  facet_wrap(~ variable, scales = "free") +
  theme_minimal() +
  labs(title = "Q–Q (Probability) Plots")
Q–Q plots for each variable.
ggplot(df_long, aes(x = variable, y = value)) +
  geom_violin(trim = FALSE) +
  geom_boxplot(width = 0.15, outlier.shape = NA) +
  theme_minimal() +
  labs(x = NULL, y = "Value", title = "Violin + Boxplots")
Violin plots for each variable.
(jmv::corrMatrix in R)
# Run the correlation matrix
corr <- jmv::corrMatrix(
  data = df,
  vars = names(df),
  n = TRUE,
  ci = TRUE, ciWidth = 95
)
corr
##
## CORRELATION MATRIX
##
## Correlation Matrix
## ─────────────────────────────────────────────────────────────────────
## childHeight midparentHeight
## ─────────────────────────────────────────────────────────────────────
## childHeight Pearson's r —
## df —
## p-value —
## 95% CI Upper —
## 95% CI Lower —
## N —
##
## midparentHeight Pearson's r 0.3209499 —
## df 932 —
## p-value < .0000001 —
## 95% CI Upper 0.3773285 —
## 95% CI Lower 0.2622011 —
## N 934 —
## ─────────────────────────────────────────────────────────────────────
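The confidence interval and test in the table above can be reproduced from r and N alone. This is a sketch using the Fisher z transformation for the interval and the usual t test for a correlation. That jmv computes the interval this way is an assumption, though the numbers agree with the table.

```r
# Values copied from the Correlation Matrix table above
r <- 0.3209499
n <- 934

# Fisher z transformation: atanh(r) is approximately normal with SE 1/sqrt(n - 3)
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)

# 95% CI on the z scale, back-transformed to the r scale
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se_z)

# t test of H0: rho = 0 with n - 2 degrees of freedom
t_stat  <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

round(c(lower = ci[1], upper = ci[2], t = t_stat), 4)
```

The t statistic here is the same one that appears for midparentHeight in the linear regression, since a one-predictor regression and a Pearson correlation test the same hypothesis.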
(jmv::linReg in R)
linmod <- jmv::linReg(
  data = df,
  dep = "childHeight",
  covs = "midparentHeight",
  blocks = list(list("midparentHeight")),
  ci = TRUE, ciWidth = 95,
  stdEst = TRUE, # request standardized estimates
  anova = TRUE,
  resPlots = TRUE,
  collin = TRUE
)
linmod
##
## LINEAR REGRESSION
##
## Model Fit Measures
## ───────────────────────────────────
## Model R R²
## ───────────────────────────────────
## 1 0.3209499 0.1030088
## ───────────────────────────────────
## Note. Models estimated using
## sample size of N=934
##
##
## MODEL SPECIFIC RESULTS
##
## MODEL 1
##
## Omnibus ANOVA Test
## ─────────────────────────────────────────────────────────────────────────────────────
## Sum of Squares df Mean Square F p
## ─────────────────────────────────────────────────────────────────────────────────────
## midparentHeight 1231.234 1 1231.23366 107.0292 < .0000001
## Residuals 10721.466 932 11.50372
## ─────────────────────────────────────────────────────────────────────────────────────
## Note. Type 3 sum of squares
##
##
## Model Coefficients - childHeight
## ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## Predictor Estimate SE Lower Upper t p Stand. Estimate
## ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## Intercept 22.6362405 4.26510743 14.2659135 31.0065676 5.307308 0.0000001
## midparentHeight 0.6373609 0.06160760 0.5164552 0.7582666 10.345491 < .0000001 0.3209499
## ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
##
##
## ASSUMPTION CHECKS
##
## Collinearity Statistics
## ────────────────────────────────────────────
## VIF Tolerance
## ────────────────────────────────────────────
## midparentHeight 1.000000 1.000000
## ────────────────────────────────────────────
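With a single predictor, every number in the regression output above follows from the descriptives and the correlation. The sketch below rebuilds the slope, intercept, standardized estimate, R², and F from those summary values; the variable names are illustrative.

```r
# Summary values copied from the Descriptives and Correlation Matrix tables
n      <- 934
mean_x <- 69.20677; sd_x <- 1.802370   # midparentHeight
mean_y <- 66.74593; sd_y <- 3.579251   # childHeight
r      <- 0.3209499

# Unstandardized slope and intercept of childHeight on midparentHeight
slope     <- r * sd_y / sd_x
intercept <- mean_y - slope * mean_x

# Standardized slope: rescaling by the SDs returns the correlation itself
std_slope <- slope * sd_x / sd_y

# Model fit: R-squared is r^2, and F with (1, n - 2) df follows from it
r_squared <- r^2
f_stat    <- (n - 2) * r_squared / (1 - r_squared)

round(c(slope = slope, intercept = intercept,
        std_slope = std_slope, r_squared = r_squared, F = f_stat), 4)
```

This is why the Stand. Estimate column equals Pearson's r, and why the VIF and Tolerance are exactly 1: with only one predictor there is nothing for it to be collinear with.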
Model: \[ \texttt{childHeight ~ midparentHeight} \]
model_sem <- "
childHeight ~ b1*midparentHeight
"
fit_sem <- lavaan::sem(model_sem, data = df, meanstructure = TRUE, std.lv = FALSE)
# Unstandardized and standardized solutions
summary(fit_sem, standardized = TRUE, fit.measures = TRUE, rsquare = TRUE)
## lavaan 0.6-19 ended normally after 1 iteration
##
## Estimator ML
## Optimization method NLMINB
## Number of model parameters 3
##
## Number of observations 934
##
## Model Test User Model:
##
## Test statistic 0.000
## Degrees of freedom 0
##
## Model Test Baseline Model:
##
## Test statistic 101.534
## Degrees of freedom 1
## P-value 0.000
##
## User Model versus Baseline Model:
##
## Comparative Fit Index (CFI) 1.000
## Tucker-Lewis Index (TLI) 1.000
##
## Loglikelihood and Information Criteria:
##
## Loglikelihood user model (H0) -2465.015
## Loglikelihood unrestricted model (H1) -2465.015
##
## Akaike (AIC) 4936.029
## Bayesian (BIC) 4950.548
## Sample-size adjusted Bayesian (SABIC) 4941.020
##
## Root Mean Square Error of Approximation:
##
## RMSEA 0.000
## 90 Percent confidence interval - lower 0.000
## 90 Percent confidence interval - upper 0.000
## P-value H_0: RMSEA <= 0.050 NA
## P-value H_0: RMSEA >= 0.080 NA
##
## Standardized Root Mean Square Residual:
##
## SRMR 0.000
##
## Parameter Estimates:
##
## Standard errors Standard
## Information Expected
## Information saturated (h1) model Structured
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## childHeight ~
## mdprntHgh (b1) 0.637 0.062 10.357 0.000 0.637 0.321
##
## Intercepts:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .childHeight 22.636 4.261 5.313 0.000 22.636 6.328
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .childHeight 11.479 0.531 21.610 0.000 11.479 0.897
##
## R-Square:
## Estimate
## childHeight 0.103
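The lavaan estimates match the linReg estimates, but the standard error and test statistic differ slightly (z = 10.357 versus t = 10.345). A likely explanation, sketched here under the assumption that lavaan's default ML estimator divides the residual sum of squares by N rather than N − 2, reproduces the lavaan values from the regression output.

```r
# Values copied from the jmv::linReg output
n        <- 934
ss_resid <- 10721.466    # residual sum of squares (ANOVA table)
slope    <- 0.6373609
se_ols   <- 0.06160760   # OLS standard error of the slope

# Residual variance: OLS divides by n - 2, ML divides by n
var_ols <- ss_resid / (n - 2)
var_ml  <- ss_resid / n

# The slope SE shrinks by the same factor on the SD scale
se_ml <- se_ols * sqrt((n - 2) / n)
z_ml  <- slope / se_ml

# Standardized residual variance is 1 - R^2
std_resid_var <- 1 - 0.3209499^2

round(c(var_ols = var_ols, var_ml = var_ml, se_ml = se_ml,
        z_ml = z_ml, std_resid_var = std_resid_var), 3)
```

With N = 934 the difference is negligible, but it grows in small samples. The model also has 0 degrees of freedom because it is saturated, so the fit indices (CFI = 1, RMSEA = 0, SRMR = 0) carry no information about model quality here.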
# Parameter estimates table
pe <- parameterEstimates(fit_sem, standardized = TRUE)
coef_tbl <- pe %>%
  filter(op == "~") %>%
  select(
    lhs, op, rhs,
    est, se, z, pvalue,
    std.all
  ) %>%
  rename(
    unstd_est = est,
    unstd_se = se,
    z_value = z,
    p_value = pvalue,
    std_est = std.all
  )
coef_tbl
semPlot::semPaths(
  fit_sem,
  what = "std", # show standardized by default
  whatLabels = "std", # label edges with standardized estimates
  layout = "tree",
  edge.label.cex = 1.1,
  nCharNodes = 0,
  sizeMan = 8,
  sizeLat = 10,
  residuals = TRUE,
  intercepts = FALSE,
  style = "lisrel"
)
Path diagram for childHeight ~ midparentHeight
To see unstandardized labels instead, set whatLabels = "est" above (or draw a second diagram).
Paste the following into SEM → Model/Syntax in Jamovi and run:
childHeight ~ midparentHeight
Enable Standardized solution, Fit measures, R-squared, and Path diagram to match the R output.
- Galton.csv with variables childHeight and midparentHeight.
- jmv functions used for Descriptives, Correlations, and Linear Regression.
- lavaan/semPlot used to mirror Jamovi’s SEM module.
- ggplot2 + ggExtra for scatterplot with marginal densities; ggplot2 for Q–Q and violin plots.