This report investigates the relationship between workplace factors and psychological stress using survey data. Three predictors are examined:
Stress is measured using a DASS-derived stress subscale (higher = more stressed).
library(tidyverse)
library(mgcv)
library(DHARMa)
library(gamlss)
library(gamlss.dist)
library(ggfortify)
library(mgcViz) ## Rows: 5,000
## Columns: 162
## $ Age <chr> "26-35", "56+", "18-25", "36-45", "26-35", "18-25", "56+"…
## $ Gender <chr> "Female", "Prefer not to say", "Female", "Male", "Female"…
## $ Industry <chr> "Consulting", "Education", "Finance", "Healthcare", "Cons…
## $ JobRole <chr> "Marketing", "Software Engineer", "HR", "Designer", "Data…
## $ Region <chr> "Africa", "South America", "Asia", "Africa", "North Ameri…
## $ WorkLocation <chr> "Remote", "Onsite", "Onsite", "Remote", "Onsite", "Onsite…
## $ AccessMH <chr> "Yes", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No…
## $ wp1 <chr> "Sometimes", "Sometimes", "Often", "Sometimes", "Sometime…
## $ wp2 <chr> "Often", "Never", "Often", "Never", "Regularly", "Regular…
## $ wp3 <chr> "Very often", "Sometimes", "Sometimes", "Sometimes", "Reg…
## $ wp4 <chr> "Regularly", "Never", "Often", "Regularly", "Regularly", …
## $ cogn1 <chr> "Sometimes", "Often", "Often", "Sometimes", "Sometimes", …
## $ cogn2 <chr> "Never", "Regularly", "Never", "Sometimes", "Often", "Reg…
## $ cogn3 <chr> "Regularly", "Often", "Very often", "Regularly", "Often",…
## $ cogn4 <chr> "Often", "Very often", "Very often", "Never", "Regularly"…
## $ emo1 <chr> "Often", "Regularly", "Very often", "Regularly", "Very of…
## $ emo2 <chr> "Very often", "Regularly", "Often", "Often", "Very often"…
## $ emo3 <chr> "Very often", "Often", "Often", "Very often", NA, "Someti…
## $ emo4 <chr> "Very often", NA, "Regularly", "Never", "Very often", "Re…
## $ emo5 <chr> "Often", "Regularly", "Sometimes", "Regularly", "Never", …
## $ emo6 <chr> "Very often", "Never", "Sometimes", "Regularly", "Regular…
## $ rolcon1 <chr> "Strongly agree", "Neutral", "Agree", "Agree", "Strongly …
## $ rolcon2 <chr> "Strongly agree", "Neutral", "Disagree", "Strongly disagr…
## $ rolcon3 <chr> "Strongly agree", "Neutral", "Agree", "Disagree", "Agree"…
## $ rolcon4 <chr> "Strongly agree", "Agree", "Neutral", "Disagree", "Neutra…
## $ hassle1 <chr> "Strongly agree", "Neutral", "Strongly agree", "Strongly …
## $ hassle2 <chr> "Neutral", "Strongly agree", "Strongly agree", "Strongly …
## $ hassle3 <chr> "Strongly agree", "Agree", "Neutral", "Disagree", "Disagr…
## $ hassle4 <chr> "Strongly agree", "Agree", "Strongly agree", "Strongly di…
## $ hassle5 <chr> "Agree", "Disagree", "Disagree", "Strongly disagree", "Di…
## $ auto1 <chr> "Sometimes", "Never", "Regularly", "Often", "Sometimes", …
## $ auto2 <chr> "Never", "Sometimes", "Sometimes", "Never", "Regularly", …
## $ auto3 <chr> "Never", "Never", "Sometimes", "Regularly", "Sometimes", …
## $ soc1 <chr> "Very often", "Sometimes", "Sometimes", "Regularly", "Oft…
## $ soc2 <chr> "Often", "Sometimes", "Regularly", "Very often", "Regular…
## $ soc3 <chr> "Very often", "Sometimes", "Regularly", "Very often", "Of…
## $ feedb1 <chr> "Sometimes", "Often", "Never", "Sometimes", "Very often",…
## $ feedb2 <chr> "Sometimes", "Sometimes", "Sometimes", "Sometimes", "Neve…
## $ feedb3 <chr> "Regularly", "Often", "Sometimes", "Never", "Sometimes", …
## $ coach1 <chr> "Sometimes", "Never", "Often", "Regularly", "Sometimes", …
## $ coach2 <chr> "Regularly", "Regularly", "Regularly", "Regularly", "Neve…
## $ coach3 <chr> "Sometimes", "Sometimes", "Sometimes", "Sometimes", "Neve…
## $ coach4 <chr> "Sometimes", "Sometimes", "Often", "Regularly", "Never", …
## $ coach5 <chr> "Sometimes", "Very often", "Regularly", "Regularly", "Reg…
## $ SE1 <chr> "Absolutely wrong", "Absolutely right", "Barely right", "…
## $ SE2 <chr> "Absolutely wrong", "Somewhat right", "Absolutely right",…
## $ SE3 <chr> "Absolutely wrong", "Absolutely right", "Absolutely wrong…
## $ SE4 <chr> "Barely right", "Absolutely right", "Barely right", "Abso…
## $ optim1 <chr> "Neutral", "Disagree", "Agree", "Disagree", "Agree", "Dis…
## $ optim2 <chr> "Agree", "Neutral", "Disagree", "Agree", "Neutral", "Neut…
## $ optim3 <chr> "Neutral", "Disagree", "Disagree", "Disagree", "Totally d…
## $ optim4 <chr> "Agree", "Disagree", "Totally disagree", "Agree", "Agree"…
## $ ERI1 <chr> "Disagree", "Agree", "Disagree", "Agree", "Agree", "Disag…
## $ ERI2 <chr> "Disagree", "Agree", "Strongly disagree", "Disagree", "St…
## $ ERI3 <chr> "Disagree", "Disagree", "Disagree", "Strongly agree", "Ag…
## $ ERI4 <chr> "Strongly disagree", "Agree", "Disagree", "Strongly agree…
## $ ERI5 <chr> "Strongly agree", "Strongly disagree", "Strongly agree", …
## $ ERI6 <chr> "Strongly disagree", "Strongly agree", "Strongly disagree…
## $ ERI7 <chr> "Strongly agree", "Disagree", "Strongly agree", "Strongly…
## $ ERI8 <chr> "Strongly disagree", "Agree", "Disagree", "Strongly agree…
## $ ERI9 <chr> "Strongly disagree", "Strongly agree", "Strongly disagree…
## $ ERI10 <chr> "Strongly disagree", "Strongly agree", "Strongly disagree…
## $ OC1 <chr> "Strongly disagree", "Strongly agree", "Strongly disagree…
## $ OC2 <chr> "Disagree", "Strongly agree", "Agree", "Agree", "Disagree…
## $ OC3 <chr> "Strongly disagree", "Strongly disagree", "Strongly agree…
## $ OC4 <chr> "Disagree", "Agree", "Strongly disagree", "Agree", "Disag…
## $ OC5 <chr> "Strongly disagree", "Strongly agree", "Strongly disagree…
## $ OC6 <chr> "Strongly disagree", "Agree", "Strongly disagree", "Disag…
## $ PJF1 <chr> "No match", "Complete match", "Complete match", "Somewhat…
## $ PJF2 <chr> "Somewhat match", "Slight match", "Moderate match", "Mode…
## $ PJF3 <chr> "Strong match", "Somewhat match", "Good match", "Moderate…
## $ PJF4 <chr> "Good match", "Complete match", "Slight match", "Slight m…
## $ POF1 <chr> "Strong match", "Strong match", "Slight match", "No match…
## $ POF2 <chr> "Moderate match", "Good match", "Good match", "No match",…
## $ POF3 <chr> "Moderate match", "Complete match", "Strong match", "Good…
## $ POF4 <chr> "Complete match", "Strong match", "Strong match", "No mat…
## $ PSF1 <chr> "Complete match", "Moderate match", "No match", "Strong m…
## $ PSF2 <chr> "Good match", "Strong match", "No match", "Good match", "…
## $ PSF3 <chr> "Good match", "Strong match", "No match", "Good match", "…
## $ PSF4 <chr> "Good match", "Somewhat match", "Slight match", "Good mat…
## $ PSF5 <chr> "Good match", "Slight match", "Moderate match", "Good mat…
## $ CORS1 <chr> "Some Gain", "Some Gain", "Some Gain", "Some Gain", "No C…
## $ CORS2 <chr> "Great Gain", "Great Gain", "Great Loss", "Great Gain", "…
## $ CORS3 <chr> "No Change", "Great Loss", "No Change", "Great Gain", "So…
## $ CORS4 <chr> "Some Gain", "No Change", "Some Gain", "Great Loss", "No …
## $ CORS5 <chr> "Some Loss", "Great Gain", "Some Gain", "Great Loss", "No…
## $ CORS6 <chr> "Great Loss", "Some Gain", "Some Loss", "Some Loss", "Som…
## $ CORS7 <int> 2, 4, 4, 1, 4, 4, 4, 3, 5, 1, 4, 1, 2, 5, 5, 1, 2, 5, 4, …
## $ CORS8 <int> 2, 4, 1, 2, 4, 5, 3, 5, 3, 2, 2, 2, 2, 4, 3, 1, 2, 5, 2, …
## $ CORS9 <int> 4, 3, 2, 1, 2, 3, 4, 4, 4, 4, 4, 4, 1, 2, 5, 5, 3, 5, 2, …
## $ ELT1 <chr> "Somewhat agree", "Neutral", "Strongly agree", "Disagree"…
## $ ELT2 <chr> "Strongly agree", "Somewhat disagree", "Somewhat disagree…
## $ ELT3 <chr> "Neutral", "Neutral", "Agree", "Disagree", "Neutral", "Ne…
## $ ELT4 <chr> "Agree", "Disagree", "Strongly agree", "Neutral", "Strong…
## $ CAS1 <int> 1, 2, 0, 0, 0, 0, 1, 6, 5, 2, 3, 5, 0, 0, 6, 2, 1, 6, 2, …
## $ CAS2 <int> 3, 2, 0, 0, 1, 0, 1, 6, 1, 6, 0, 2, 3, 1, 5, 1, 0, 5, 3, …
## $ CAS3 <int> 2, 5, 1, 0, 2, 0, 1, 5, 4, 6, 3, 5, 0, 1, 6, 3, 4, 5, 1, …
## $ CAS4 <int> 3, 6, 6, 5, 3, 6, 2, 1, 0, 1, 5, 1, 6, 4, 0, 2, 4, 0, 6, …
## $ CAS5 <int> 4, 1, 4, 6, 1, 6, 6, 0, 3, 0, 6, 3, 6, 5, 0, 2, 4, 0, 5, …
## $ CAS6 <int> 2, 4, 0, 1, 3, 0, 1, 6, 5, 4, 2, 3, 3, 1, 6, 2, 4, 3, 0, …
## $ CAS7 <int> 5, 3, 0, 2, 0, 1, 1, 6, 5, 6, 3, 4, 1, 0, 6, 0, 1, 6, 0, …
## $ CAS8 <int> 6, 0, 0, 0, 1, 0, 0, 6, 5, 6, 2, 6, 2, 0, 6, 1, 5, 4, 3, …
## $ CAS9 <int> 3, 3, 0, 1, 1, 0, 1, 6, 2, 5, 4, 2, 1, 0, 4, 5, 4, 4, 2, …
## $ CAS10 <int> 2, 3, 0, 0, 1, 0, 4, 6, 5, 2, 2, 3, 1, 4, 5, 3, 2, 4, 4, …
## $ CAS11 <int> 5, 3, 1, 0, 1, 0, 3, 6, 5, 2, 0, 5, 0, 2, 6, 1, 1, 6, 0, …
## $ CAS12 <int> 1, 5, 0, 0, 6, 0, 0, 6, 3, 6, 2, 4, 1, 4, 6, 6, 0, 5, 0, …
## $ SS1 <chr> "Agree", "Neutral", "Strongly agree", "Neutral", "Strongl…
## $ SS2 <chr> "Disagree", "Neutral", "Agree", "Strongly agree", "Neutra…
## $ SS3 <chr> "Strongly agree", "Disagree", "Strongly agree", "Neutral"…
## $ CS1 <chr> "Neutral", "Neutral", "Disagree", "Strongly disagree", "D…
## $ CS2 <chr> "Disagree", "Disagree", "Neutral", "Strongly agree", "Neu…
## $ CS3 <chr> "Disagree", "Agree", "Disagree", "Agree", "Strongly disag…
## $ OS1 <chr> "Agree", "Neutral", "Strongly agree", "Strongly disagree"…
## $ OS2 <chr> "Neutral", "Agree", "Strongly agree", "Strongly disagree"…
## $ OS3 <chr> "Strongly agree", "Agree", "Strongly agree", "Disagree", …
## $ FS1 <chr> "Disagree", "Disagree", "Agree", "Neutral", "Disagree", "…
## $ FS2 <chr> "Neutral", "Agree", "Strongly agree", "Strongly agree", "…
## $ FS3 <chr> "Strongly agree", "Agree", "Strongly agree", "Disagree", …
## $ auton1 <chr> "Disagree", "Agree", "Disagree", "Disagree", "Strongly ag…
## $ auton2 <chr> "Strongly disagree", "Agree", "Disagree", "Strongly disag…
## $ auton3 <chr> "Disagree", "Disagree", "Strongly disagree", "Strongly di…
## $ comp1 <chr> "Agree", "Disagree", "Neutral", "Disagree", "Agree", "Str…
## $ comp2 <chr> "Neutral", "Strongly disagree", "Strongly agree", "Strong…
## $ comp3 <chr> "Agree", "Strongly disagree", "Neutral", "Disagree", "Neu…
## $ Relat1 <chr> "Strongly agree", "Agree", "Neutral", "Neutral", "Strongl…
## $ Relat2 <chr> "Strongly agree", "Strongly disagree", "Neutral", "Strong…
## $ Relat3 <chr> "Disagree", "Strongly disagree", "Strongly agree", "Stron…
## $ S1 <chr> "Applied to me very much or most of the time", "Applied t…
## $ S2 <chr> "Applied to me to some degree, or some of the time", "App…
## $ S3 <chr> "Applied to me very much or most of the time", "Applied t…
## $ S4 <chr> "Applied to me very much or most of the time", "Applied t…
## $ S5 <chr> "Applied to me very much or most of the time", "Applied t…
## $ A1 <chr> "Applied to me very much or most of the time", "Applied t…
## $ A2 <chr> "Applied to me to a considerable degree or a good part of…
## $ A3 <chr> "Applied to me very much or most of the time", "Applied t…
## $ A4 <chr> "Applied to me very much or most of the time", "Applied t…
## $ A5 <chr> "Applied to me very much or most of the time", "Applied t…
## $ D1 <chr> "Applied to me very much or most of the time", "Applied t…
## $ D2 <chr> "Did not apply to me at all", "Applied to me very much or…
## $ D3 <chr> "Applied to me very much or most of the time", "Applied t…
## $ D4 <chr> "Applied to me to a considerable degree or a good part of…
## $ SFQ1 <chr> "No more than usual", "No more than usual", "Much more th…
## $ SFQ2 <chr> "Much more than usual", "Worse than usual", "Much more th…
## $ SFQ3 <chr> "Much more than usual", "Worse than usual", "Worse than u…
## $ SFQ4 <chr> "Worse than usual", "Worse than usual", "Not at all", "Mu…
## $ SFQ5 <chr> "Much more than usual", "Worse than usual", "No more than…
## $ SFQ6 <chr> "Much more than usual", "Worse than usual", "Not at all",…
## $ SFQ7 <chr> "Worse than usual", "Not at all", "No more than usual", "…
## $ SFQ8 <chr> "Not at all", "No more than usual", "Worse than usual", "…
## $ SFQ9 <chr> "Worse than usual", "Much more than usual", "No more than…
## $ EE1 <chr> "Every day", "A few times per week", "Never", "A few time…
## $ EE2 <chr> "A few times per week", "A few times per week", "Once a w…
## $ EE3 <chr> "Every day", "Every day", "Never", "Once a week", "Every …
## $ EE4 <chr> "A few times per year", "Every day", "Once a week", "A fe…
## $ EE5 <chr> "Once a week", "A few times per week", "Never", "Once a w…
## $ EE6 <chr> "Every day", "Once a week", "A few times per month", "Onc…
## $ EE7 <chr> "Every day", "A few times per week", "Never", "Every day"…
## $ WB1 <chr> "Most of the time", "Most of the time", "More than half o…
## $ WB2 <chr> "Less than half of the time", "Most of the time", "Some o…
## $ WB3 <chr> "All of the time", "More than half of the time", "At no t…
## $ WB4 <chr> "Most of the time", "More than half of the time", "Less t…
## $ WB5 <chr> "Most of the time", "Most of the time", "Some of the time…
The dataset contains survey responses. The outcome variable
stress_scoreis a continuous score derived from the DASS scale. The three predictors —work_pressure,autonomy, andsocial_support— are composite scores from Likert-scale items. Because they are averages of discrete ratings, they take only a limited number of values (e.g. 0, 0.5, 1, 1.5…). This means the data will appear as columns of stacked dots in a plain scatterplot — which is why we usegeom_jitter()later.
Response scales are converted from text labels to numeric values before composite scores are computed.
freq_map <- c(
"Never" = 0,
"Sometimes" = 1,
"Regularly" = 2,
"Often" = 3,
"Very often" = 4
)
agree_map <- c(
"Strongly disagree" = 1,
"Disagree" = 2,
"Neutral" = 3,
"Agree" = 4,
"Strongly agree" = 5
)
dass_map <- c(
"Did not apply to me at all" = 0,
"Applied to me to some degree, or some of the time" = 1,
"Applied to me to a considerable degree or a good part of the time" = 2,
"Applied to me very much or most of the time" = 3
)
stress_clean <- stress %>%
mutate(
across(c(wp1, wp2, wp3, wp4), ~ freq_map[.x]),
across(c(auto1, auto2, auto3), ~ freq_map[.x]),
across(c(SS1, SS2, SS3), ~ agree_map[.x]),
across(c(S1, S2, S3, S4, S5), ~ dass_map[.x])
) %>%
mutate(
work_pressure = rowMeans(across(c(wp1, wp2, wp3, wp4)), na.rm = TRUE),
autonomy = rowMeans(across(c(auto1, auto2, auto3)), na.rm = TRUE),
social_support = rowMeans(across(c(SS1, SS2, SS3)), na.rm = TRUE),
stress_score = rowMeans(across(c(S1, S2, S3, S4, S5)), na.rm = TRUE)
) %>%
select(stress_score, work_pressure, autonomy, social_support,
Age, Gender, WorkLocation) %>%
drop_na()
summary(stress_clean)## stress_score work_pressure autonomy social_support
## Min. :0.000 Min. :0.000 Min. :0.000 Min. :1.000
## 1st Qu.:0.800 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:2.000
## Median :1.400 Median :2.000 Median :2.000 Median :3.000
## Mean :1.501 Mean :1.999 Mean :1.999 Mean :2.999
## 3rd Qu.:2.200 3rd Qu.:2.875 3rd Qu.:3.000 3rd Qu.:4.000
## Max. :3.000 Max. :4.000 Max. :4.000 Max. :5.000
## Age Gender WorkLocation
## Length:4979 Length:4979 Length:4979
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
After recoding and computing row-wise means, the dataset is clean with no missing values.
stress_scoreranges from 0 to 3 (DASS scale), and the three predictors range from 0 to their respective Likert maximums. These summary statistics give us a baseline before modelling.
#Visualizing
Before fitting any model, we visualize the raw relationships. i wanted to apply different shades of pink, so it looks beautiful to all my plots.
scatter.wp <- stress_clean %>%
ggplot(
aes(
x = work_pressure,
y = stress_score)) +
geom_jitter(
alpha = .25,
color = "#e91e8c",
size = 2,
width = 0.05,
height = 0.05) +
labs(
x = "Work Pressure",
y = "Stress Score",
title = "Work Pressure x Stress Association")
scatter.wpThe scatterplot shows a clear upward trend — as work pressure increases, stress scores tend to rise. The relationship does not look perfectly straight, suggesting a nonlinear (curved) association. This confirms we should use a GAM with
s()rather than a simple linear model.
scatter.auto <- stress_clean %>%
ggplot(
aes(
x = autonomy,
y = stress_score)) +
geom_jitter(
alpha = .25,
color = "#e91e8c",
size = 2,
width = 0.05,
height = 0.05) +
labs(
x = "Autonomy",
y = "Stress Score",
title = "Autonomy x Stress Association")
scatter.autoThe scatterplot for autonomy shows a downward trend — respondents with higher autonomy (more control over their work) tend to report lower stress. Again, the relationship has some curvature, supporting the use of a smooth GAM term.
We fit a simple linear model first as a baseline. This assumes all relationships are straight lines.
fit.lm <- lm(
stress_score ~ work_pressure + autonomy + social_support,
data = stress_clean
)
summary(fit.lm)##
## Call:
## lm(formula = stress_score ~ work_pressure + autonomy + social_support,
## data = stress_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.89827 -0.66094 -0.00135 0.66909 1.83102
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.331834 0.044768 29.750 < 2e-16 ***
## work_pressure 0.156783 0.010926 14.349 < 2e-16 ***
## autonomy -0.080257 0.010564 -7.597 3.6e-14 ***
## social_support 0.005255 0.010176 0.516 0.606
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.838 on 4975 degrees of freedom
## Multiple R-squared: 0.05093, Adjusted R-squared: 0.05036
## F-statistic: 88.99 on 3 and 4975 DF, p-value: < 2.2e-16
The linear model shows that all three predictors are statistically significant (p < .05). Work pressure has a positive coefficient — higher pressure = higher stress. Autonomy and social support both have negative coefficients — they are protective factors that reduce stress. However, the adjusted R-squared tells us how much variance the model explains. If it is below ~0.5, the model is missing something — likely the nonlinear patterns we saw in the scatterplots.
The diagnostic plots for the linear model are important. In the Residuals vs Fitted plot, we want to see a flat red line with no pattern. In the QQ plot, we want points to follow the diagonal line closely. If either shows clear curvature or S-shapes, it means the linear model is not capturing the true pattern in the data — and we need a more flexible model like GAM.
Why GAM and not lm()? — The scatterplots showed curved,
nonlinear patterns. A regular lm() only draws straight
lines. gam() draws wiggly curves using s()
(smooth spline terms) that follow the shape of the data.
fit.gam <- gam(
stress_score ~
s(work_pressure) +
s(autonomy) +
s(social_support),
data = stress_clean,
method = "REML"
)
summary(fit.gam)##
## Family: gaussian
## Link function: identity
##
## Formula:
## stress_score ~ s(work_pressure) + s(autonomy) + s(social_support)
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.50053 0.01187 126.4 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(work_pressure) 1.001 1.001 205.976 <2e-16 ***
## s(autonomy) 1.773 2.221 26.717 <2e-16 ***
## s(social_support) 1.002 1.003 0.285 0.595
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.0506 Deviance explained = 5.14%
## -REML = 6196.8 Scale est. = 0.70205 n = 4979
How to read the summary:
EDF (Estimated Degrees of Freedom): An EDF of 1.0 means the relationship is basically a straight line. An EDF well above 1 (e.g. 3, 5, 7) means the relationship is genuinely curved. Look at the
s(work_pressure),s(autonomy), ands(social_support)rows.P-values: A p-value below .05 means the smooth curve for that predictor is statistically significant — the relationship is real, not due to random chance.
R-squared / Deviance explained: Higher values mean the model explains more of the variation in stress scores. Compare this to the linear model’s R-squared — if GAM is higher, the nonlinear approach is better.
Each panel shows the smooth curve for one predictor, with a 95% confidence band (shaded pink). The x-axis is the predictor value and the y-axis shows its contribution to the stress score (centred at zero). An upward curve means that predictor increases stress; a downward curve means it reduces stress. If the confidence band is narrow, we have precise estimates. If it is wide, there is more uncertainty.
How to read the DHARMa plots:
There are two panels. The left panel (QQ plot) should have points following the diagonal line closely. Deviation means the model’s distributional assumptions are violated. The right panel shows residuals against predicted values — we want a flat, uniform band with no patterns.
Dispersion test and Outlier test p-values are printed above the plots. If p < .05, there is a problem. If both are non-significant (p > .05), the model fits well.
If DHARMa shows problems with the Gaussian GAM (e.g. significant dispersion test), it means the model’s assumptions about the distribution of residuals are wrong — and we should try a more flexible model like GAMLSS.
Why GAMLSS? — The GAM assumes that variance is constant across all fitted values. But in real data, stressed people may vary a lot more than non-stressed people. GAMLSS lets the variance itself change as a function of predictors — making it more honest about uncertainty.
fit.gamlss <- gamlss(
stress_score ~ pb(work_pressure) + pb(autonomy) + pb(social_support),
sigma.fo = ~ pb(work_pressure),
family = NO,
data = stress_clean
)## GAMLSS-RS iteration 1: Global Deviance = 12363.06
## GAMLSS-RS iteration 2: Global Deviance = 12363.06
## ******************************************************************
## Family: c("NO", "Normal")
##
## Call: gamlss(formula = stress_score ~ pb(work_pressure) +
## pb(autonomy) + pb(social_support), sigma.formula = ~pb(work_pressure),
## family = NO, data = stress_clean)
##
## Fitting method: RS()
##
## ------------------------------------------------------------------
## Mu link function: identity
## Mu Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.331219 0.044643 29.819 < 2e-16 ***
## pb(work_pressure) 0.156846 0.010918 14.366 < 2e-16 ***
## pb(autonomy) -0.080228 0.010556 -7.600 3.51e-14 ***
## pb(social_support) 0.005399 0.010170 0.531 0.596
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------------------------------
## Sigma link function: log
## Sigma Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.191692 0.021083 -9.092 <2e-16 ***
## pb(work_pressure) 0.007141 0.009281 0.769 0.442
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## ------------------------------------------------------------------
## NOTE: Additive smoothing terms exist in the formulas:
## i) Std. Error for smoothers are for the linear effect only.
## ii) Std. Error for the linear terms maybe are not accurate.
## ------------------------------------------------------------------
## No. of observations in the fit: 4979
## Degrees of Freedom for the fit: 6.820533
## Residual Deg. of Freedom: 4972.179
## at cycle: 2
##
## Global Deviance: 12363.06
## AIC: 12376.7
## SBC: 12421.12
## ******************************************************************
How to read the GAMLSS summary:
The output has two sections: mu (the mean model, same idea as the GAM) and sigma (the variance model). In the mu section, look at the signs of the coefficients — positive = that predictor increases average stress, negative = it decreases it. In the sigma section, a significant
pb(work_pressure)means work pressure also changes how much variability there is in stress scores, not just the average level.If
pb(work_pressure)is significant in the sigma equation, it confirms that high work pressure not only raises average stress — it also makes stress scores more unpredictable (some people cope, others do not).
## ******************************************************************
## Summary of the Quantile Residuals
## mean = -7.51951e-07
## variance = 1.000201
## coef. of skewness = -0.001465314
## coef. of kurtosis = 2.042005
## Filliben correlation coefficient = 0.9894097
## ******************************************************************
The GAMLSS diagnostic plots show worm plots and residual checks. A well-fitting model has residuals that follow a normal distribution and show no systematic patterns. If the worm plot is approximately flat (no U or S shapes), the model fits well.
aic.lm <- AIC(fit.lm)
aic.gam <- AIC(fit.gam)
aic.gss <- GAIC(fit.gamlss)
data.frame(
Model = c("Linear Regression", "GAM (mgcv)", "GAMLSS"),
AIC = round(c(aic.lm, aic.gam, aic.gss), 2)
) %>%
arrange(AIC)## Model AIC
## 1 Linear Regression 12375.94
## 2 GAM (mgcv) 12376.12
## 3 GAMLSS 12376.70
AIC (Akaike Information Criterion) compares models — lower is better. It penalises models that are overly complex, so a lower AIC means a better balance between fit and simplicity.
If the GAM AIC is lower than the linear model, it confirms that the nonlinear approach was worth it. If GAMLSS has the lowest AIC, it means allowing the variance to vary improved the model further — supporting the conclusion that work pressure does not just raise average stress, but also increases its variability.
scatter.wp +
geom_smooth(
method = "gam",
formula = y ~ s(x),
color = "#880e4f", fill = "#f48fb1",
alpha = .3, linewidth = 1.4) +
labs(
title = "Nonlinear Association Between Work Pressure and Stress",
subtitle = "GAM smooth curve with 95% confidence band"
)This final plot combines the raw jittered data with the GAM smooth curve. The shaded band is the 95% confidence interval — where the true curve likely lies. The curve bending upward confirms the nonlinear positive relationship between work pressure and stress. The GAM captured this pattern far better than a straight line would have.
Work pressure is positively and nonlinearly associated with stress — as workload demands increase, stress rises, and the effect accelerates at higher levels of pressure.
Autonomy and social support are both protective factors — higher levels of each are associated with lower stress scores. These relationships also show nonlinear patterns, particularly at the lower end of the scale.
The GAM outperformed the linear model (lower AIC, higher deviance explained), confirming that straight-line models were too simple for this data. The GAMLSS further revealed that work pressure also increases the variance of stress — meaning that at high work pressure, some people report extreme stress while others remain relatively calm.
Practical implication: Reducing work pressure is the single most impactful intervention. Supporting employee autonomy and strengthening social support networks are both evidence-based strategies for reducing workplace stress.
Social Support vs Stress