library(readr)

# Load the dataset
data <- read.csv("/Users/dgkamper/Library/Mobile Documents/com~apple~CloudDocs/Axis - HQ/PhD Terms/Classes/Spring 2024/Psych 250C/Problem Sets/Psych 250C Problems Sets/GodData.csv")

Question 1


You will estimate a mediation model where conflict (War vs. Peace and Poverty, use the WarCond variable) affects reported importance of a punitive god through support for cultural tightness. Define the role of the following variables as either the independent variable (causal antecedent), mediator, or outcome (causal outcome).


library(mediation)
## Loading required package: MASS
## Loading required package: Matrix
## Loading required package: mvtnorm
## Loading required package: sandwich
## mediation: Causal Mediation Analysis
## Version: 4.5.0
# Fit the mediator model
mediator_model <- lm(SCTTotal ~ WarCond, data = data)

# Fit the outcome model
outcome_model <- lm(Punitive ~ WarCond + SCTTotal, data = data)

# Estimate the mediation effect
mediation_result <- mediate(mediator_model, outcome_model, treat = "WarCond", mediator = "SCTTotal")

# Summary of Results
summary(mediation_result)
## 
## Causal Mediation Analysis 
## 
## Quasi-Bayesian Confidence Intervals
## 
##                Estimate 95% CI Lower 95% CI Upper p-value    
## ACME              0.327        0.235         0.43  <2e-16 ***
## ADE               0.356        0.203         0.52  <2e-16 ***
## Total Effect      0.683        0.510         0.86  <2e-16 ***
## Prop. Mediated    0.482        0.346         0.63  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 1065 
## 
## 
## Simulations: 1000

Conflict (War vs. Peace and Poverty, WarCond): Independent variable (causal antecedent)

Importance of a punitive god (Punitive): Outcome variable (causal outcome)

Support for cultural tightness (SCTTotal): Mediator

In our case, the independent variable (causal antecedent) is the variable that is manipulated or is considered the cause in the model. In this case, Conflict (operationalized through WarCond) is the condition that is presumed to affect both the mediator and the outcome. WarCond indicates whether the participant was assigned to a “War” condition versus “Peace” or “Poverty.”

The outcome variable (causal outcome) is the variable that is affected by the independent variable and the mediator. In this model, the reported importance of a punitive god (Punitive) is considered the outcome because the hypothesis is that it is influenced by the conflict condition (WarCond) and potentially mediated by support for cultural tightness (SCTTotal).

The mediator is the variable that is influenced by the independent variable and, in turn, influences the outcome variable. In this model, support for cultural tightness (SCTTotal) is considered the mediator because the hypothesis is that WarCond affects SCTTotal, which in turn affects the importance of a punitive god (Punitive). The mediator helps to explain the process or mechanism through which the independent variable influences the outcome.


Question 2


Estimate the model using the data. You can estimate each regression equation separately, or use a program which estimates the model altogether. Later you will need bootstrap CIs for the indirect effect, so you may want to include that specification here. Include your code and output below.


library(mediation)

# Fit the mediator model
mediator_model <- lm(SCTTotal ~ WarCond, data = data)

# Fit the outcome model
outcome_model <- lm(Punitive ~ WarCond + SCTTotal, data = data)

# Estimate the mediation effect
mediation_result <- mediate(mediator_model, outcome_model, treat = "WarCond", mediator = "SCTTotal")

# Summary of Results
summary(mediation_result)
## 
## Causal Mediation Analysis 
## 
## Quasi-Bayesian Confidence Intervals
## 
##                Estimate 95% CI Lower 95% CI Upper p-value    
## ACME              0.327        0.228         0.42  <2e-16 ***
## ADE               0.357        0.203         0.51  <2e-16 ***
## Total Effect      0.684        0.508         0.86  <2e-16 ***
## Prop. Mediated    0.477        0.348         0.64  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 1065 
## 
## 
## Simulations: 1000
library(mediation)

# Fit the mediator model
mediator_model <- lm(SCTTotal ~ WarCond, data = data)

# Fit the outcome model
outcome_model <- lm(Punitive ~ WarCond + SCTTotal, data = data)

# Estimate the mediation effect with bootstrap confidence intervals
set.seed(123) # For reproducibility

mediation_result_bootstrap <- mediate(mediator_model, outcome_model, treat = "WarCond", mediator = "SCTTotal", boot = TRUE, sims = 1000)
## Running nonparametric bootstrap
# Summary of Results
summary(mediation_result_bootstrap)
## 
## Causal Mediation Analysis 
## 
## Nonparametric Bootstrap Confidence Intervals with the Percentile Method
## 
##                Estimate 95% CI Lower 95% CI Upper p-value    
## ACME              0.328        0.226         0.43  <2e-16 ***
## ADE               0.359        0.189         0.52  <2e-16 ***
## Total Effect      0.688        0.513         0.87  <2e-16 ***
## Prop. Mediated    0.478        0.344         0.65  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Sample Size Used: 1065 
## 
## 
## Simulations: 1000
library (lavaan)
## This is lavaan 0.6-17
## lavaan is FREE software! Please report any bugs.
# Specify the mediation model
model_1 <- '
  # Direct effect
  Punitive ~ c*WarCond
  
  # Mediator
  SCTTotal ~ a*WarCond
  Punitive ~ b*SCTTotal
  
  # Indirect effect (a*b)
  ab := a*b
  
  # Total effect
  total := c + (a*b)
'

# Fit the model
fit1 <- sem(model_1, data = data)

# Summarize the results
summary(fit1, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.082    4.372    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.099    6.988    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.025   19.097    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.068   23.076    0.000    1.562    0.709
##    .SCTTotal          2.387    0.103   23.076    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.050    6.563    0.000    0.328    0.105
##     total             0.688    0.093    7.388    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit1, ci = TRUE, standardized = TRUE)
library (lavaan)

# Specify the mediation model
model2 <- '
  # Direct effect
  Punitive ~ c*WarCond
  # Mediator
  SCTTotal ~ a*WarCond
  Punitive ~ b*SCTTotal
  # Indirect effect (a*b)
  ab := a*b
  # Total effect
  total := c + (a*b)
'

# Fit the model
fit2 <- sem(model2, data = data, se = "bootstrap", bootstrap = 1000)
## Warning in lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats =
## lavsamplestats, : lavaan WARNING: 4 bootstrap runs failed or did not converge.
# Summarize the results
summary(fit2, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             1000
##   Number of successful bootstrap draws             996
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.089    4.051    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.099    6.999    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.027   17.564    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.067   23.469    0.000    1.562    0.709
##    .SCTTotal          2.387    0.100   23.844    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.051    6.435    0.000    0.328    0.105
##     total             0.688    0.095    7.239    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit2, ci = TRUE, standardized = TRUE)

Question 3


For each path in the estimated model provide a point estimate, interpretation, and inferential statement. In the inferential statement include all relevant statistics for the inferential test using APA format. Do not copy the statements from the paper, word them using your own language).


3a) The effect of X on M (a-path)

# Fit the mediator model (a-path)
mediator_model <- lm(SCTTotal ~ WarCond, data = data)
summary(mediator_model)
## 
## Call:
## lm(formula = SCTTotal ~ WarCond, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -5.0806 -1.0806 -0.0806  1.0418  3.6133 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.38672    0.05874  91.700  < 2e-16 ***
## WarCond      0.69392    0.09939   6.982 5.13e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.546 on 1063 degrees of freedom
## Multiple R-squared:  0.04384,    Adjusted R-squared:  0.04294 
## F-statistic: 48.74 on 1 and 1063 DF,  p-value: 5.135e-12
# Summarize the results
summary(fit1, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.082    4.372    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.099    6.988    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.025   19.097    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.068   23.076    0.000    1.562    0.709
##    .SCTTotal          2.387    0.103   23.076    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.050    6.563    0.000    0.328    0.105
##     total             0.688    0.093    7.388    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit1, ci = TRUE, standardized = TRUE)

Importantly, we must note the following for our a-path:

X: WarCond (Conflict)

M: SCTTotal (Support for cultural tightness)

i. Estimate:

The coefficient for WarCond in the mediator model (SCTTotal ~ WarCond).

Our estimate is 0.69392. If we look at the lavaan output, we find the output to be 0.694.

ii. Interpretation:

Two people who differ by one unit on WarCond are estimated to differ by 0.69392 units on SCTTotal on average. In other words, two people who differ on the type of the War condition are estimated to differ on 0.69392 units of support for cultural tightness.

iii. Inferential Statement:

The effect of the type of War condition on cultural tightness was significant, \(b\) = 0.69392, \(B\) = 0.209, SE = 0.099, t(1063) = 6.982, p < 0.001, Adjusted - \(R^2\) = 0.04294. It can also be noted as well that the 95% confidence interval of [0.499, 0.889]. Therefore, we can reject the null hypothesis for the a-path that the war condition (WarCond) has no effect on support for cultural tightness (SCTTotal), and further conclude that the war condition significantly increases support for cultural tightness.


3b) The effect of M on Y controlling for X (b-path):

# Fit the outcome model (b-path)
outcome_model <- lm(Punitive ~ WarCond + SCTTotal, data = data)
summary(outcome_model)
## 
## Call:
## lm(formula = Punitive ~ WarCond + SCTTotal, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.2168 -0.8965 -0.0028  0.9006  4.6589 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.70828    0.14192   4.991 7.02e-07 ***
## WarCond      0.35915    0.08227   4.366 1.39e-05 ***
## SCTTotal     0.47339    0.02482  19.070  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.252 on 1062 degrees of freedom
## Multiple R-squared:  0.2914, Adjusted R-squared:  0.2901 
## F-statistic: 218.4 on 2 and 1062 DF,  p-value: < 2.2e-16
# Summarize the results
summary(fit1, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.082    4.372    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.099    6.988    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.025   19.097    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.068   23.076    0.000    1.562    0.709
##    .SCTTotal          2.387    0.103   23.076    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.050    6.563    0.000    0.328    0.105
##     total             0.688    0.093    7.388    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit1, ci = TRUE, standardized = TRUE)

Importantly, we must note the following for our b-path:

X: WarCond (Conflict)

M: SCTTotal (Support for cultural tightness)

Y: Punitive (Importance of a punitive God (higher = more important))

i. Estimate:

The point estimate for the b-path (effect of SCTTotal on Punitive controlling for WarCond) is 0.47339, or if we look at the lavaan output, we find the output to be 0.473.

ii. Interpretation:

Two people who differ by one unit on SCTTotal (support for cultural tightness) are estimated to differ by 0.473 units on Punitive (importance of a punitive god) on average, controlling for WarCond. In other words, for each one-unit increase in support for cultural tightness, there is an associated increase of 0.473 units in the reported importance of a punitive god, after accounting for the effect of the war condition.

iii. Inferential Statement:

Support for cultural tightness had a significant positive effect on the reported importance of a punitive god, controlling for the war condition, \(b\) = 0.473, \(B\) = 0.504, SE = 0.025, t(1062) = 19.070, p < 0.001, Adjusted - \(R^2\) = 0.2901. It can also be noted as well that the 95% confidence interval of [0.425, 0.522]. Therefore, we can reject the null hypothesis for the b-path that support for cultural tightness (SCTTotal) has no effect on the importance of a punitive god (Punitive) when controlling for the war condition (WarCond). We further conclude that increased support for cultural tightness significantly increases the reported importance of a punitive god, even after accounting for the war condition.


3c) The effect of X on Y (c-path):

# Fit the total effect model (c-path)
total_effect_model <- lm(Punitive ~ WarCond, data = data)
summary(total_effect_model)
## 
## Call:
## lm(formula = Punitive ~ WarCond, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9459 -1.1472  0.0541  0.9430  3.7417 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.25830    0.05506  59.179  < 2e-16 ***
## WarCond      0.68764    0.09316   7.381 3.16e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.449 on 1063 degrees of freedom
## Multiple R-squared:  0.04876,    Adjusted R-squared:  0.04786 
## F-statistic: 54.48 on 1 and 1063 DF,  p-value: 3.156e-13
# Summarize the results
summary(fit1, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.082    4.372    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.099    6.988    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.025   19.097    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.068   23.076    0.000    1.562    0.709
##    .SCTTotal          2.387    0.103   23.076    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.050    6.563    0.000    0.328    0.105
##     total             0.688    0.093    7.388    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit1, ci = TRUE, standardized = TRUE)

Importantly, we must note the following for our c-path:

X: WarCond (Conflict)

Y: Punitive (Importance of a punitive God (higher = more important))

i. Estimate:

The point estimate for the c-path (effect of WarCond on Punitive) is 0.688.

ii. Interpretation:

Two people who differ by one unit on the war condition (WarCond) are estimated to differ by 0.688 units in their reported importance of a punitive god (Punitive) on average. Specifically, participants who were in the war condition (WarCond = 1) were 0.688 units more likely to report a higher importance of a punitive god on average than those who were in the peace or poverty conditions.

iii. Inferential Statement:

The war condition had a significant positive effect on the reported importance of a punitive god, \(b\) = 0.688, \(B\) = 0.221, SE = 0.093, t(1063) = 7.39, p < 0.001, Adjusted - \(R^2\) = 0.04786. It can also be noted as well that the 95% confidence interval of [0.505, 0.870]. Therefore, we can reject the null hypothesis for the c-path that the war condition (WarCond) has no effect on the reported importance of a punitive god (Punitive), and further conclude that the war condition significantly increases the reported importance of a punitive god.


3d) The effect of X on Y controlling for M (c’-path)

# Fit the direct effect model
direct_effect_model <- lm(Punitive ~ WarCond + SCTTotal, data = data)

summary(direct_effect_model)
## 
## Call:
## lm(formula = Punitive ~ WarCond + SCTTotal, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.2168 -0.8965 -0.0028  0.9006  4.6589 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.70828    0.14192   4.991 7.02e-07 ***
## WarCond      0.35915    0.08227   4.366 1.39e-05 ***
## SCTTotal     0.47339    0.02482  19.070  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.252 on 1062 degrees of freedom
## Multiple R-squared:  0.2914, Adjusted R-squared:  0.2901 
## F-statistic: 218.4 on 2 and 1062 DF,  p-value: < 2.2e-16
# Summarize the results
summary(fit1, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.082    4.372    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.099    6.988    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.025   19.097    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.068   23.076    0.000    1.562    0.709
##    .SCTTotal          2.387    0.103   23.076    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.050    6.563    0.000    0.328    0.105
##     total             0.688    0.093    7.388    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit1, ci = TRUE, standardized = TRUE)

Importantly, we must note the following for our c’-path:

X: WarCond (Conflict)

M: SCTTotal (Support for cultural tightness)

Y: Punitive (Importance of a punitive God (higher = more important))

i. Estimate:

The point estimate for the c’-path (direct effect of WarCond on Punitive controlling for SCTTotal) is 0.359.

ii. Interpretation:

The rest of the difference, 0.359 units, is due to the direct effect of the war condition (WarCond) on the reported importance of a punitive god (Punitive), independent of support for cultural tightness (SCTTotal). Among those equal in support for cultural tightness, those who were in the war condition were 0.359 units more likely to report a higher importance of a punitive god on average than those in the peace or poverty conditions.

iii. Inferential Statement:

The war condition had a significant positive effect on the reported importance of a punitive god, controlling for support for cultural tightness, \(b\) = 0.359, \(B\) = 0.115, SE = 0.082, t(1062) = 4.372, p < 0.001, Adjusted - \(R^2\) = 0.2901. It can also be noted as well that the 95% confidence interval of [0.198, 0.520]. Therefore, we can reject the null hypothesis for the c’-path that the war condition (WarCond) has no direct effect on the reported importance of a punitive god (Punitive) when controlling for support for cultural tightness. We further conclude that the war condition significantly increases the reported importance of a punitive god, even when accounting for support for cultural tightness.


Question 4


Mediation involves 4 effect estimates (ad in the question above). In this study, which of these effects can we be certain are causal.


For our a-path, this is the effect of the independent variable (WarCond) on the mediator (SCTTotal). After reading the study by Caluori and colleagues, this path can be considered causal because WarCond was randomly assigned. The random assignment ensures that any observed effect on SCTTotal is due to WarCond and not confounded by other variables. In other words, since WarCond is experimentally manipulated, we can infer causality in this path. We can say that changes in WarCond cause changes in SCTTotal.

For our b-path, this is the effect of the mediator (SCTTotal) on the dependent variable (Punitive), controlling for the independent variable (WarCond). This path can be considered causal if (and a big if) there are no unmeasured confounders that affect both the mediator (SCTTotal) and the dependent variable (Punitive). Given that SCTTotal is influenced by the experimentally manipulated WarCond, and assuming that SCTTotal is measured after WarCond, we can only then infer that the mediator has a causal effect on Punitive, conditioned on no unmeasured confounders. However, the acknowledgment of partial mediation suggests that other unmeasured factors may exist. However, given our slides given to the class , we are unable to establish causal order of this path because SCTTotal was not experimentally manipuated, and this path could be influences by unmeasured confounders.

For our c-path, this is the total effect of WarCond (X) on Punitive (Y). This path is causal due to random assignment of WarCond. The total effect captures both the direct and indirect effects of WarCond on Punitive. We can assume causality for this path.

For our c’-path, this is the direct effect of WarCond (X) on Punitive (Y), controlling for SCTTotal (M). This path is causal due to random assignment of WarCond and the inclusion of SCTTotal as a mediator in the model. However, the study suggests that other factors might also play a role, indicating partial mediation. We can, however, given our class slides, that as with the c-path, we can infer causality due to random assignment of WarCond.


Question 5


The indirect effect:


a) Provide an estimate, interpretation, and inferential statement (based on 5,000 bootstrap samples) for the indirect effect.

library(lavaan)

# Specify the mediation model
model2 <- '
  # Direct effect
  Punitive ~ c*WarCond
  # Mediator
  SCTTotal ~ a*WarCond
  Punitive ~ b*SCTTotal
  # Indirect effect (a*b)
  ab := a*b
  # Total effect
  total := c + (a*b)
'

# Fit the model
fit2 <- sem(model2, data = data, se = "bootstrap", bootstrap = 5000)
## Warning in lav_model_nvcov_bootstrap(lavmodel = lavmodel, lavsamplestats =
## lavsamplestats, : lavaan WARNING: 32 bootstrap runs failed or did not converge.
# Summarize the results
summary(fit2, standardized = TRUE, fit.measures = TRUE)
## lavaan 0.6.17 ended normally after 1 iteration
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         5
## 
##   Number of observations                          1065
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Model Test Baseline Model:
## 
##   Test statistic                               414.619
##   Degrees of freedom                                 3
##   P-value                                        0.000
## 
## User Model versus Baseline Model:
## 
##   Comparative Fit Index (CFI)                    1.000
##   Tucker-Lewis Index (TLI)                       1.000
## 
## Loglikelihood and Information Criteria:
## 
##   Loglikelihood user model (H0)              -3723.069
##   Loglikelihood unrestricted model (H1)      -3723.069
##                                                       
##   Akaike (AIC)                                7456.138
##   Bayesian (BIC)                              7480.991
##   Sample-size adjusted Bayesian (SABIC)       7465.110
## 
## Root Mean Square Error of Approximation:
## 
##   RMSEA                                          0.000
##   90 Percent confidence interval - lower         0.000
##   90 Percent confidence interval - upper         0.000
##   P-value H_0: RMSEA <= 0.050                       NA
##   P-value H_0: RMSEA >= 0.080                       NA
## 
## Standardized Root Mean Square Residual:
## 
##   SRMR                                           0.000
## 
## Parameter Estimates:
## 
##   Standard errors                            Bootstrap
##   Number of requested bootstrap draws             5000
##   Number of successful bootstrap draws            4968
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   Punitive ~                                                            
##     WarCond    (c)    0.359    0.087    4.109    0.000    0.359    0.115
##   SCTTotal ~                                                            
##     WarCond    (a)    0.694    0.098    7.050    0.000    0.694    0.209
##   Punitive ~                                                            
##     SCTTotal   (b)    0.473    0.028   16.910    0.000    0.473    0.504
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .Punitive          1.562    0.069   22.590    0.000    1.562    0.709
##    .SCTTotal          2.387    0.095   25.238    0.000    2.387    0.956
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     ab                0.328    0.051    6.446    0.000    0.328    0.105
##     total             0.688    0.095    7.237    0.000    0.688    0.221
# Extract parameter estimates
parameterEstimates(fit2, ci = TRUE, standardized = TRUE)

Estimate: The point estimate for the indirect effect (a*b) is 0.328.

Interpretation: Two people who differ by one unit on the war condition (WarCond) are estimated to differ by 0.328 units in their reported importance of a punitive god (Punitive) on average as a result of differences in support for cultural tightness (SCTTotal), which in turn affects the reported importance of a punitive god.

Inferential Statement: Based on 5,000 bootstrap samples, the indirect effect of the war condition on the importance of a punitive god through support for cultural tightness was significant, \(ab\) = 0.328, SE = 0.050, t(1064) = 6.279, p < 0.001. It can also be noted as well that the 95% confidence interval of [0.229, 0.435]. I can use the t-test instead because if the sample size is large, one can report the z-value as a t-value (I remember this from 250A).


b) Show that in this data the total effect is equal to the direct effect plus the indirect effect in this sample.


Consider the following note:


Question 6


Based on the results of this analysis is there strong evidence that support for cultural tightness causes feelings of the importance of a punitive God?


Based on the results of this mediation analysis, there is strong evidence that support for cultural tightness causes feelings of the importance of a punitive God. The direct effect of support for cultural tightness on the importance of a punitive God is significant and substantial. The large t-value and the small p-value (< .001) provide strong evidence that support for cultural tightness significantly affects feelings of the importance of a punitive God. The significant indirect effect indicates that the war condition influences feelings of the importance of a punitive God through its impact on support for cultural tightness. This mediation pathway further supports the role of cultural tightness in shaping these religious feelings. The total effect being significant confirms that the war condition overall influences feelings of the importance of a punitive God, with cultural tightness being an important mediator in this relationship.


Question 7


Use a piecewise linear model to examine the relationship between SCTTotal and Punitive, with a joint/breakpoint at 4. Based on the analysis, what is the relationship between of SCTTotal and Punitive before and after the breakpoint? Based on this model, can we conclude that it fits better than a SLR model where SCTTotal predicts Punitive? Create two plots: a plot with the predicted Y on the Y-axis and the predictor on the X-axis, and a plot with the residuals from the model on the Y-axis and the predictor on the X-axis.


a) Use a piecewise linear model to examine the relationship between SCTTotal and Punitive, with a joint/breakpoint at 4. Based on the analysis, what is the relationship between of SCTTotal and Punitive before and after the breakpoint? (report all inferential information required by APA i.e., t(df) = t-value, p = p-value).


# Fit the simple linear regression model
slr_model <- lm(Punitive ~ SCTTotal, data = data)

summary(slr_model)
## 
## Call:
## lm(formula = Punitive ~ SCTTotal, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0596 -0.8951  0.0016  0.8556  4.6256 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.70599    0.14312   4.933 9.39e-07 ***
## SCTTotal     0.49608    0.02448  20.266  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.262 on 1063 degrees of freedom
## Multiple R-squared:  0.2787, Adjusted R-squared:  0.278 
## F-statistic: 410.7 on 1 and 1063 DF,  p-value: < 2.2e-16
# Create variable for the piecewise linear model
data$SCTTotal_segment <- ifelse(data$SCTTotal > 4, data$SCTTotal - 4, 0)

# Fit  piecewise linear model
piecewise_model <- lm(Punitive ~ SCTTotal + SCTTotal_segment, data = data)

summary(piecewise_model)
## 
## Call:
## lm(formula = Punitive ~ SCTTotal + SCTTotal_segment, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1756 -0.9319  0.0047  0.8894  4.0967 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.6491     0.3956   4.168 3.32e-05 ***
## SCTTotal           0.2324     0.1060   2.193   0.0285 *  
## SCTTotal_segment   0.3091     0.1209   2.556   0.0107 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.259 on 1062 degrees of freedom
## Multiple R-squared:  0.2831, Adjusted R-squared:  0.2818 
## F-statistic: 209.7 on 2 and 1062 DF,  p-value: < 2.2e-16

Inferential Statement/Estimate (Before the Breakpoint):

The relationship between support for cultural tightness and the reported importance of a punitive god before the breakpoint (SCTTotal less than 4) is significant, \(b\) =0.2324, SE = 0.1076, t(1062)=2.193, p= 0.0285. This indicates that two people who differ by one unit on support for cultural tightness (SCTTotal) are estimated to differ by 0.2324 units in their reported importance of a punitive god (Punitive) on average, when SCTTotal is 4 or less. For example, participants who scored 4 units higher on SCTTotal reported a 0.2324 unit higher importance of a punitive god on average than those who scored lower.

Inferential Statement/Estimate (After the Breakpoint):

The relationship between support for cultural tightness and the reported importance of a punitive god after the breakpoint (SCTTotal > 4) is significant, \(b\) =0.3091, SE = 0.1209, t(1062)=2.556, p= 0.0107. This indicates that two people who differ by one unit on support for cultural tightness (SCTTotal) are estimated to differ by 0.3091 units in their reported importance of a punitive god (Punitive) on average, when SCTTotal is greater than 4. For example, participants who scored 4 units higher on SCTTotal reported a 0.3091 unit higher importance of a punitive god on average than those who scored lower.

These results indicate that the relationship between support for cultural tightness and the reported importance of a punitive god is significant both before and after the breakpoint at SCTTotal = 4, with a stronger effect observed after the breakpoint.


b. Based on this model, can we conclude that it fits better than a SLR model where SCTTotal predicts Punitive? (Include an inferential test with associated statistics to support your claim.)


# Fit the simple linear regression model
slr_model <- lm(Punitive ~ SCTTotal, data = data)

summary(slr_model)
## 
## Call:
## lm(formula = Punitive ~ SCTTotal, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.0596 -0.8951  0.0016  0.8556  4.6256 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.70599    0.14312   4.933 9.39e-07 ***
## SCTTotal     0.49608    0.02448  20.266  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.262 on 1063 degrees of freedom
## Multiple R-squared:  0.2787, Adjusted R-squared:  0.278 
## F-statistic: 410.7 on 1 and 1063 DF,  p-value: < 2.2e-16
# Create variable for the piecewise linear model
data$SCTTotal_segment <- ifelse(data$SCTTotal > 4, data$SCTTotal - 4, 0)

# Fit  piecewise linear model
piecewise_model <- lm(Punitive ~ SCTTotal + SCTTotal_segment, data = data)

summary(piecewise_model)
## 
## Call:
## lm(formula = Punitive ~ SCTTotal + SCTTotal_segment, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1756 -0.9319  0.0047  0.8894  4.0967 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.6491     0.3956   4.168 3.32e-05 ***
## SCTTotal           0.2324     0.1060   2.193   0.0285 *  
## SCTTotal_segment   0.3091     0.1209   2.556   0.0107 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.259 on 1062 degrees of freedom
## Multiple R-squared:  0.2831, Adjusted R-squared:  0.2818 
## F-statistic: 209.7 on 2 and 1062 DF,  p-value: < 2.2e-16
# Compare the two models using ANOVA
anova_comparison <- anova(slr_model, piecewise_model)

# Print the comparison
anova_comparison

Based on these ANOVA results, we find that the piecewise linear model does fits the data significantly better than the simple linear regression model F(1, 1062) = 6.533, p = 0.01073. This indicates that incorporating a breakpoint at SCTTotal = 4 provides a significantly better fit for predicting the importance of a punitive god based on support for cultural tightness.


c) Create a scatter plot with the predicted outcome and predictor


# Add predicted values and residuals to the data frame
data$Predicted <- predict(piecewise_model)
data$Residuals <- resid(piecewise_model)

# Add predicted values from SLR model
data$SLR_Predicted <- predict(slr_model)
library(ggplot2)

# Predicted values vs. SCTTotal
predicted_plot <- ggplot(data, aes(x = SCTTotal, y = Predicted)) +
  geom_point() +
  geom_line(color = "blue") +
  labs(title = "Predicted Values vs. SCTTotal",
       x = "SCTTotal",
       y = "Predicted Punitive") +
  theme_minimal()+
  theme(
    axis.text = element_text(size = 10, face = "plain"),
    axis.title = element_text(size = 11, face = "plain"),
    plot.title = element_text(size = 13, face = "plain", hjust = 0.5)
  ) 

print(predicted_plot)

library (ggplot2)

# Plot Both Models for Comparison
comparison_plot <- ggplot(data, aes(x = SCTTotal)) +
  geom_point(aes(y = Punitive), alpha = 0.5) +
  geom_line(aes(y = Predicted), color = "blue", linewidth = 1) +
  geom_line(aes(y = SLR_Predicted), color = "red", linewidth = 1, linetype = "dashed") +
  labs(title = "Comparison of Piecewise Linear Model and SLR",
       x = "SCTTotal",
       y = "Punitive",
       color = "Model") +
  theme_minimal()+
  theme(
    axis.text = element_text(size = 10, face = "plain"),
    axis.title = element_text(size = 11, face = "plain"),
    plot.title = element_text(size = 13, face = "plain", hjust = 0.5)
  ) 

print(comparison_plot)

In our case here, the blue bine represents the predicted values from the piecewise linear model. The red dashed line represents the predicted values from the simple linear regression (SLR) model. The relationship looks different when comparing the piecewise linear model to the simple linear regression model.

Before the breakpoint (SCTTotal less than or equal 4), the blue line shows a less steep increase in Punitive with increasing SCTTotal. This indicates a slower rate of change in the importance of a punitive god for lower levels of cultural tightness. After the breakpoint (SCTTotal greater than 4), the blue line shows a steeper increase in Punitive with increasing SCTTotal. This indicates a faster rate of change in the importance of a punitive god for higher levels of cultural tightness. The red dashed line shows a single, consistent slope across the entire range of SCTTotal values. This model suggests a constant rate of change in the importance of a punitive god with increasing cultural tightness, without accounting for different rates of change at different levels of SCTTotal.

One can see here that the piecewise linear model accounts for a change in the slope at the breakpoint (SCTTotal = 4). This is visually evident as the blue line becomes steeper after SCTTotal exceeds 4, indicating a stronger relationship between SCTTotal and Punitive at higher levels of SCTTotal. Furthermore, the blue line from the piecewise model appears to fit the data better, especially around the breakpoint, where the relationship between SCTTotal and Punitive changes. The red dashed line from the SLR model does not capture this change and provides a less accurate fit. This is demonstrated by the ANOVA results, which indicate that the piecewise linear model fits the data significantly better than the simple linear regression model.


d) Create a scatter plot with the residuals and the predictor.


library(ggplot2)

# Residuals vs. SCTTotal
residuals_plot <- ggplot(data, aes(x = SCTTotal, y = Residuals)) +
  geom_point() +
  geom_hline(yintercept = 0, color = "red", linetype = "dashed") +
  labs(title = "Residuals vs. SCTTotal for Piecewise Regression",
       x = "SCTTotal",
       y = "Residuals") +
  theme_minimal()+
  theme(
    axis.text = element_text(size = 10, face = "plain"),
    axis.title = element_text(size = 11, face = "plain"),
    plot.title = element_text(size = 13, face = "plain", hjust = 0.5)
  ) 

print(residuals_plot)

The residuals should be randomly scattered around the horizontal line at zero. This indicates that the model’s predictions are unbiased across the range of predictor values (SCTTotal). Howevever, There seems to be a systematic curve under 5.0 on SCTTotal. This variation almost shows a bit of a coning out, which is concerning because such patterns suggest that the relationship between the predictor and the outcome is not adequately captured by the linear model. In this regard, I think that the residuals appear to be randomly distributed around zero, but do show obvious systematic patterns, which rejects the linearity assumption for this model.


Question 8


Use a second degree polynomial to examine the relationship between SCTTotal and Punitive. Create a new variable which is SCTTotal2, and include it as a predictor in the model. Based on the analysis, what is the effect of SCTTotal on Punitive and is it statistically significant? Create two plots: a plot with the predicted Y on the Y-axis and the predictor on the X-axis, and a plot with the residuals from the SLR model on the Y-axis and the predictor on the X-axis.


a) Estimate a linear model predicting Punitive from SCTTotal and SCT-squared. Report the effect of SCTTotal on Punitive using a test on the set of the two variables predicting the outcome. Report a significance test and an estimate which represents “how much” SCTTotal predicts Punitive (report all inferential information required by APA i.e., t(df) = t-value, p = p-value).


# Create the squared term
data$SCTTotal2 <- data$SCTTotal^2

# Fit the polynomial regression model
poly_model <- lm(Punitive ~ SCTTotal + SCTTotal2, data = data)

summary(poly_model)
## 
## Call:
## lm(formula = Punitive ~ SCTTotal + SCTTotal2, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.1739 -0.8951  0.0002  0.8826  4.4528 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  1.06930    0.37174   2.877  0.00410 **
## SCTTotal     0.35505    0.13541   2.622  0.00886 **
## SCTTotal2    0.01260    0.01189   1.059  0.28986   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.262 on 1062 degrees of freedom
## Multiple R-squared:  0.2795, Adjusted R-squared:  0.2781 
## F-statistic: 205.9 on 2 and 1062 DF,  p-value: < 2.2e-16

For our linear term, we find that the effect of SCTTotal on Punitive is significant, \(b\) =0.35585, SE = 0.13541, t(1062)=2.627, p= 0.00860. For our quadratic term, we find that the effect of SCTTotal^2 on Punitive is not significant, \(b\) =0.01260, SE = 0.01189, t(1062)=1.059, p= 0.28950. Lastly, the polynomial regression model, including both SCTTotal and SCTTotal^2, significantly predicts Punitive, F(2,1062) = 205.9, p < 0.001.

The polynomial regression model suggests that while there is a significant linear relationship between SCTTotal and Punitive, the quadratic term (SCTTotal^2) does not significantly contribute to the prediction of Punitive. Thus, the relationship between SCTTotal and Punitive is primarily linear based on this analysis, even though the model fit indicates some improvement over a purely linear model.


b) Create a scatter plot with the predicted outcome and predictor.


# Add predicted and residual values from the polynomial model to the data
data$Predicted_Poly <- predict(poly_model)
data$Residuals_Poly <- resid(poly_model)
library(ggplot2)

# Create the scatter plot with the predicted outcome and predictor
poly_scatter_plot <- ggplot(data, aes(x = SCTTotal, y = Punitive)) +
  geom_point(alpha = 0.5) +
  geom_line(aes(y = Predicted_Poly), color = "blue", linewidth = 1) + 
  labs(title = "Polynomial Regression: Predicted Punitive vs. SCTTotal",
       x = "SCTTotal",
       y = "Predicted Punitive") +
  theme_minimal()+
  theme(
    axis.text = element_text(size = 10, face = "plain"),
    axis.title = element_text(size = 11, face = "plain"),
    plot.title = element_text(size = 13, face = "plain", hjust = 0.5)
  ) 

print(poly_scatter_plot)

library(ggplot2)

# Create comparison plot for predicted outcome and predictor from both models
comparison_plot_poly_slr <- ggplot(data, aes(x = SCTTotal, y = Punitive)) +
  geom_point(alpha = 0.5) +  # Actual data points
  geom_line(aes(y = SLR_Predicted), color = "red", size = 1, linetype = "dashed") +  # Predicted values from SLR
  geom_line(aes(y = Predicted_Poly), color = "blue", size = 1) +  # Predicted values from polynomial model
  labs(title = "Comparison of Polynomial Regression and SLR",
       x = "SCTTotal",
       y = "Predicted Punitive") +
  theme_minimal()+
  theme(
    axis.text = element_text(size = 10, face = "plain"),
    axis.title = element_text(size = 11, face = "plain"),
    plot.title = element_text(size = 13, face = "plain", hjust = 0.5)
  ) 
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
print(comparison_plot_poly_slr)

Yes, the relationship looks different when comparing the polynomial regression model to the simple linear regression model. The blue line represents the polynomial regression model, which includes both the linear term (SCTTotal) and the quadratic term (SCTTotal^2). This model captures well the curvature in the relationship between SCTTotal and Punitive, especially at the lower and higher ends of SCTTotal. The red dashed line represents the simple linear regression model, which assumes a linear relationship between SCTTotal and Punitive. This model does not show an account for any curvature and predicts a constant rate of change across the entire range of SCTTotal.

Therefore, the polynomial regression model captures a slight curvature in the relationship. This is more pronounced at the extremes of SCTTotal (both low and high values), where the blue line deviates from the red dashed line. Moreover, the polynomial regression model provides a better fit to the data points, especially where there is more spread in the values of Punitive at higher levels of SCTTotal. Although the quadratic term was not statistically significant, the polynomial model still provides a different and potentially better fit than the linear model, indicating that the relationship may not be purely linear.


c) Create a scatter plot with the residuals and the predictor.


library(ggplot2)

# Residual Poly Plot
residuals_plot_poly <- ggplot(data, aes(x = SCTTotal, y = Residuals_Poly)) +
  geom_point() + 
  geom_hline(yintercept = 0, color = "red", linetype = "dashed") +
  labs(title = "Residuals vs. SCTTotal for Polynomial Regression",
       x = "SCTTotal",
       y = "Residuals") +
  theme_minimal()+
  theme(
    axis.text = element_text(size = 10, face = "plain"),
    axis.title = element_text(size = 11, face = "plain"),
    plot.title = element_text(size = 13, face = "plain", hjust = 0.5)
  ) 


print(residuals_plot_poly)

The residuals should be randomly scattered around the horizontal line at zero. This indicates that the model’s predictions are unbiased across the range of predictor values (SCTTotal). Howevever, There seems to be a systematic curve under 5.0 on SCTTotal. This variation almost shows a bit of a coning out, which is concerning because such patterns suggest that the relationship between the predictor and the outcome is not adequately captured by the linear model. In this regard, I think that the residuals appear to be randomly distributed around zero, but do show obvious systematic patterns, which rejects the linearity assumption for this model.