Interaction Effect Review

Table 1. Predicting Social Support from GDP, Happiness, and their Interaction

                Model 1           Model 2           Model 3           Model 4
(Intercept)     0.00     (0.06)   0.00     (0.05)   0.00     (0.05)   0.08     (0.06)
GDP             0.73 *** (0.06)                     0.25 **  (0.07)   0.25 **  (0.07)
happiness                         0.81 *** (0.05)   0.62 *** (0.07)   0.60 *** (0.07)
happiness:GDP                                                        -0.10 *   (0.05)
N               140               140               140               140
R²              0.53              0.66              0.69              0.70
All continuous variables are mean-centered and scaled by 1 standard deviation. *** p < 0.001; ** p < 0.01; * p < 0.05.
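
One way to read the Model 4 interaction is to compute the simple slope of happiness at low, average, and high GDP. This is a minimal sketch that uses the rounded coefficients from Table 1; the names `b_happiness`, `b_interaction`, and `simple_slope` are illustrative and not part of the original script.

```r
# Rounded Model 4 coefficients copied from Table 1 (standardized units).
b_happiness   <- 0.60   # slope of happiness when GDP is at its mean (z = 0)
b_interaction <- -0.10  # change in the happiness slope per 1 SD of GDP

# Simple slope of happiness at a given GDP z-score (hypothetical helper).
simple_slope <- function(gdp_z) b_happiness + b_interaction * gdp_z

# Evaluate at -1 SD, the mean, and +1 SD of GDP.
slopes <- sapply(c(low = -1, mean = 0, high = 1), simple_slope)
print(round(slopes, 2))
```

With these values the happiness slope is about 0.70 at -1 SD of GDP and about 0.50 at +1 SD, so the relation between happiness and social support is weaker in higher-GDP countries.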

Not required, but in case you wanted to see :)

h <- read.csv("~/Dropbox/!GRADSTATS/Datasets/World Happiness Report - 2024/World-happiness-report-2024.csv", stringsAsFactors = TRUE)
library(ggplot2)
library(jtools)

## Some data cleaning.
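## NOTE: scale() returns z-scores, but the cutoff here is the raw SD of log GDP,
## so this is a two-group split at that cutoff, not a true +1 SD vs. -1 SD contrast.
h$GDPcat <- ifelse(scale(h$Log.GDP.per.capita) > sd(h$Log.GDP.per.capita, na.rm = TRUE), "High GDP (+1 SD)", "Low GDP (-1 SD)")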
h$GDPcat <- as.factor(h$GDPcat)
plot(h$GDPcat)
h$happiness <- h$Ladder.score
h$GDP <- h$Log.GDP.per.capita

## Drop rows with a missing GDP category, then plot one regression line per group.
ggplot(data = subset(h, !is.na(GDPcat)), aes(x = scale(happiness), y = scale(Social.support), color = GDPcat)) + 
  geom_point(alpha = .5, position = "jitter") +
  geom_smooth(method = "lm") + labs(title = "Check-In Graph") + ylab("Social Support") + xlab("Happiness") +
  theme_apa()

mod1 <- lm(Social.support ~ GDP, data = h)
mod2 <- lm(Social.support ~ happiness, data = h)
mod3 <- lm(Social.support ~ GDP + happiness, data = h)
mod4 <- lm(Social.support ~ happiness * GDP, data = h)

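## export_summs() needs the huxtable package installed; scale = TRUE standardizes the predictors
## and transform.response = TRUE standardizes the outcome.
export_summs(mod1, mod2, mod3, mod4, error_pos = "right", digits = 2, scale = TRUE, transform.response = TRUE)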

Review : Evaluating Slopes & Models

Res.Df   RSS    Df   Sum of Sq   F      Pr(>F)
138      7.28
136      4.69   2    2.59        37.5   1.06e-13

Res.Df   RSS    Df   Sum of Sq   F      Pr(>F)
137      4.83
136      4.69   1    0.138       4      0.0475
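
The two comparison tables above look like the output of `anova()` on nested `lm` fits. The sketch below shows that kind of comparison on simulated data, since the original dataset is not included here; the data frame `sim`, the models `m3` and `m4`, and the generating coefficients are all made up for illustration. It also checks a useful fact: when the larger model adds a single term, the comparison F equals that term's squared t value.

```r
set.seed(1)
n <- 140

# Simulated stand-in data (NOT the World Happiness Report values).
sim <- data.frame(GDP = rnorm(n), happiness = rnorm(n))
sim$Social.support <- 0.25 * sim$GDP + 0.60 * sim$happiness -
  0.10 * sim$GDP * sim$happiness + rnorm(n, sd = 0.5)

# Nested models: main effects only vs. main effects plus interaction.
m3 <- lm(Social.support ~ GDP + happiness, data = sim)
m4 <- lm(Social.support ~ happiness * GDP, data = sim)

# F-test for the change in residual sum of squares.
cmp <- anova(m3, m4)
print(cmp)

# With one added term, F for the comparison equals t^2 for that term.
t_int <- coef(summary(m4))["happiness:GDP", "t value"]
F_int <- cmp$F[2]
print(c(t_squared = t_int^2, F = F_int))
```

The same pattern shows up in the notes: the second table's F of 4 with p = 0.0475 matches the interaction's t of about -2.00 in the Model 4 summary.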
z.happiness 
  0.1890785 
[1] 0.02523017
[1] 0.2032167
[1] 0.07497384
[1] 0.2781906 0.1282429

[1] 0.03010003
[1] 0.01872102
[1] 0.04882105 0.01137902

RECAP : evaluating power

calculating in R (by hand)


Call:
lm(formula = z.Social.support ~ z.happiness * z.GDP, data = h)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.06566 -0.13843  0.00694  0.14130  1.02959 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)        0.03825    0.03062   1.249  0.21367    
z.happiness        0.59678    0.07394   8.071 3.27e-13 ***
z.GDP              0.24570    0.07390   3.325  0.00114 ** 
z.happiness:z.GDP -0.20313    0.10155  -2.000  0.04746 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2787 on 136 degrees of freedom
  (3 observations deleted due to missingness)
Multiple R-squared:  0.696, Adjusted R-squared:  0.6893 
F-statistic: 103.8 on 3 and 136 DF,  p-value: < 2.2e-16
                     Estimate Std. Error   t value     Pr(>|t|)
(Intercept)        0.03825221 0.03061657  1.249396 2.136663e-01
z.happiness        0.59678009 0.07393839  8.071316 3.271081e-13
z.GDP              0.24570006 0.07389917  3.324801 1.137471e-03
z.happiness:z.GDP -0.20313089 0.10154906 -2.000323 4.745671e-02
  • As the degrees of freedom grow, the t-distribution approaches the normal distribution (whose two-sided 95% cutoff is ±1.96).
  • With our degrees of freedom we are not quite there, so it is good to look up the actual critical t-value.
[1] -1.977054
[1] 0.5092651
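
A minimal sketch of one way to do this calculation by hand in base R. It assumes the observed t value of the interaction is used as the noncentrality parameter and that the residual degrees of freedom (136) are used for the cutoff; the notes may have used a slightly different df or method, so the printed values above need not match to every decimal. The names `t_obs`, `df_res`, and `t_crit` are illustrative.

```r
t_obs  <- -2.000323  # t value for z.happiness:z.GDP from the summary above
df_res <- 136        # residual degrees of freedom from the summary above
alpha  <- 0.05

# Lower-tail critical value for a two-sided test at alpha = .05.
t_crit <- qt(alpha / 2, df_res)

# Power = probability of landing beyond either cutoff when the true
# effect equals the observed one (noncentral t with ncp = t_obs).
power <- pt(t_crit, df_res, ncp = t_obs) +
  pt(-t_crit, df_res, ncp = t_obs, lower.tail = FALSE)

print(c(t_crit = t_crit, power = power))
```

Because the observed t sits almost exactly on the cutoff, the estimated power comes out near 50%.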

Using Power to Estimate Sample Size.

  • Power is a function of : effect size, sample size, and the alpha level (alpha = the Type I error rate that the researcher sets).
  • You can use these terms to estimate the sample size needed for a given power (the convention is often 80%).
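
To see how the three ingredients move power, here is a rough sketch for a correlation, using the Fisher z (atanh) normal approximation. The helper `approx_power_r` is hypothetical and is only an approximation, not the exact routine any package uses.

```r
# Approximate two-sided power for detecting a correlation r with n cases.
# Hypothetical helper based on the Fisher z (atanh) transformation.
approx_power_r <- function(r, n, alpha = 0.05) {
  z_crit <- qnorm(1 - alpha / 2)      # normal cutoff for the chosen alpha
  shift  <- atanh(r) * sqrt(n - 3)    # expected test statistic under the effect
  pnorm(shift - z_crit) + pnorm(-shift - z_crit)
}

# More cases, more power.
print(approx_power_r(r = .23, n = c(50, 100, 150, 200)))
# Bigger effect, more power.
print(approx_power_r(r = c(.10, .23, .40), n = 100))
# Stricter alpha, less power.
print(approx_power_r(r = .23, n = 100, alpha = c(.01, .05)))
```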

Example : Estimating Sample Size

What sample size is needed to detect a slope of r = .23 with 80% power?

library(pwr)
pwr::pwr.r.test(r = .23, power = .80, alternative = "two.sided")

     approximate correlation power calculation (arctangh transformation) 

              n = 145.2367
              r = 0.23
      sig.level = 0.05
          power = 0.8
    alternative = two.sided
p.ex <- pwr.r.test(r = .23, power = .80, alternative = "two.sided")
plot(p.ex)
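
For intuition, the same question can be approximated by hand with the textbook Fisher z formula. This is a sketch in base R, not a reimplementation of `pwr.r.test`, which refines the calculation, so the answer can differ from the output above by about one case. The variable names are illustrative.

```r
r     <- 0.23   # effect size (correlation)
alpha <- 0.05   # two-sided Type I error rate
power <- 0.80   # desired power

z_alpha <- qnorm(1 - alpha / 2)  # cutoff for the two-sided test
z_power <- qnorm(power)          # quantile matching the desired power

# Fisher z approximation: n = ((z_alpha + z_power) / atanh(r))^2 + 3
n_approx <- ((z_alpha + z_power) / atanh(r))^2 + 3
print(ceiling(n_approx))         # round up to a whole number of cases
```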

BREAK TIME

ACTIVITY : Experiments as Linear Model

The Definition of Causality

  1. The cause and effect are contiguous in space and time.

  2. The cause must be prior to the effect. (no reverse causation)

  3. There must be a constant union betwixt the cause and effect. (“Tis chiefly this quality, that constitutes the relation.”) (no random chance)

  4. The same cause always produces the same effect, and the same effect never arises but from the same cause. (not “just” some third variable)

Manipulation : Watch out for Misleading Control Variables

RECAP : the manipulation (A/B Testing) :

  • researchers create multiple groups (conditions) and change ONE THING (the IV) about a person’s experience in each group & observe the result (the DV).

    • treatment / experimental condition : the IV is present (the change happens)

    • control / comparison condition : the IV is absent (the default experience / no change)

  • KEY IDEA : the comparison group matters!

    • a 3 hour stats class DECREASES boredom compared to…

    • a 3 hour stats class INCREASES boredom compared to…

Making Comparisons

The Manipulation

Anchoring : Question –> Theory –> Data

  • Question : Will the number that people see BEFORE making their own rating influence their decision?

  • Theory :

    • OPTION A : People who see a HIGHER number before making their own rating will give a HIGHER rating than people who see a LOWER number.
    • OPTION B : People who see a LOWER number before making their own rating will give a HIGHER rating than people who see a HIGHER number.
    • OPTION C : There will be NO DIFFERENCES between the groups.
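
The three options map onto the sign of one slope when the experiment is written as a linear model. Below is a minimal sketch on simulated data; the group means, SD, sample size, and the names `anchor`, `condition`, and `rating` are all made up for illustration and are not the class data.

```r
set.seed(42)
n_per_group <- 50

# Simulated ratings: invented means and SD, for illustration only.
anchor <- data.frame(
  condition = factor(rep(c("low", "high"), each = n_per_group),
                     levels = c("low", "high")),  # "low" is the comparison group
  rating = c(rnorm(n_per_group, mean = 40, sd = 15),
             rnorm(n_per_group, mean = 55, sd = 15))
)

# The experiment as a linear model: condition is dummy-coded (low = 0, high = 1).
fit <- lm(rating ~ condition, data = anchor)
print(coef(summary(fit)))

# The intercept is the low-anchor mean; the slope is the difference in means.
group_means <- tapply(anchor$rating, anchor$condition, mean)
print(group_means)
mean_diff <- unname(group_means["high"] - group_means["low"])
```

A reliably positive `conditionhigh` slope would fit Option A, a reliably negative one Option B, and a slope indistinguishable from zero Option C. Changing which level is listed first changes the comparison group and flips the sign of the slope.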

NEXT TIME.

  • Lab 6. Teaching about regression
  • Article : 10 Common Statistical Mistakes