Preliminaries ————————————————————

1) Use the “install.packages” function to install the “rockchalk” and “DAAG” packages.

2) Use the “library” function to load the “rockchalk” and “DAAG” packages.

3) Use the “readRDS” function to load the “msq2.rds” dataset into your workspace.

4) Use the “data” function to load the “cps3” and “leafshape” datasets into your workspace.


Continuous Variable Moderation ——————————————————————

Use the “msq2” data to complete the following:

1a) Estimate a model that tests if the effect of Energetic Arousal (EA) on Tense Arousal (TA) varies as a function of Negative Affect (NegAff), after controlling for Positive Affect (PA).


Call:
lm(formula = TA ~ PA + NegAff * EA, data = msq2)

Residuals:
     Min       1Q   Median       3Q      Max 
-11.0925  -2.4258  -0.0154   2.6029  12.1184 

Coefficients:
             Estimate Std. Error t value             Pr(>|t|)    
(Intercept)  7.948168   0.424510  18.723 < 0.0000000000000002 ***
PA          -0.200275   0.042801  -4.679       0.000003721018 ***
NegAff       0.460938   0.078850   5.846       0.000000009159 ***
EA           0.297818   0.045878   6.491       0.000000000207 ***
NegAff:EA    0.019562   0.006173   3.169              0.00162 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.865 on 495 degrees of freedom
Multiple R-squared:  0.4297,    Adjusted R-squared:  0.425 
F-statistic: 93.22 on 4 and 495 DF,  p-value: < 0.00000000000000022

1b) What is the value of the parameter estimate that quantifies the effect of Negative Affect on the Energetic Arousal -> Tense Arousal effect, after controlling for Positive Affect?

0.0196 is the value of the parameter estimate that quantifies the effect of Negative Affect on the Energetic Arousal -> Tense Arousal effect, after controlling for Positive Affect.

1c) Does Negative Affect significantly moderate (at the alpha = 0.05 level) the relationship between Energetic Arousal and Tense Arousal, after controlling for Positive Affect?

Negative Affect DOES significantly moderate (at alpha = 0.05) the relationship between Energetic Arousal and Tense Arousal [t-statistic = 3.1692, p-value = 0.0016] after controlling for Positive Affect.

1d) After controlling for Positive Affect, how does Negative Affect impact the relationship between Energetic Arousal and Tense Arousal? Provide a sentence interpreting the appropriate effect.

After controlling for Positive Affect, for each increasing unit of Negative Affect, the effect of Energetic Arousal on Tense Arousal will change 0.0196

2a) Use the centering method to test the simple slopes of the model you estimated in (1a) at Negative Affect values of 0, 10, and 20.


Call:
lm(formula = TA ~ PA + NegAff0 * EA, data = msq2)

Residuals:
     Min       1Q   Median       3Q      Max 
-11.0925  -2.4258  -0.0154   2.6029  12.1184 

Coefficients:
             Estimate Std. Error t value             Pr(>|t|)    
(Intercept)  7.948168   0.424510  18.723 < 0.0000000000000002 ***
PA          -0.200275   0.042801  -4.679       0.000003721018 ***
NegAff0      0.460938   0.078850   5.846       0.000000009159 ***
EA           0.297818   0.045878   6.491       0.000000000207 ***
NegAff0:EA   0.019562   0.006173   3.169              0.00162 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.865 on 495 degrees of freedom
Multiple R-squared:  0.4297,    Adjusted R-squared:  0.425 
F-statistic: 93.22 on 4 and 495 DF,  p-value: < 0.00000000000000022

Call:
lm(formula = TA ~ PA + NegAff10 * EA, data = msq2)

Residuals:
     Min       1Q   Median       3Q      Max 
-11.0925  -2.4258  -0.0154   2.6029  12.1184 

Coefficients:
             Estimate Std. Error t value             Pr(>|t|)    
(Intercept) 12.557552   0.624513  20.108 < 0.0000000000000002 ***
PA          -0.200275   0.042801  -4.679        0.00000372102 ***
NegAff10     0.460938   0.078850   5.846        0.00000000916 ***
EA           0.493442   0.056665   8.708 < 0.0000000000000002 ***
NegAff10:EA  0.019562   0.006173   3.169              0.00162 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.865 on 495 degrees of freedom
Multiple R-squared:  0.4297,    Adjusted R-squared:  0.425 
F-statistic: 93.22 on 4 and 495 DF,  p-value: < 0.00000000000000022

Call:
lm(formula = TA ~ PA + NegAff20 * EA, data = msq2)

Residuals:
     Min       1Q   Median       3Q      Max 
-11.0925  -2.4258  -0.0154   2.6029  12.1184 

Coefficients:
             Estimate Std. Error t value             Pr(>|t|)    
(Intercept) 17.166936   1.357674  12.644 < 0.0000000000000002 ***
PA          -0.200275   0.042801  -4.679       0.000003721018 ***
NegAff20     0.460938   0.078850   5.846       0.000000009159 ***
EA           0.689067   0.109260   6.307       0.000000000632 ***
NegAff20:EA  0.019562   0.006173   3.169              0.00162 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.865 on 495 degrees of freedom
Multiple R-squared:  0.4297,    Adjusted R-squared:  0.425 
F-statistic: 93.22 on 4 and 495 DF,  p-value: < 0.00000000000000022

2b) After controlling for Positive Affect, what is the simple slope of Energetic Arousal on Tense Arousal when Negative Affect is 0.

After controlling for Positive Affect, 0.2978 is the simple slope of Energetic Arousal on Tense Arousal when Negative Affect is 0.

2c) Is the simple slope you estimated in (2b) statistically significant at the alpha = 0.05 level?

The simple slope estimated IS statistically significantl (at alpha = 0.05) [t-statistic = 6.4915, p-value = 0.0000].

2d) After controlling for Positive Affect, what is the simple slope of Energetic Arousal on Tense Arousal when Negative Affect is 10.

After controlling for Positive Affect, 0.4934 is the simple slope of Energetic Arousal on Tense Arousal when Negative Affect is 10.

2e) Is the simple slope you estimated in (2d) statistically significant at the alpha = 0.05 level?

The simple slope estimated IS statistically significantl (at alpha = 0.05) [t-statistic = 8.7080, p-value = 0.0000].

2f) After controlling for Positive Affect, what is the simple slope of Energetic Arousal on Tense Arousal when Negative Affect is 20.

After controlling for Positive Affect, 0.6891 is the simple slope of Energetic Arousal on Tense Arousal when Negative Affect is 20.

2g) Is the simple slope you estimated in (2f) statistically significant at the alpha = 0.05 level?

The simple slope estimated IS statistically significantl (at alpha = 0.05) [t-statistic = 6.3067, p-value = 0.0000].

3a) Use the ‘rockchalk’ package to test the same simple slopes you estimated in (2a).

Values of NegAff OUTSIDE this interval:
        lo         hi 
-46.539311  -7.348106 
cause the slope of (b1 + b2*NegAff)EA to be statistically significant
  "NegAff"     slope Std. Error  t value                  Pr(>|t|)
1        0 0.2978176 0.04587823 6.491479 0.00000000020685613350671
2       10 0.4934423 0.05666539 8.708001 0.00000000000000004642129
3       20 0.6890670 0.10925962 6.306693 0.00000000063240869804194

3b) Do the results of the centering approach agree with the results from ‘rockchalk’?

Value at 0: Same result (0.2978 = 0.2978) 
Value at 10: Same result (0.4934 = 0.4934) 
Value at 20: Same result (0.6891 = 0.6891) 

4a) Use the ‘rockchalk’ package to implement a Johnson-Neyman analysis of the interaction you estimated in (1a).

Values of NegAff OUTSIDE this interval:
        lo         hi 
-46.539311  -7.348106 
cause the slope of (b1 + b2*NegAff)EA to be statistically significant

4b) What are the boundaries of the Johnson-Neyman region of significance?

-46.5393  -7.3481 are the boundaries of the Johnson-Neyman region of significance.

4c) Where in the distribution of Negative Affect is the effect of Energetic Arousal on Tense Arousal (controlling for Positive Affect) statistically significant?

Values of NegAff OUTSIDE the interval [-46.5393 -7.3481] in the distribution of Negative Affect cause the effect of Energetic Arousal on Tense Arousal (controlling for Positive Affect) statistically significant.

Binary Categorical Moderators ——————————————————————

Use the “cps3” data to complete the following:

1a) Estimate a model that tests if the effect of Years of Education on Real Earnings in 1975 is significantly moderated by being Hispanic, after controlling for Real Earnings in 1974. HINT: The Hispanic variable is not a factor. You may want to change that.


Call:
lm(formula = re75 ~ re74 + educ * hisp, data = cps3)

Residuals:
    Min      1Q  Median      3Q     Max 
-7559.8 -1275.5  -764.7  1120.8 12253.4 

Coefficients:
               Estimate Std. Error t value             Pr(>|t|)    
(Intercept)   860.36260  580.57951   1.482              0.13911    
re74            0.25433    0.02016  12.617 < 0.0000000000000002 ***
educ            8.69986   54.21959   0.160              0.87260    
hispyes      3281.79926 1194.32633   2.748              0.00626 ** 
educ:hispyes -299.77060  122.77272  -2.442              0.01503 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2796 on 424 degrees of freedom
Multiple R-squared:  0.2854,    Adjusted R-squared:  0.2786 
F-statistic: 42.33 on 4 and 424 DF,  p-value: < 0.00000000000000022

1b) After controlling for 1974 Earnings, does being Hispanic significantly affect the relationship between Years of Education and 1975 Earnings at the alpha = 0.05 level

After controlling for 1974 Earnings, being Hispanic DOES significantly affect the relationship between Years of Education and 1975 Earnings at the alpha = 0.05 level [t = -2.4417, p-value = 0.0150].

1c) After controlling for 1974 Earnings, does being Hispanic significantly affect the relationship between Years of Education and 1975 Earnings at the alpha = 0.01 level?

After controlling for 1974 Earnings, being Hispanic DOES NOT significantly affect the relationship between Years of Education and 1975 Earnings at the alpha = 0.01 level [t = -2.4417, p-value = 0.0150].

2a) What is the simple slope of Years of Education on 1975 Earnings (controlling for 1974 Earnings) for Non-Hispanic people?

8.6999 is the simple slope of Years of Education on 1975 Earnings (controlling for 1974 Earnings) for Non-Hispanic people

2b) Is the simple slope from (2a) statistically significant at the alpha = 0.05 level?

The simple slope IS NOT statistically significant at the alpha = 0.05 level [t = 0.1605, p-value = 0.8726].

2c) What is the simple slope of Years of Education on 1975 Earnings (controlling for 1974 Earnings) for Hispanic people?


Call:
lm(formula = re75 ~ re74 + educ * hisp2, data = cps3)

Residuals:
    Min      1Q  Median      3Q     Max 
-7559.8 -1275.5  -764.7  1120.8 12253.4 

Coefficients:
                Estimate  Std. Error t value             Pr(>|t|)    
(Intercept)   4142.16186  1045.88024   3.960            0.0000877 ***
re74             0.25433     0.02016  12.617 < 0.0000000000000002 ***
educ          -291.07074   110.37891  -2.637              0.00867 ** 
hisp2no      -3281.79926  1194.32633  -2.748              0.00626 ** 
educ:hisp2no   299.77060   122.77272   2.442              0.01503 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2796 on 424 degrees of freedom
Multiple R-squared:  0.2854,    Adjusted R-squared:  0.2786 
F-statistic: 42.33 on 4 and 424 DF,  p-value: < 0.00000000000000022
-291.0707 is the simple slope of Years of Education on 1975 Earnings (controlling for 1974 Earnings) for Hispanic people

2d) Is the simple slope from (2c) statistically significant at the alpha = 0.05 level?

The simple slope IS statistically significant at the alpha = 0.05 level [t = -2.6370, p-value = 0.0087].

2e) Visualize the simple slopes compute above in an appropriate way.

These are the straight-line "simple slopes" of the variable educ  
 for the selected moderator values. 
         "hisp2"       slope Std. Error    t value    Pr(>|t|)
yes         educ -291.070740  110.37891 -2.6370140 0.008671237
no  educ:hisp2no    8.699864   54.21959  0.1604561 0.872598226

Nominal Categorical Moderators ——————————————————————

Use the “leafshape” data to complete the following:

1a) What are the levels of the “location” factor?

Sabah, Panama, Costa Rica, N Queensland, S Queensland, Tasmania are the levels of the "location" factor.

1b) What are the group sizes for the “location” factor?


       Sabah       Panama   Costa Rica N Queensland S Queensland     Tasmania 
          80           55           50           61           31            9 

2a) Estimate a model that tests if the effect of Leaf Width on Leaf Length differs significantly between Locations.


Call:
lm(formula = bladelen ~ bladewid * location, data = leafshape)

Residuals:
    Min      1Q  Median      3Q     Max 
-22.595  -3.064  -0.604   1.786  38.759 

Coefficients:
                              Estimate Std. Error t value             Pr(>|t|)
(Intercept)                     8.7507     1.3995   6.253 0.000000001541346760
bladewid                        1.5609     0.1821   8.572 0.000000000000000754
locationPanama                  0.8370     1.7985   0.465              0.64202
locationCosta Rica              4.8247     1.8625   2.590              0.01010
locationN Queensland           -3.4453     1.9969  -1.725              0.08559
locationS Queensland           -2.5828     1.8938  -1.364              0.17374
locationTasmania               -9.8918     3.5869  -2.758              0.00621
bladewid:locationPanama        -0.3817     0.2147  -1.778              0.07659
bladewid:locationCosta Rica    -0.5918     0.2087  -2.835              0.00492
bladewid:locationN Queensland   0.3061     0.2974   1.029              0.30424
bladewid:locationS Queensland  -0.6025     0.2354  -2.559              0.01103
bladewid:locationTasmania       2.5273     1.6362   1.545              0.12360
                                 
(Intercept)                   ***
bladewid                      ***
locationPanama                   
locationCosta Rica            *  
locationN Queensland          .  
locationS Queensland             
locationTasmania              ** 
bladewid:locationPanama       .  
bladewid:locationCosta Rica   ** 
bladewid:locationN Queensland    
bladewid:locationS Queensland *  
bladewid:locationTasmania        
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.438 on 274 degrees of freedom
Multiple R-squared:  0.6598,    Adjusted R-squared:  0.6461 
F-statistic: 48.31 on 11 and 274 DF,  p-value: < 0.00000000000000022

2b) Does the effect of Leaf Width on Leaf Length differ significantly (alpha = 0.05) between Locations?

The effect of Leaf Width on Leaf Length DOES differ significantly (alpha = 0.05) between Loations [F = 48.3078, p-value = 0.0000].

2c) What is the value of the test statistic that you used to answer (2b)?

48.3078 is the value of the test statistic used to answer question.

3a) What is the simple slope of Leaf Width on Leaf Length in Sabah?

1.5609 is the simple slope of Leaf Width on Leaf Length in Sabah

3b) Is the simple slope you reported in (3a) significant at the alpha = 0.05 level?

The simple slope IS statistically significant at the alpha = 0.05 level [t = 8.5725, p-value = 0.0000].

3c) What is the simple slope of Leaf Width on Leaf Length in Panama?


Call:
lm(formula = bladelen ~ bladewid * location2, data = leafshape)

Residuals:
    Min      1Q  Median      3Q     Max 
-22.595  -3.064  -0.604   1.786  38.759 

Coefficients:
                               Estimate Std. Error t value             Pr(>|t|)
(Intercept)                      9.5877     1.1295   8.488  0.00000000000000134
bladewid                         1.1792     0.1138  10.360 < 0.0000000000000002
location2Sabah                  -0.8370     1.7985  -0.465              0.64202
location2Costa Rica              3.9877     1.6692   2.389              0.01757
location2N Queensland           -4.2823     1.8179  -2.356              0.01920
location2S Queensland           -3.4198     1.7040  -2.007              0.04574
location2Tasmania              -10.7288     3.4904  -3.074              0.00233
bladewid:location2Sabah          0.3817     0.2147   1.778              0.07659
bladewid:location2Costa Rica    -0.2100     0.1529  -1.374              0.17057
bladewid:location2N Queensland   0.6878     0.2612   2.633              0.00894
bladewid:location2S Queensland  -0.2208     0.1877  -1.176              0.24050
bladewid:location2Tasmania       2.9090     1.6300   1.785              0.07543
                                  
(Intercept)                    ***
bladewid                       ***
location2Sabah                    
location2Costa Rica            *  
location2N Queensland          *  
location2S Queensland          *  
location2Tasmania              ** 
bladewid:location2Sabah        .  
bladewid:location2Costa Rica      
bladewid:location2N Queensland ** 
bladewid:location2S Queensland    
bladewid:location2Tasmania     .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.438 on 274 degrees of freedom
Multiple R-squared:  0.6598,    Adjusted R-squared:  0.6461 
F-statistic: 48.31 on 11 and 274 DF,  p-value: < 0.00000000000000022
1.1792 is the simple slope of Leaf Width on Leaf Length in Panama

3d) Is the simple slope you reported in (3c) significant at the alpha = 0.05 level?

The simple slope IS statistically significant at the alpha = 0.05 level [t = 10.3595, p-value = 0.0000].

3e) What is the simple slope of Leaf Width on Leaf Length in South Queensland?


Call:
lm(formula = bladelen ~ bladewid * location3, data = leafshape)

Residuals:
    Min      1Q  Median      3Q     Max 
-22.595  -3.064  -0.604   1.786  38.759 

Coefficients:
                               Estimate Std. Error t value       Pr(>|t|)    
(Intercept)                     6.16793    1.27586   4.834 0.000002229164 ***
bladewid                        0.95847    0.14922   6.423 0.000000000587 ***
location3Sabah                  2.58279    1.89379   1.364        0.17374    
location3Panama                 3.41979    1.70401   2.007        0.04574 *  
location3Costa Rica             7.40754    1.77148   4.182 0.000039001083 ***
location3N Queensland          -0.86255    1.91227  -0.451        0.65230    
location3Tasmania              -7.30898    3.54045  -2.064        0.03992 *  
bladewid:location3Sabah         0.60248    0.23542   2.559        0.01103 *  
bladewid:location3Panama        0.22077    0.18768   1.176        0.24050    
bladewid:location3Costa Rica    0.01072    0.18077   0.059        0.95276    
bladewid:location3N Queensland  0.90859    0.27848   3.263        0.00124 ** 
bladewid:location3Tasmania      3.12975    1.63289   1.917        0.05632 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.438 on 274 degrees of freedom
Multiple R-squared:  0.6598,    Adjusted R-squared:  0.6461 
F-statistic: 48.31 on 11 and 274 DF,  p-value: < 0.00000000000000022
0.9585 is the simple slope of Leaf Width on Leaf Length in South Queensland

3f) Is the simple slope you reported in (3e) significant at the alpha = 0.05 level?

The simple slope IS statistically significant at the alpha = 0.05 level [t = 6.4232, p-value = 0.0000].

4a) In which Location is the effect of Leaf Width on Leaf Length strongest?

In Tasmania, the effect of Leaf Width on Leaf Length is the strongest.

4b) What caveat might you want to place on the conclusion reported in (4a)? HINT: Look at the answers to Question 1 of this section.

The caveat is the group sizes are different across Locations.