Research Project Assignment (RPA) #4

Correlations and Regression

Author

Research Methods in Applied Psychology II - APSY-UE-1137

Published

January 1, 2025

Assignment Overview

Group Member Names: [Fill in your team members’ names here]

Predictor construct + measure title: Help-seeking behaviors [grp7_helpseek_1:grp7_helpseek_10]

Outcome construct + measure title: Loneliness [grp2_lone_1:grp2_lone_3]

For this assignment, you will focus on exploring the associations between your predictor, outcome, and other continuous variables using correlation and regression analyses.

Part 1: Identifying an Additional Construct

1a. Additional Construct Selection

Additional Construct: Screen time

Measure Title: Screen time

Variable Name(s): grp1_extra1

Part 2: Preparing Your Data - Scaling

2a. Outcome Scale Recreation

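# HINT: Recreate your outcome scale from RPA #2 and #3
# This should include any reverse coding and scale score creation

# Load packages: haven for read_sav(), dplyr for %>%, select(), and mutate()
library(haven)
library(dplyr)

# Read the SPSS data file
data <- read_sav("data/data.sav")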

# Display the first few rows
head(data)

# A tibble: 6 × 357
  ID    sex        gender_identity sexual_orientation   age race_ethnicity     
  <chr> <dbl+lbl>  <dbl+lbl>       <dbl+lbl>          <dbl> <dbl+lbl>          
1 P0001 2 [Female] 1 [Man]         2 [Gay/Lesbian]       19 1 [White]          
2 P0002 1 [Male]   2 [Woman]       2 [Gay/Lesbian]       18 1 [White]          
3 P0003 1 [Male]   1 [Man]         1 [Heterosexual]      21 2 [Hispanic/Latino]
4 P0004 1 [Male]   1 [Man]         1 [Heterosexual]      21 5 [Multiracial]    
5 P0005 1 [Male]   1 [Man]         2 [Gay/Lesbian]       20 1 [White]          
6 P0006 2 [Female] 3 [Non-binary]  1 [Heterosexual]      19 3 [Asian]          
# ℹ 351 more variables: year_in_school <dbl+lbl>, income <dbl>,
#   greeklife <dbl>, intstatus <dbl>, environment <dbl>, gen <dbl>,
#   dass_1 <dbl>, dass_2 <dbl>, dass_3 <dbl>, dass_4 <dbl>, dass_5 <dbl>,
#   dass_6 <dbl>, dass_7 <dbl>, dass_8 <dbl>, dass_9 <dbl>, dass_10 <dbl>,
#   dass_11 <dbl>, dass_12 <dbl>, dass_13 <dbl>, dass_14 <dbl>, dass_15 <dbl>,
#   dass_16 <dbl>, dass_17 <dbl>, dass_18 <dbl>, dass_19 <dbl>, dass_20 <dbl>,
#   dass_21 <dbl>, swls_1 <dbl>, swls_2 <dbl>, swls_3 <dbl>, swls_4 <dbl>, …

# Recreate your outcome scale
data_with_scores <- data %>%
    # Step 1: Select the items for the outcome, predictor, and additional construct
    select(ID, grp7_helpseek_1:grp7_helpseek_10, grp2_lone_1:grp2_lone_3, grp1_extra1) %>%
    # Step 2: Create the outcome scale by summing across columns
    # Note: as written, this sum includes grp1_extra1 (screen time) alongside the
    # three loneliness items; use select(., grp2_lone_1:grp2_lone_3) for a
    # loneliness-only score
    mutate(
        loneliness = rowSums(select(., grp2_lone_1:grp2_lone_3, grp1_extra1), na.rm = TRUE)
    )

# Check your outcome scale
summary(data_with_scores)

      ID            grp7_helpseek_1 grp7_helpseek_2 grp7_helpseek_3
 Length:200         Min.   :1.000   Min.   :1.000   Min.   :1.000  
 Class :character   1st Qu.:4.000   1st Qu.:5.000   1st Qu.:3.750  
 Mode  :character   Median :5.000   Median :6.000   Median :5.000  
                    Mean   :4.745   Mean   :5.835   Mean   :4.925  
                    3rd Qu.:7.000   3rd Qu.:7.000   3rd Qu.:6.000  
                    Max.   :7.000   Max.   :7.000   Max.   :7.000  
                                                                   
 grp7_helpseek_4 grp7_helpseek_5 grp7_helpseek_6 grp7_helpseek_7
 Min.   :1.00    Min.   :1.000   Min.   :1.000   Min.   :1.000  
 1st Qu.:1.00    1st Qu.:2.000   1st Qu.:1.000   1st Qu.:1.000  
 Median :3.00    Median :4.000   Median :1.000   Median :2.000  
 Mean   :3.25    Mean   :3.995   Mean   :1.835   Mean   :2.085  
 3rd Qu.:5.00    3rd Qu.:5.000   3rd Qu.:2.000   3rd Qu.:2.000  
 Max.   :7.00    Max.   :7.000   Max.   :7.000   Max.   :7.000  
                                                                
 grp7_helpseek_8 grp7_helpseek_10  grp2_lone_1   grp2_lone_2    grp2_lone_3   
 Min.   :1.00    Min.   :1.000    Min.   :1.0   Min.   :1.00   Min.   :1.000  
 1st Qu.:1.00    1st Qu.:1.000    1st Qu.:2.0   1st Qu.:2.00   1st Qu.:2.000  
 Median :2.00    Median :2.000    Median :2.0   Median :3.00   Median :3.000  
 Mean   :2.78    Mean   :2.959    Mean   :2.5   Mean   :2.56   Mean   :2.605  
 3rd Qu.:4.00    3rd Qu.:5.000    3rd Qu.:3.0   3rd Qu.:3.00   3rd Qu.:3.000  
 Max.   :7.00    Max.   :7.000    Max.   :5.0   Max.   :4.00   Max.   :4.000  
                 NA's   :7                                                    
  grp1_extra1       loneliness   
 Min.   : 0.000   Min.   : 3.50  
 1st Qu.: 3.000   1st Qu.:10.00  
 Median : 5.000   Median :12.00  
 Mean   : 4.723   Mean   :12.39  
 3rd Qu.: 6.000   3rd Qu.:15.00  
 Max.   :12.000   Max.   :22.00
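
The loneliness score above comes from rowSums() over grp2_lone_1:grp2_lone_3 plus grp1_extra1, so screen time contributes to the total (the mean of 12.39 matches the three item means plus 4.72). If the intent is a score built from the three loneliness items alone, a minimal sketch might look like the following. The toy data frame and its values are invented for illustration; only the column names come from the assignment.

```r
# Toy data: values are invented; column names follow the assignment
toy <- data.frame(
  grp2_lone_1 = c(1, 3, 5),
  grp2_lone_2 = c(2, 3, 4),
  grp2_lone_3 = c(1, 4, 4),
  grp1_extra1 = c(2, 6, 10)   # screen time, deliberately left out of the scale
)

# Sum only the three loneliness items
lone_items <- c("grp2_lone_1", "grp2_lone_2", "grp2_lone_3")
toy$loneliness <- rowSums(toy[lone_items], na.rm = TRUE)

toy$loneliness
```

Note that with na.rm = TRUE, rowSums() treats a missing item as contributing 0 to the total, which may or may not be what a given scale's scoring rules call for.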

2b. Predictor Variable Assessment

Is your predictor a multi-item scale or single item variable? Multi-item

If multi-item, which items need to be reverse coded? None

2c. Predictor Scale Creation

Numeric function for predictor scale: Average all items

# HINT: Create your predictor scale score
# Use reverse-coded items if necessary

# Your code here:
data_with_scores <- data_with_scores %>%
  mutate(help_seeking = rowMeans(select(., grp7_helpseek_1:grp7_helpseek_10), 
                                 na.rm = TRUE)
   )

# Check your predictor scale
summary(data_with_scores$help_seeking)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.333   3.243   3.556   3.602   3.889   6.111
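
The earlier summary shows 7 missing values on grp7_helpseek_10. With na.rm = TRUE, rowMeans() averages whichever items a respondent answered, so those respondents receive a score based on fewer items. A small sketch of that behavior, using invented item names and values:

```r
# Invented responses for three items; the second respondent skipped item3
items <- data.frame(
  item1 = c(4, 6, 2),
  item2 = c(5, 7, 2),
  item3 = c(6, NA, 2)
)

# na.rm = TRUE: average the items that were answered
score_available <- rowMeans(items, na.rm = TRUE)

# na.rm = FALSE: any missing item makes the whole score NA
score_complete <- rowMeans(items, na.rm = FALSE)

# Number of items each score is based on
n_answered <- rowSums(!is.na(items))

data.frame(score_available, score_complete, n_answered)
```

Keeping a count such as n_answered alongside the score makes it easy to see how many scale scores rest on incomplete responses.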

2d. Predictor Descriptive Statistics (Single Item)

# Run descriptive statistics on your predictor variable or scale

# Your code here:
# data_with_scores %>%
#   summarise(
#     n = sum(!is.na(predictor_variable)),
#     mean = mean(predictor_variable, na.rm = TRUE),
#     sd = sd(predictor_variable, na.rm = TRUE)
#   )
# Or try other functions like gtsummary!

Sample size (n): [Number]

Mean: [Value]

Standard deviation: [Value]

2e. Additional Construct Assessment

Is your additional construct a multi-item scale or single item variable? Single-item

If multi-item, which items need to be reverse coded? None

# HINT: Only complete this if your additional construct is a multi-item scale
# Reverse code items if needed

# Your code here:
# data_with_scores <- data_with_scores %>%
#   mutate(
#     # Add reverse coding for additional construct items
#   )

2f. Additional Construct Scale Creation

Numeric function for additional construct scale: Not applicable (single item)

# HINT: Create your additional construct scale score
# Use reverse-coded items if necessary

# Your code here:
# data_with_scores <- data_with_scores %>%
#   mutate(
#     # Create additional construct scale using rowMeans() or rowSums()
#   )

# Check your additional construct scale
# summary(data_with_scores$additional_scale)

2g. Additional Construct Descriptive Statistics (Single Item)

# HINT: Only complete this if your additional construct is a single item
# Run descriptive statistics on your additional construct variable

# Your code here:
data_with_scores %>%
  summarise(
    n = sum(!is.na(grp1_extra1)),
    mean = mean(grp1_extra1, na.rm = TRUE),
    sd = sd(grp1_extra1, na.rm = TRUE)
  )

# A tibble: 1 × 3
      n  mean    sd
  <int> <dbl> <dbl>
1   200  4.72  2.61

Sample size (n): 200

Mean: 4.72 hours

Standard deviation: 2.61

Part 3: Bivariate Correlations

3a. Correlation Matrix

# HINT: Run bivariate correlations between all three variables
# You can use cor(), cor.test(), or GGally::ggpairs()

# Your code here:
cor.matrix <- cor(data_with_scores[c("loneliness", "help_seeking", "grp1_extra1")])

# See results
cor.matrix

             loneliness help_seeking grp1_extra1
loneliness    1.0000000   -0.1005221   0.7756026
help_seeking -0.1005221    1.0000000  -0.1332641
grp1_extra1   0.7756026   -0.1332641   1.0000000

# For significance tests:
# HINT
# cor.test(data_with_scores$outcome_scale, data_with_scores$predictor_scale)

# Test for correlation between outcome and predictor
cor.test(data_with_scores$loneliness, data_with_scores$help_seeking)


    Pearson's product-moment correlation

data:  data_with_scores$loneliness and data_with_scores$help_seeking
t = -1.4217, df = 198, p-value = 0.1567
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.23597210  0.03875949
sample estimates:
       cor 
-0.1005221

# Test for correlation between outcome and additional construct
cor.test(data_with_scores$loneliness, data_with_scores$grp1_extra1)


    Pearson's product-moment correlation

data:  data_with_scores$loneliness and data_with_scores$grp1_extra1
t = 17.29, df = 198, p-value < 0.00000000000000022
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.7136567 0.8255120
sample estimates:
      cor 
0.7756026

# Test for correlation between predictor and additional construct
cor.test(data_with_scores$grp1_extra1, data_with_scores$help_seeking)


    Pearson's product-moment correlation

data:  data_with_scores$grp1_extra1 and data_with_scores$help_seeking
t = -1.8921, df = 198, p-value = 0.05994
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.26706726  0.00558008
sample estimates:
       cor 
-0.1332641
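
The t statistic and p-value that cor.test() reports follow directly from r and the degrees of freedom. As a rough check, the sketch below plugs in the r and df from the first test above (loneliness and help seeking) and should land close to the printed t and p-value. The variable names are illustrative.

```r
# r and df copied from the first cor.test() output above
r  <- -0.1005221
df <- 198

# t statistic for a Pearson correlation: t = r * sqrt(df) / sqrt(1 - r^2)
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df)

round(t_stat, 4)
round(p_val, 4)
```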

3b. APA Style Correlation Descriptions

Correlation between Predictor and Outcome:

[Write your APA-style description here - include r value, p-value, significance, magnitude, and direction]

Correlation between Predictor and Additional Construct:

[Write your APA-style description here - include r value, p-value, significance, magnitude, and direction]

Correlation between Outcome and Additional Construct:

[Write your APA-style description here - include r value, p-value, significance, magnitude, and direction]

3c. Strongest Association

Which construct has the strongest association with your outcome? [Predictor/Additional Construct]

How do you know? [Explain based on correlation coefficients]

3d. R Square Calculations

# R-squared (as a percentage) for loneliness (outcome) and help seeking (predictor)
r_pre_out <- (-0.1005221)^2 * 100
r_pre_out

[1] 1.010469

# R-squared (as a percentage) for loneliness (outcome) and screen time (additional construct)
r_out_add <- (0.7756026)^2 * 100
r_out_add

[1] 60.15594

# R-squared (as a percentage) for help seeking (predictor) and screen time (additional construct)
r_pre_add <- (-0.1332641)^2 * 100
r_pre_add

[1] 1.775932

R Square for Predictor and Outcome: 1.01% (r² = .010)

R Square for Predictor and Additional Construct: 1.78% (r² = .018)

R Square for Outcome and Additional Construct: 60.16% (r² = .602)
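
Rather than retyping each correlation by hand, the whole correlation matrix can be squared in one step. The sketch below uses R's built-in mtcars data purely as a stand-in so it runs on its own; with the assignment data, the same two lines would presumably be applied to the three study variables.

```r
# Stand-in data: built-in mtcars, used only so the sketch is self-contained
vars <- mtcars[c("mpg", "wt", "hp")]

# Correlation matrix, then squared and scaled to percent shared variance
r_mat  <- cor(vars)
r2_pct <- round(r_mat^2 * 100, 2)

r2_pct
```

Squaring the matrix avoids rounding error from copying coefficients and keeps the values tied to the data if the scales are ever recomputed.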

3e. R Square Interpretation

Definition of R Square:

[Write your definition of R Square in your own words]

R Square Description for Predictor and Outcome:

[Explain what the R Square tells you about shared variability]

R Square Description for Predictor and Additional Construct:

[Explain what the R Square tells you about shared variability]

R Square Description for Outcome and Additional Construct:

[Explain what the R Square tells you about shared variability]

Part 4: Simple Regression

4a. Simple Linear Regression

# HINT: Run simple regression predicting outcome from predictor
# Use lm() function: lm(outcome ~ predictor, data = dataset)
# Standardized version: lm(scale(outcome) ~ scale(predictor), data = dataset)

# Your code here:

# Model 1

model1 <- lm(loneliness ~ help_seeking, 
             data = data_with_scores)

# Display results Model 1
summary(model1)


Call:
lm(formula = loneliness ~ help_seeking, data = data_with_scores)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.7937 -2.3529 -0.4715  2.4100  9.7063 

Coefficients:
             Estimate Std. Error t value            Pr(>|t|)    
(Intercept)   14.3087     1.3702  10.442 <0.0000000000000002 ***
help_seeking  -0.5334     0.3752  -1.422               0.157    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.203 on 198 degrees of freedom
Multiple R-squared:  0.0101,    Adjusted R-squared:  0.005105 
F-statistic: 2.021 on 1 and 198 DF,  p-value: 0.1567

# Model 1 - Standardized Coefficients
model1.2 <- lm(scale(loneliness) ~ scale(help_seeking), 
               data = data_with_scores)

# Display results Model 1- Standardized Coefficients
summary(model1.2)


Call:
lm(formula = scale(loneliness) ~ scale(help_seeking), data = data_with_scores)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.7387 -0.7328 -0.1468  0.7506  3.0229 

Coefficients:
                                   Estimate              Std. Error t value
(Intercept)          0.00000000000000006662  0.07052995046306338722   0.000
scale(help_seeking) -0.10052208117737022885  0.07070693932469082621  -1.422
                    Pr(>|t|)
(Intercept)            1.000
scale(help_seeking)    0.157

Residual standard error: 0.9974 on 198 degrees of freedom
Multiple R-squared:  0.0101,    Adjusted R-squared:  0.005105 
F-statistic: 2.021 on 1 and 198 DF,  p-value: 0.1567

# Display effect size
parameters::model_parameters(model1)

Parameter    | Coefficient |   SE |         95% CI | t(198) |      p
--------------------------------------------------------------------
(Intercept)  |       14.31 | 1.37 | [11.61, 17.01] |  10.44 | < .001
help seeking |       -0.53 | 0.38 | [-1.27,  0.21] |  -1.42 | 0.157

parameters::model_parameters(model1.2)

Parameter    | Coefficient |   SE |        95% CI |   t(198) |      p
---------------------------------------------------------------------
(Intercept)  |    6.66e-17 | 0.07 | [-0.14, 0.14] | 9.45e-16 | > .999
help seeking |       -0.10 | 0.07 | [-0.24, 0.04] |    -1.42 | 0.157
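
In the output above, the standardized slope (-0.10) equals the Pearson correlation between loneliness and help seeking. That holds for any regression with a single predictor. The sketch below illustrates it with the built-in mtcars data as a stand-in; the variable names z_mpg and z_wt are made up for the example.

```r
# Stand-in data: built-in mtcars; z-score both variables first
d <- data.frame(
  z_mpg = as.numeric(scale(mtcars$mpg)),
  z_wt  = as.numeric(scale(mtcars$wt))
)

# Simple regression on the standardized variables
fit_std <- lm(z_mpg ~ z_wt, data = d)

beta <- unname(coef(fit_std)["z_wt"])   # standardized slope
r    <- cor(mtcars$mpg, mtcars$wt)      # Pearson correlation

c(beta = beta, r = r)
all.equal(beta, r)
```

The equality no longer holds once a second predictor enters the model, because each standardized slope is then adjusted for the other predictor.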

4b. Regression Output Interpretation

R Square - [Define in your own words]

[Interpret for your analysis]

Regression F-test

[Define in your own words]
[Interpret for your analysis]

Intercept

[Define in your own words]
[Interpret for your analysis]

The b coefficient

[Define in your own words]
[Interpret for your analysis]

The Beta coefficient

[Define in your own words]
[Interpret for your analysis]

4c. APA Style Write-Up

Write your results in APA style:

[Write your APA-style results paragraph here - include F-statistic, degrees of freedom, p-value, R², and regression coefficients]

4d. Plain Language Translation

Take-home message for someone outside the class:

[Write 1-2 sentences explaining your results in simple terms]

Part 5: Multiple Regression

5a. Multiple Regression

# HINT: Run Multiple regression
# Model 2: outcome ~ predictor + additional_construct

# Your code here:

# Model 2

model2 <- lm(loneliness ~ help_seeking + grp1_extra1,
             data = data_with_scores)
# Display results
summary(model2)


Call:
lm(formula = loneliness ~ help_seeking + grp1_extra1, data = data_with_scores)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.8662 -1.5668 -0.5974  1.3941  5.3907 

Coefficients:
             Estimate Std. Error t value             Pr(>|t|)    
(Intercept)   7.83178    0.95027   8.242   0.0000000000000235 ***
help_seeking  0.01533    0.24078   0.064                0.949    
grp1_extra1   0.95299    0.05573  17.101 < 0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.037 on 197 degrees of freedom
Multiple R-squared:  0.6016,    Adjusted R-squared:  0.5975 
F-statistic: 148.7 on 2 and 197 DF,  p-value: < 0.00000000000000022

# Model 2 - Standardized Coefficients
model2.2 <- lm(scale(loneliness) ~ scale(help_seeking) +
                 scale(grp1_extra1), data = data_with_scores)
# Display results
summary(model2.2)


Call:
lm(formula = scale(loneliness) ~ scale(help_seeking) + scale(grp1_extra1), 
    data = data_with_scores)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.5155 -0.4880 -0.1861  0.4342  1.6788 

Coefficients:
                                 Estimate            Std. Error t value
(Intercept)         0.0000000000000003353 0.0448596352315109720   0.000
scale(help_seeking) 0.0028892191727341496 0.0453769435259343268   0.064
scale(grp1_extra1)  0.7759876496149493708 0.0453769435259342921  17.101
                               Pr(>|t|)    
(Intercept)                       1.000    
scale(help_seeking)               0.949    
scale(grp1_extra1)  <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6344 on 197 degrees of freedom
Multiple R-squared:  0.6016,    Adjusted R-squared:  0.5975 
F-statistic: 148.7 on 2 and 197 DF,  p-value: < 0.00000000000000022

# Display effect size
parameters::model_parameters(model2)

Parameter    | Coefficient |   SE |        95% CI | t(197) |      p
-------------------------------------------------------------------
(Intercept)  |        7.83 | 0.95 | [ 5.96, 9.71] |   8.24 | < .001
help seeking |        0.02 | 0.24 | [-0.46, 0.49] |   0.06 | 0.949 
grp1 extra1  |        0.95 | 0.06 | [ 0.84, 1.06] |  17.10 | < .001

parameters::model_parameters(model2.2)

Parameter    | Coefficient |   SE |        95% CI |   t(197) |      p
---------------------------------------------------------------------
(Intercept)  |    3.35e-16 | 0.04 | [-0.09, 0.09] | 7.47e-15 | > .999
help seeking |    2.89e-03 | 0.05 | [-0.09, 0.09] |     0.06 | 0.949 
grp1 extra1  |        0.78 | 0.05 | [ 0.69, 0.87] |    17.10 | < .001
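
R² moves from .0101 in Model 1 to .6016 in Model 2. One way to test whether such an increase is more than chance is to compare the two nested models with anova(). The sketch below shows the pattern on the built-in mtcars data as a stand-in; the model names m1 and m2 are illustrative, and the same calls would presumably work with model1 and model2 because both were fit to the same rows.

```r
# Stand-in data: built-in mtcars
m1 <- lm(mpg ~ hp, data = mtcars)        # one predictor
m2 <- lm(mpg ~ hp + wt, data = mtcars)   # adds a second predictor

# Change in R-squared between the nested models
r2_m1    <- summary(m1)$r.squared
r2_m2    <- summary(m2)$r.squared
delta_r2 <- r2_m2 - r2_m1

# F-test for whether the added predictor improves the fit
cmp <- anova(m1, m2)

delta_r2
cmp
```

R² can only stay the same or increase when a predictor is added, which is why the adjusted R² and the F-test for the change are useful companions.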

5b. R Square Comparison

Why is there a difference in R square across the two models?

[Explain why R² changes when adding the additional construct]

5c. Intercept Comparison

Why is there a difference in the y intercept (constant) across the two models?

[Explain why the intercept changes when adding the additional construct]

5d. Coefficient Comparison

Why is there a difference in the coefficients for your predictor across the two models?

[Explain why the predictor coefficient changes when adding the additional construct]

5e. Predictor Association Comparison

Which predictor is more highly associated with your outcome? [Predictor/Additional Construct]

How can you assess this from the output? [Explain how you determined this]

5f. APA Style Write-Up

Write your results in APA style:

[Write your APA-style results paragraph here - include multiple regression models, R², include F-statistic, degrees of freedom, p-value, and regression coefficients]

5g. Plain Language Translation

Take-home message for someone outside the class:

[Write 1-2 sentences explaining your results in simple terms]

5h. Causation Inference

Can you determine if your predictors cause your outcome in this study?

[Explain whether you can infer causation and why or why not]

Visualization (Optional but Recommended)

# HINT: Create visualizations to help interpret your results
# Consider scatterplots, correlation plots, or regression diagnostic plots

# Example scatterplot with a fitted regression line:
library(ggplot2)

ggplot(data_with_scores, aes(x = help_seeking, y = loneliness)) +
  geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "Help-Seeking vs. Loneliness", x = "Help-seeking (mean score)", y = "Loneliness (sum score)")
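
The hint above also mentions regression diagnostic plots. A minimal base-R sketch follows, using the built-in mtcars data as a stand-in so it runs without the assignment data; the same plot() calls would presumably apply to a fitted model such as model1 or model2.

```r
# Stand-in model on built-in data
fit <- lm(mpg ~ wt, data = mtcars)

# Two standard diagnostic plots side by side
op <- par(mfrow = c(1, 2))
plot(fit, which = 1)   # residuals vs fitted: checks linearity and equal variance
plot(fit, which = 2)   # normal Q-Q: checks normality of residuals
par(op)                # restore the previous plotting layout
```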

Submission Instructions

Complete all code chunks and text responses in this document
Ensure all code runs without errors
Save the document as RPA_4_YourTeamName.qmd
Render the document - this will automatically create both HTML and DOCX versions
Submit the .qmd file along with either the .html or .docx file (or both if preferred)
Make sure your team name is clearly indicated at the top of the document

This document was created for Research Methods in Applied Psychology II (APSY-UE-1137) - Fall 2025