Replace “Your Name” with your actual name.
Understand and apply the Chi-Square test of independence
Interpret R-squared and Adjusted R-squared
Compare models using an F-test
Check model assumptions using residual plots
A psychologist is curious if there is a relationship between pet ownership and stress level in college students. Students were categorized as either having a pet or no pet, and whether their stress level was high or low.
You must: Run a chi-square test using chisq.test().
Report the chi-square value, degrees of freedom, and p-value.
State your conclusion: Is stress level related to pet ownership?
Run the code chunk below to create the data:
pet_data <- matrix(c(18, 35, 27, 13), nrow = 2, byrow = TRUE)
colnames(pet_data) <- c("High Stress", "Low Stress")
rownames(pet_data) <- c("Has Pet", "No Pet")
pet_data
##         High Stress Low Stress
## Has Pet          18         35
## No Pet           27         13
# Run the chi-square test on the contingency table
chisq.test(pet_data)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: pet_data
## X-squared = 8.9677, df = 1, p-value = 0.002748
Interpretation:
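The chi-square value is 8.97, with 1 degree of freedom and a p-value of 0.0027. Because the p-value is below 0.05, we reject the null hypothesis that stress level and pet ownership are independent: stress level is related to pet ownership. Students with pets were less likely to report high stress (18 of 53, about 34%) than students without pets (27 of 40, 67.5%).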
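The reported values can also be pulled from the test object instead of being read off the printed output. The sketch below rebuilds the `pet_data` matrix from the exercise, stores the result of `chisq.test()` under the illustrative name `pet_test`, and shows the expected counts and row proportions that support the conclusion.

```r
# Rebuild the contingency table from the exercise
pet_data <- matrix(c(18, 35, 27, 13), nrow = 2, byrow = TRUE,
                   dimnames = list(c("Has Pet", "No Pet"),
                                   c("High Stress", "Low Stress")))

# Store the test result (`pet_test` is an illustrative name).
# Yates' continuity correction is applied by default for 2x2 tables.
pet_test <- chisq.test(pet_data)

pet_test$statistic   # chi-square value
pet_test$parameter   # degrees of freedom
pet_test$p.value     # p-value

# Expected counts under independence (rule of thumb: each at least 5)
pet_test$expected

# Proportion of high and low stress within each pet-ownership group
round(prop.table(pet_data, margin = 1), 3)
```

Reading the row proportions alongside the test makes the direction of the association explicit, which the chi-square value alone does not show.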
A psychologist wants to predict academic performance based on different sets of predictors. You’ve already collected the data and loaded it into R.
Use the following code to simulate the data:
set.seed(123)
n <- 100
IQ <- rnorm(n, mean = 100, sd = 15)
motivation <- rnorm(n, mean = 50, sd = 10)
grit <- rnorm(n, mean = 60, sd = 8)
working_memory <- rnorm(n, mean = 50, sd = 10)
academic_perf <- 0.3*IQ + 0.2*motivation + 0.1*grit + 0.1*working_memory + rnorm(n, 0, 10)
iq_data <- data.frame(IQ, motivation, grit, academic_perf, working_memory)
head(iq_data)
##          IQ motivation     grit academic_perf working_memory
## 1  91.59287   42.89593 77.59048      47.36529       42.84758
## 2  96.54734   52.56884 70.49930      39.08870       42.47311
## 3 123.38062   47.53308 57.87884      50.02267       40.61461
## 4 101.05763   46.52457 64.34555      49.71583       39.47487
## 5 101.93932   40.48381 56.68528      55.61689       45.62840
## 6 125.72597   49.54972 56.19002      42.07245       53.31179
Instructions:
Use summary() to view the R-squared and Adjusted R-squared for each model.
# Fit Model 1 using IQ only
model1 <- lm(academic_perf ~ IQ, data = iq_data)
summary(model1)
##
## Call:
## lm(formula = academic_perf ~ IQ, data = iq_data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -24.4574  -6.4108   0.0701   6.7810  24.6746
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.79161    7.27981   5.191 1.13e-06 ***
## IQ           0.14325    0.07118   2.012   0.0469 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.698 on 98 degrees of freedom
## Multiple R-squared: 0.03968, Adjusted R-squared: 0.02988
## F-statistic: 4.049 on 1 and 98 DF, p-value: 0.04693
# Fit Model 2 using IQ, motivation, and grit
model2 <- lm(academic_perf ~ IQ + motivation + grit, data = iq_data)
summary(model2)
##
## Call:
## lm(formula = academic_perf ~ IQ + motivation + grit, data = iq_data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -24.3922  -6.5107   0.3518   6.8479  24.6197
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 32.39250   12.61901   2.567   0.0118 *
## IQ           0.14767    0.07244   2.038   0.0443 *
## motivation   0.06203    0.10176   0.610   0.5436
## grit         0.03143    0.13043   0.241   0.8101
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.776 on 96 degrees of freedom
## Multiple R-squared: 0.04403, Adjusted R-squared: 0.01416
## F-statistic: 1.474 on 3 and 96 DF, p-value: 0.2265
Which model explains more variance?
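Model 2 explains slightly more variance: its R-squared is 0.044, compared with 0.040 for Model 1. This is expected, because R-squared can never decrease when predictors are added, so a gain of about 0.004 is not by itself evidence of a better model.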
Based on the adjusted R-squared values, does adding
predictors improve the model meaningfully?
No. Adjusted R-squared drops from 0.030 in Model 1 to 0.014 in Model 2, so the penalty for the two extra predictors outweighs the small gain in R-squared. Neither motivation (p = 0.54) nor grit (p = 0.81) is a significant predictor in Model 2.
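R-squared and adjusted R-squared can be compared side by side by extracting them from `summary()`. The sketch below re-creates the simulated data, refits both models under the illustrative names `m1` and `m2`, and recomputes adjusted R-squared from its usual formula, 1 - (1 - R²)(n - 1)/(n - p - 1), where p is the number of predictors. The helper `adj_r2()` is hypothetical and written only for this illustration.

```r
# Re-create the simulated data exactly as in the exercise
set.seed(123)
n <- 100
IQ <- rnorm(n, mean = 100, sd = 15)
motivation <- rnorm(n, mean = 50, sd = 10)
grit <- rnorm(n, mean = 60, sd = 8)
working_memory <- rnorm(n, mean = 50, sd = 10)
academic_perf <- 0.3*IQ + 0.2*motivation + 0.1*grit + 0.1*working_memory + rnorm(n, 0, 10)
iq_data <- data.frame(IQ, motivation, grit, academic_perf, working_memory)

# Illustrative names for the two models being compared
m1 <- lm(academic_perf ~ IQ, data = iq_data)
m2 <- lm(academic_perf ~ IQ + motivation + grit, data = iq_data)

# Hypothetical helper: adjusted R-squared for n cases and p predictors
adj_r2 <- function(r2, n, p) 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Collect the statistics reported by summary() next to the hand calculation
fit_stats <- data.frame(
  model         = c("IQ only", "IQ + motivation + grit"),
  r_squared     = c(summary(m1)$r.squared, summary(m2)$r.squared),
  adj_r_squared = c(summary(m1)$adj.r.squared, summary(m2)$adj.r.squared),
  adj_by_hand   = c(adj_r2(summary(m1)$r.squared, n, 1),
                    adj_r2(summary(m2)$r.squared, n, 3))
)
fit_stats
```

Because the penalty term (n - 1)/(n - p - 1) grows with p, adjusted R-squared rises only when a new predictor adds more explanatory power than would be expected from an uninformative one.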
You are given two models:
model1 <- lm(academic_perf ~ IQ + working_memory, data = iq_data)
model2 <- lm(academic_perf ~ IQ + working_memory + motivation + grit, data = iq_data)
anova(model1, model2)
## Analysis of Variance Table
##
## Model 1: academic_perf ~ IQ + working_memory
## Model 2: academic_perf ~ IQ + working_memory + motivation + grit
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     97 9152.5
## 2     95 9112.5  2    40.005 0.2085 0.8121
Instructions:
Run the code above. What does the p-value from the ANOVA output tell you? Should we keep the more complex model?
Answer:
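The p-value is 0.81, well above 0.05, so adding motivation and grit does not significantly improve fit over the model with IQ and working memory, F(2, 95) = 0.21. We should keep the simpler Model 1. In general, a p-value below 0.05 would indicate that the extra predictors significantly improve fit and that the more complex model is worth keeping.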
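The F statistic in the ANOVA table can be reproduced from the residual sums of squares, which shows what the test compares: the drop in unexplained variation per added predictor, relative to the residual variance of the larger model. The sketch below plugs in the rounded values printed in the table, so it agrees with the table only up to rounding; all variable names are illustrative.

```r
# Values read from the ANOVA table (rounded as printed)
rss_simple  <- 9152.5   # residual sum of squares, Model 1
df_simple   <- 97       # residual degrees of freedom, Model 1
rss_complex <- 9112.5   # residual sum of squares, Model 2
df_complex  <- 95       # residual degrees of freedom, Model 2

# Number of predictors added by the more complex model
df_added <- df_simple - df_complex

# F = (reduction in RSS per added predictor) / (residual variance of Model 2)
f_stat <- ((rss_simple - rss_complex) / df_added) / (rss_complex / df_complex)

# Upper-tail probability of the F distribution
p_value <- pf(f_stat, df_added, df_complex, lower.tail = FALSE)

round(c(F = f_stat, p = p_value), 4)
```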
Suppose you fitted a linear model:
Instructions:
Create a Q-Q plot of the residuals.
Does the residual distribution look normal?
If the points follow a straight line, the residuals are approximately
normally distributed.
Why does this matter in psychological research?
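Because the t-tests, F-tests, and confidence intervals in linear regression assume normally distributed residuals. When residuals are strongly non-normal, especially in the small samples common in psychological research, the reported p-values and confidence intervals can be inaccurate.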
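A minimal sketch of the Q-Q plot step follows. The exercise does not show which model was fitted, so this example assumes the simulated `iq_data` from earlier and an illustrative model, `fit`, that predicts academic performance from IQ and working memory. A residuals-versus-fitted plot is included as a second, commonly used check.

```r
# Re-create the simulated data exactly as in the exercise
set.seed(123)
n <- 100
IQ <- rnorm(n, mean = 100, sd = 15)
motivation <- rnorm(n, mean = 50, sd = 10)
grit <- rnorm(n, mean = 60, sd = 8)
working_memory <- rnorm(n, mean = 50, sd = 10)
academic_perf <- 0.3*IQ + 0.2*motivation + 0.1*grit + 0.1*working_memory + rnorm(n, 0, 10)
iq_data <- data.frame(IQ, motivation, grit, academic_perf, working_memory)

# Assumed model for illustration; substitute the model you actually fitted
fit <- lm(academic_perf ~ IQ + working_memory, data = iq_data)
res <- resid(fit)

# Q-Q plot: points close to the reference line suggest approximately normal residuals
qqnorm(res, main = "Normal Q-Q Plot of Residuals")
qqline(res, col = "red")

# Residuals vs. fitted values: look for a patternless band around zero
plot(fitted(fit), res,
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. Fitted")
abline(h = 0, lty = 2)
```

Small departures from the line in the tails are common with 100 observations; a systematic curve or S-shape is the pattern that would suggest non-normal residuals.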
Submission Instructions:
Knit your document to HTML and check that all content displays correctly before submitting.