Independent variables: SSOverall, STOverall, and SCOverall

Dependent variable: GWA (1st sem SY: 2021-2022)

library(rmarkdown)
library(dplyr)
Warning: package 'dplyr' was built under R version 4.2.1

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(performance)
Warning: package 'performance' was built under R version 4.2.2
library(readxl)
withage <- read_excel("D:/Stat 53/withage.xlsx")
paged_table(withage)

Q1. How many observations have an age of at least 21 years?

Winny <- withage%>%
  mutate(Agecode=ifelse(Age>=21, "at least 21 years old", "Less than 21 years old"))%>%
  group_by(Agecode)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))
paged_table(Winny)

As shown in the results above, 72 observations have an age of at least 21 years.

Q2. How many observations have grades above 1.25 up to 1.75?

Winny <- withage%>%
  mutate(GWAcode=ifelse(`GWA (1st sem SY: 2021-2022)`>=1.25 & `GWA (1st sem SY: 2021-2022)`<=1.75, "GWA is in the interval [1.25, 1.75]", "Not in the given interval of GWA"))%>%
  group_by(GWAcode)%>%
  summarise(count=n())%>%
  mutate(Percentage =round((count/sum(count)*100),2))
paged_table(Winny)

As shown in the above results, there are 92 observations whose GWA is in the interval [1.25, 1.75].
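
Whether a GWA of exactly 1.25 belongs in this count depends on how the interval is read. The code above uses the closed interval [1.25, 1.75]. The sketch below compares that reading with a half-open one on a small, hypothetical vector of GWA values (not taken from `withage`), so the effect of the boundary is visible.

```r
library(dplyr)

# Hypothetical GWA values chosen to sit on and around the interval boundaries.
gwa <- c(1.00, 1.25, 1.50, 1.75, 2.00)

# Closed interval [1.25, 1.75]: both endpoints count, as in the code above.
closed <- between(gwa, 1.25, 1.75)

# Half-open interval (1.25, 1.75]: a GWA of exactly 1.25 is excluded.
half_open <- gwa > 1.25 & gwa <= 1.75

sum(closed)     # number of values in the closed interval
sum(half_open)  # number of values in the half-open interval
```

If the two counts differ on the real data, the answer to Q2 depends on which reading of the question is intended.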

Q3. Provide the results of checking the assumptions for multiple regression analysis.

multiple <- lm(`GWA (1st sem SY: 2021-2022)` ~ SSOverall + STOverall + SCOverall, data = withage)
summary(multiple)

Call:
lm(formula = `GWA (1st sem SY: 2021-2022)` ~ SSOverall + STOverall + 
    SCOverall, data = withage)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.45511 -0.11195 -0.02104  0.10446  0.57345 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.996962   0.173046  11.540   <2e-16 ***
SSOverall   -0.047674   0.038355  -1.243    0.217    
STOverall   -0.067324   0.047959  -1.404    0.163    
SCOverall   -0.005764   0.048030  -0.120    0.905    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1993 on 109 degrees of freedom
Multiple R-squared:  0.07516,   Adjusted R-squared:  0.0497 
F-statistic: 2.953 on 3 and 109 DF,  p-value: 0.03583
check_model(multiple)
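
`check_model()` gives a visual check only. As a possible complement, the `performance` package also provides separate numeric checks for each assumption. The sketch below runs them on a stand-in model fitted to the built-in `mtcars` data. That model and the name `fit` are assumptions for illustration, since `withage.xlsx` is not available here; replacing `fit` with `multiple` would apply the same checks to the GWA model.

```r
library(performance)

# Stand-in model on a built-in dataset (an assumption for illustration only);
# replace `fit` with `multiple` to check the GWA model.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

check_normality(fit)           # Shapiro-Wilk test on the residuals
check_heteroscedasticity(fit)  # Breusch-Pagan test for non-constant error variance
check_collinearity(fit)        # variance inflation factors (VIF) per predictor
check_autocorrelation(fit)     # Durbin-Watson test for correlated residuals
check_outliers(fit)            # influential observations (Cook's distance by default)
```

Each call prints a short verdict, which can be reported alongside the diagnostic plots.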

Q4. Which of the independent variables significantly predicts the dependent variable?

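The results above show that the model fits better than a model with only the intercept: the overall F-test is significant (F = 2.953 on 3 and 109 DF, p-value = 0.03583), so at least one coefficient β is significantly different from 0. However, none of the independent variables individually significantly predicts the dependent variable GWA (1st sem SY: 2021-2022) at the conventional 0.05 level: SSOverall (p-value = 0.217), STOverall (p-value = 0.163), and SCOverall (p-value = 0.905). STOverall has the smallest p-value of the three, but 0.163 is still above 0.05.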
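
The significance statements can be checked directly from the statistics reported in the `summary(multiple)` output. The sketch below recomputes the p-values from the printed estimates, standard errors, and F-statistic, then compares them with a 0.05 threshold. The threshold `alpha` is an assumption, and because the inputs are the rounded printed values, the results match the output only approximately.

```r
# Values copied from the summary(multiple) output above.
estimates  <- c(SSOverall = -0.047674, STOverall = -0.067324, SCOverall = -0.005764)
std_errors <- c(SSOverall =  0.038355, STOverall =  0.047959, SCOverall =  0.048030)
df_resid   <- 109

# Two-sided p-value for each coefficient: t = estimate / standard error.
t_values <- estimates / std_errors
p_values <- 2 * pt(-abs(t_values), df = df_resid)

# Overall F-test p-value from the reported F-statistic (3 and 109 DF).
p_overall <- pf(2.953, df1 = 3, df2 = df_resid, lower.tail = FALSE)

alpha <- 0.05                      # assumed significance level
round(p_values, 3)
names(p_values)[p_values < alpha]  # predictors significant at the chosen level
p_overall < alpha                  # whether the model as a whole is significant
```

A significant overall F-test with no individually significant coefficient can occur, for example, when the predictors are correlated with one another.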