Part A: Regression and Statistics Theory
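Below you find a part of the regression output from R for the following model:
\(mathpre = \beta_0 + \beta_1\,mars + \beta_2\,vocabpre + \beta_3\,likemath + \beta_4\,age + \beta_5\,beck + \varepsilon\)
Furthermore, the standard deviation of the dependent variable (mathpre) is given, as well as the model summary.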
Initialize model values in R
coef = c(1.86205, -0.02668, 0.21205, 0.36823, -0.01151, 0.03227) # coefficients
sde = c(1.79496, 0.01250, 0.04633, 0.10880, 0.04580, 0.04024) # standard error
rse = 2.388 # residual standard error
df = 94 # degrees of freedom
sd_mathpre = 2.85818 # standard deviation
Answer the following questions:
95% confidence interval for Beta1 (mars)
(i) Using the regression compute a 95% confidence interval for Beta1 (mars) and interpret the interval. Using the computed confidence interval, assess whether Beta1 differs from 0 in our (hypothetical) target population.
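We can calculate the confidence interval with the function qt, which returns the value of the inverse cumulative distribution function (the quantile function) of the t distribution for the given degrees of freedom df.
95% of the area under a normal curve lies within roughly 1.96 standard deviations of the mean. For a t distribution with 94 degrees of freedom the corresponding critical value is slightly larger, about 1.986.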
# i. 95% confidence interval
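coef[2] + c(-1,1) * qt(.975, df = df) * sde[2]
## [1] -0.051499043 -0.001860957
As a sanity check, the midpoint of the interval should equal the coefficient estimate for mars.
# check that the interval is centered on the estimate
(-0.051499043 + -0.001860957)/2
## [1] -0.02668
We are 95% confident that a one-unit increase in mars is associated with a change in mathpre of between -0.0515 and -0.0019, holding the other predictors constant. Since the interval does not contain 0, we conclude at the 5% level that Beta1 differs from zero in the target population.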
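The same calculation works for any coefficient. A minimal sketch, assuming a hypothetical helper `ci_coef` (not part of the original analysis) that takes an estimate, its standard error and the residual degrees of freedom:

```r
# Values copied from the regression output above
coef <- c(1.86205, -0.02668, 0.21205, 0.36823, -0.01151, 0.03227)
sde  <- c(1.79496, 0.01250, 0.04633, 0.10880, 0.04580, 0.04024)
df   <- 94

# Hypothetical helper: two-sided t-based confidence interval
ci_coef <- function(estimate, se, df, level = 0.95) {
  alpha  <- 1 - level
  t_crit <- qt(1 - alpha / 2, df = df)  # upper critical value of the t distribution
  c(lower = estimate - t_crit * se, upper = estimate + t_crit * se)
}

ci_mars <- ci_coef(coef[2], sde[2], df)
ci_mars

# TRUE when the interval lies entirely on one side of zero
excludes_zero <- ci_mars[["lower"]] > 0 || ci_mars[["upper"]] < 0
excludes_zero
```

Changing `level` (for example to 0.99) widens the interval, which is a quick way to see how sensitive the conclusion about Beta1 is to the chosen confidence level.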
t-test for Beta5 (beck)
(ii) Conduct a t-test for Beta5 (beck). Test whether the population slope Beta5 is equal to zero. Interpret the results.
To calculate the t statistic, we use the formula \(t = \hat{\beta}_5 / SE(\hat{\beta}_5)\), the estimated coefficient divided by its standard error.
# t test for beck
tstat = (coef[6])/sde[6]
tstat
## [1] 0.8019384
To test whether the population slope Beta5 is equal to zero, we calculate the p value from the t statistic. Note that pt with lower.tail = FALSE returns the one-sided (upper tail) probability; for the two-sided hypothesis Beta5 = 0 versus Beta5 ≠ 0 the p value is twice this amount.
pBeta5 = pt(abs(tstat), df = df, lower.tail = FALSE)
pBeta5
## [1] 0.2123057
The two-sided p value (2 × 0.2123 ≈ 0.4246) is well above 0.05, so we fail to reject the null hypothesis that the population slope Beta5 is equal to zero. This does not prove that beck has no effect on mathpre; it only means that, given the other predictors in the model, the data provide no evidence of one.
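The decision can be reached either through the p value or through the critical value. A minimal sketch of both routes for beck, using only the estimate and standard error from the output above (the variable names are illustrative):

```r
coef_beck <- 0.03227   # estimate for beck, from the output above
se_beck   <- 0.04024   # its standard error
df        <- 94        # residual degrees of freedom

t_beck      <- coef_beck / se_beck
p_one_sided <- pt(abs(t_beck), df = df, lower.tail = FALSE)
p_two_sided <- 2 * p_one_sided        # alternative hypothesis: Beta5 != 0
t_crit      <- qt(0.975, df = df)     # 5% two-sided critical value

# Both comparisons lead to the same decision
reject_h0 <- abs(t_beck) > t_crit
c(t = t_beck, p_two_sided = p_two_sided, t_crit = t_crit)
reject_h0
```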
R squared, Adjusted R squared, and F-test
(iii) Compute the R squared, the adjusted R squared, and conduct an F-test. Interpret all the results.
ANOVA table containing the sum of squares, degrees of freedom, mean square and F statistic.
| Source of Variance | Sum of Squares | Degrees of Freedom | Mean Square | \(F_O\) |
|---|---|---|---|---|
| Regression | \(SS_R\) = 272.711 | k = 5 | \(MS_R\) = 54.54219 | 9.564537 |
| Residual | \(SS_{RES}\) = 536.0391 | n - k - 1 = 94 | \(MS_{RES}\) = 5.702544 | |
| Total | \(SS_T\) = 808.7501 | n - 1 = 99 | | |
Below is the summary of formulas and derivations
| Variable | Value/ Derived Formula |
|---|---|
| standard deviation | 2.85818 |
| \(\sigma^2\), variance | \(sdmathpre^2\) = 8.169193 |
| df, degrees of freedom | 94 |
| rse, residual standard error | 2.388 |
| k, number of predictor variables | 5 |
| n, number of samples | \(df+k+1\) |
| \(SS_{RES}\) | \(rse^2(df)\) |
| \(SS_T\) | \(\sigma^2 (n-1)\) |
| \(SS_R\) | \(SS_T - SS_{RES}\) |
| \(MS_R\) | \(SS_R / k\) |
| \(MS_{RES}\) | \(SS_{RES} / (n-k-1)\) |
| \(F_O\) | \(MS_R / MS_{RES}\) |
| \(R^2\) | \((SS_T - SS_{RES}) / SS_T\) |
| Adjusted \(R^2\) | \(1-(1-R^2)(n-1)/(n-k-1)\) |
How they are computed in R:
Residual Sum of Squares, \(SS_{RES}\)
k = 5
n = df + k + 1
variance = sd_mathpre^2
SSres = (rse^2) * df
SSres # Residual Sum of Squares
## [1] 536.0391
Total Sum of Squares, \(SS_T\)
SSt = variance*(n-1)
SSt # Total Sum of Squares
## [1] 808.7501
Regression Sum of Squares, \(SS_R\)
SSr = SSt - SSres
SSr # Regression Sum of Squares
## [1] 272.711
Mean Square Regression, \(MS_R\)
MSr = SSr/k
MSr # Mean Square Regression
## [1] 54.54219
Mean Square Residual, \(MS_{RES}\)
MSres = SSres/(n-k-1)
MSres # Mean Square Residual
## [1] 5.702544
F Statistic, \(F_O\)
Fo = MSr/MSres
Fo # F Statistic
## [1] 9.564537
p value
pval_Fo = pf(Fo, k, df, lower.tail = FALSE)
pval_Fo
## [1] 2.112382e-07
The p value is derived from the F statistic: it is the probability of observing an F statistic at least this large if all slope coefficients were in fact zero. We can also compare the F statistic with the F critical value.
F_crit = 2.3113
From the F table, the critical value at the 5% level with 5 and 94 degrees of freedom is 2.3113. Since the F statistic (9.5645) is greater than the critical value, we reject the null hypothesis that all of the regression coefficients are equal to zero. At least one predictor is linearly related to mathpre, so the model as a whole has explanatory power.
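The critical value does not have to be read from a table. A minimal sketch using `qf`, the quantile function of the F distribution, with the degrees of freedom and F statistic from above:

```r
k  <- 5          # numerator degrees of freedom (number of predictors)
df <- 94         # denominator degrees of freedom (n - k - 1)
Fo <- 9.564537   # F statistic computed above

F_crit <- qf(0.95, df1 = k, df2 = df)   # 5% critical value
F_crit

Fo > F_crit   # TRUE means the overall F-test rejects the null hypothesis
```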
We can take a look at the R squared and Adjusted R square to see how well the model fits.
To compute the R squared, we use the sums of squares derived above. The adjusted R squared, on the other hand, can be calculated from the R squared and the n and k values:
r_squared = (SSt-SSres)/SSt
r_squared # R squared
## [1] 0.3372005
adj_r_squared = 1 - (((1-r_squared)*(n-1))/(n-k-1))
adj_r_squared # Adjusted R squared
## [1] 0.3019452
We arrive at an R squared of 33.72% and an adjusted R squared of 30.19%. The model therefore explains about a third of the variance in mathpre. The fit is modest and could be improved by removing insignificant variables or adding new ones.
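The F statistic and the R squared describe the same decomposition of variance, so one can be recovered from the other through \(F_O = (R^2/k) / ((1-R^2)/(n-k-1))\). A minimal consistency check, using the rounded values reported above (the variable name `F_from_r2` is illustrative):

```r
r_squared <- 0.3372005   # R squared computed above
k <- 5                   # number of predictors
n <- 100                 # sample size: df + k + 1 = 94 + 5 + 1

# F statistic recovered from R squared
F_from_r2 <- (r_squared / k) / ((1 - r_squared) / (n - k - 1))
F_from_r2
```

The result should agree with the F statistic from the ANOVA table up to rounding of the inputs.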
tstat1 = coef[2] / sde[2]
pBeta1 = pt(abs(tstat1), df = df, lower.tail = FALSE)
pBeta1 # mars
## [1] 0.01770593
tstat2 = coef[3] / sde[3]
pBeta2 = pt(abs(tstat2), df = df, lower.tail = FALSE)
pBeta2 # vocabpre
## [1] 7.208571e-06
tstat3 = coef[4] / sde[4]
pBeta3 = pt(abs(tstat3), df = df, lower.tail = FALSE)
pBeta3 # likemath
## [1] 0.0005208655
tstat4 = coef[5] / sde[5]
pBeta4 = pt(abs(tstat4), df = df, lower.tail = FALSE)
pBeta4 # age
## [1] 0.4010615
pBeta5 # beck
## [1] 0.2123057
These are one-sided tail probabilities; doubling them gives the two-sided p values. Either way, mars, vocabpre and likemath are significant at the 5% level, while age and beck are not. Considering their individual contribution to the model, vocabpre and likemath are the strongest predictors of mathpre.
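The five tests can also be run in one vectorized step. A minimal sketch that builds a coefficient table with two-sided p values; the term labels are taken from the comments above, and labelling the first coefficient as the intercept is an assumption:

```r
term_names <- c("(Intercept)", "mars", "vocabpre", "likemath", "age", "beck")
coef <- c(1.86205, -0.02668, 0.21205, 0.36823, -0.01151, 0.03227)
sde  <- c(1.79496, 0.01250, 0.04633, 0.10880, 0.04580, 0.04024)
df   <- 94

t_stat      <- coef / sde
p_two_sided <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

coef_table <- data.frame(
  term        = term_names,
  estimate    = coef,
  std_error   = sde,
  t           = round(t_stat, 3),
  p_two_sided = signif(p_two_sided, 3)
)
coef_table

# Predictors (excluding the intercept) significant at the 5% level
significant <- term_names[p_two_sided < 0.05 & term_names != "(Intercept)"]
significant
```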
Part B: Student Performance Case Study
Importing Student Data set
We first import the data from “students.csv”.
library(psych)
## Warning: package 'psych' was built under R version 4.1.3
students <- read.csv("students.csv", stringsAsFactors = TRUE)
We convert the categorical data into numbers. We’ll start with the binary categorical data (yes/no).
students$schoolsup <- ifelse(students$schoolsup == "yes", 1, 0)
students$famsup <- ifelse(students$famsup == "yes", 1, 0)
students$paid <- ifelse(students$paid == "yes", 1, 0)
students$activities <- ifelse(students$activities == "yes", 1, 0)
students$nursery <- ifelse(students$nursery == "yes", 1, 0)
students$higher <- ifelse(students$higher == "yes", 1, 0)
students$internet <- ifelse(students$internet == "yes", 1, 0)
students$romantic <- ifelse(students$romantic == "yes", 1, 0)
Then, we convert the remaining two-level categorical variables into 1/0 indicators. We rename each column so that its name includes the level coded as 1.
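# Compare the school column itself; comparing students$romantic (already recoded to 0/1 above) would give a constant column of zeros, which lm() reports as NA
students$school <- ifelse(students$school == "Arneo", 1, 0)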
names(students)[names(students)=="school"] <- "school_Arneo"
students$sex <- ifelse(students$sex == "F", 1, 0)
names(students)[names(students)=="sex"] <- "sex_female"
students$address <- ifelse(students$address == "R", 1, 0)
names(students)[names(students)=="address"] <- "address_R"
students$famsize <- ifelse(students$famsize == "GT3", 1, 0)
names(students)[names(students)=="famsize"] <- "famsize_GT3"
students$Pstatus <- ifelse(students$Pstatus == "A", 1, 0)
names(students)[names(students)=="Pstatus"] <- "Pstatus_A"
Lastly, we convert the categorical variables with more than two levels.
Mjob <- as.data.frame(dummy.code(students$Mjob))
names(Mjob) <- paste("Mjob_", names(Mjob), sep="")
Fjob <- as.data.frame(dummy.code(students$Fjob))
names(Fjob) <- paste("Fjob_", names(Fjob), sep="")
reason <- as.data.frame(dummy.code(students$reason))
names(reason) <- paste("reason_", names(reason), sep="")
guardian <- as.data.frame(dummy.code(students$guardian))
names(guardian) <- paste("guardian_", names(guardian), sep="")
After converting all categorical data into numbers, we combine them into a single dataframe.
students <- cbind(students, Mjob, Fjob, reason, guardian)
students <- subset(students, select= -c(`Mjob`, `Fjob`, `reason`, `guardian`))
str(students)
## 'data.frame': 395 obs. of 46 variables:
## $ school_Arneo : num 0 0 0 0 0 0 0 0 0 0 ...
## $ sex_female : num 1 1 1 1 1 0 0 1 0 0 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address_R : num 0 0 0 0 0 0 0 0 0 0 ...
## $ famsize_GT3 : num 1 1 0 1 1 0 0 1 0 1 ...
## $ Pstatus_A : num 1 0 0 0 0 0 0 1 1 0 ...
## $ Medu : int 4 1 1 4 3 4 2 4 3 3 ...
## $ Fedu : int 4 1 1 2 3 3 2 4 2 4 ...
## $ traveltime : int 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : int 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : int 0 0 3 0 0 0 0 0 0 0 ...
## $ schoolsup : num 1 0 1 0 0 0 0 1 0 0 ...
## $ famsup : num 0 1 0 1 1 1 0 1 1 1 ...
## $ paid : num 0 0 1 1 1 1 0 0 1 1 ...
## $ activities : num 0 0 0 1 0 1 0 0 0 1 ...
## $ nursery : num 1 0 1 1 1 1 1 1 1 1 ...
## $ higher : num 1 1 1 1 1 1 1 1 1 1 ...
## $ internet : num 0 1 1 1 0 1 1 0 1 1 ...
## $ romantic : num 0 0 0 1 0 0 0 0 0 0 ...
## $ famrel : int 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : int 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : int 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : int 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : int 1 1 3 1 2 2 1 1 1 1 ...
## $ health : int 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ G1 : int 5 5 7 15 6 15 12 6 16 14 ...
## $ G2 : int 6 5 8 14 10 15 12 5 18 15 ...
## $ G3 : int 6 6 10 15 10 15 11 6 19 15 ...
## $ Mjob_other : num 0 0 0 0 1 0 1 1 0 1 ...
## $ Mjob_services : num 0 0 0 0 0 1 0 0 1 0 ...
## $ Mjob_at_home : num 1 1 1 0 0 0 0 0 0 0 ...
## $ Mjob_teacher : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Mjob_health : num 0 0 0 1 0 0 0 0 0 0 ...
## $ Fjob_other : num 0 1 1 0 1 1 1 0 1 1 ...
## $ Fjob_services : num 0 0 0 1 0 0 0 0 0 0 ...
## $ Fjob_teacher : num 1 0 0 0 0 0 0 1 0 0 ...
## $ Fjob_at_home : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Fjob_health : num 0 0 0 0 0 0 0 0 0 0 ...
## $ reason_course : num 1 1 0 0 0 0 0 0 0 0 ...
## $ reason_home : num 0 0 0 1 1 0 1 1 1 1 ...
## $ reason_reputation: num 0 0 0 0 0 1 0 0 0 0 ...
## $ reason_other : num 0 0 1 0 0 0 0 0 0 0 ...
## $ guardian_mother : num 1 0 1 1 0 1 1 1 1 1 ...
## $ guardian_father : num 0 1 0 0 1 0 0 0 0 0 ...
## $ guardian_other : num 0 0 0 0 0 0 0 0 0 0 ...
We now have a dataframe that has only numerical data!
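Note that `dummy.code` keeps one indicator column per level. Together with an intercept, a complete set of indicators is perfectly collinear, so `lm()` cannot estimate one coefficient per group and reports it as NA. A minimal sketch on made-up toy data (the data frame `toy` and its values are illustrative assumptions, not the student data), contrasting the full indicator set with reference coding from base R's `model.matrix`:

```r
# Toy data, invented for illustration only
toy <- data.frame(
  score = c(10, 12, 9, 15, 11, 14, 8, 13, 12),
  job   = factor(c("teacher", "health", "other", "teacher", "health",
                   "other", "teacher", "health", "other"))
)

# Full indicator set: one column per level
full <- model.matrix(~ job - 1, data = toy)
# Reference coding: the first level ("health") is absorbed by the intercept
ref <- model.matrix(~ job, data = toy)[, -1, drop = FALSE]

fit_full <- lm(score ~ ., data = data.frame(score = toy$score, full))
fit_ref  <- lm(score ~ ., data = data.frame(score = toy$score, ref))

sum(is.na(coef(fit_full)))  # number of coefficients that cannot be estimated
sum(is.na(coef(fit_ref)))   # reference coding leaves none undefined
```

Dropping one level per variable does not change the fitted values; it only changes which group the remaining coefficients are compared against.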
Full Linear Models
G1 Full Model
We remove G2 and G3 columns for the linear model of G1. We then treat the rest as predictor variables.
##
## Call:
## lm(formula = G1 ~ ., data = df1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.5080 -1.9416 -0.0319 1.8004 7.1358
##
## Coefficients: (5 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 14.04668 3.17856 4.419 1.32e-05 ***
## school_Arneo NA NA NA NA
## sex_female -0.89395 0.34640 -2.581 0.01026 *
## age -0.06913 0.14118 -0.490 0.62470
## address_R -0.14921 0.39670 -0.376 0.70705
## famsize_GT3 -0.42957 0.33803 -1.271 0.20463
## Pstatus_A -0.15441 0.50217 -0.307 0.75865
## Medu 0.11792 0.22420 0.526 0.59925
## Fedu 0.14394 0.19239 0.748 0.45484
## traveltime -0.02432 0.23097 -0.105 0.91621
## studytime 0.60444 0.19893 3.038 0.00255 **
## failures -1.31429 0.23088 -5.692 2.62e-08 ***
## schoolsup -2.15563 0.46250 -4.661 4.46e-06 ***
## famsup -0.97932 0.33023 -2.966 0.00323 **
## paid -0.10213 0.33113 -0.308 0.75794
## activities -0.05332 0.30694 -0.174 0.86218
## nursery 0.02909 0.38009 0.077 0.93904
## higher 1.14209 0.74326 1.537 0.12528
## internet 0.25513 0.42953 0.594 0.55291
## romantic -0.21106 0.32542 -0.649 0.51703
## famrel 0.02547 0.17000 0.150 0.88098
## freetime 0.25506 0.16413 1.554 0.12108
## goout -0.41367 0.15570 -2.657 0.00824 **
## Dalc -0.06307 0.22951 -0.275 0.78363
## Walc -0.02551 0.17180 -0.148 0.88205
## health -0.16760 0.11163 -1.501 0.13415
## absences 0.01222 0.01988 0.615 0.53908
## Mjob_other -1.70825 0.64164 -2.662 0.00811 **
## Mjob_services -0.45977 0.61480 -0.748 0.45506
## Mjob_at_home -0.92644 0.77557 -1.195 0.23307
## Mjob_teacher -1.84882 0.66351 -2.786 0.00562 **
## Mjob_health NA NA NA NA
## Fjob_other -0.58164 0.77457 -0.751 0.45319
## Fjob_services -0.43994 0.77757 -0.566 0.57189
## Fjob_teacher 1.73982 0.90726 1.918 0.05595 .
## Fjob_at_home 0.55394 0.99710 0.556 0.57886
## Fjob_health NA NA NA NA
## reason_course 0.18025 0.56474 0.319 0.74978
## reason_home 0.34588 0.58422 0.592 0.55420
## reason_reputation 0.62368 0.59067 1.056 0.29173
## reason_other NA NA NA NA
## guardian_mother -0.81523 0.63641 -1.281 0.20103
## guardian_father -0.86518 0.69022 -1.253 0.21085
## guardian_other NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.85 on 356 degrees of freedom
## Multiple R-squared: 0.3339, Adjusted R-squared: 0.2628
## F-statistic: 4.696 on 38 and 356 DF, p-value: 1.652e-15
Based on the full linear model of G1, the p-value of the overall F-test (1.652e-15) is far below 0.05, which indicates that the predictors are jointly significant: at least one of them has a nonzero effect on G1. Looking at the individual predictors, failures and schoolsup have very small p-values (below 0.001), which marks them as the strongest predictors of G1. In addition, studytime, famsup, goout, Mjob_other, and Mjob_teacher all have p-values less than 0.01 and are also important in the analysis.
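The fitting code is not shown in this report. A minimal sketch of the same pattern, assuming the built-in `mtcars` data as a stand-in for the student data and a hypothetical helper `significant_terms`: drop the columns that must not act as predictors, regress the target on everything else, and list the terms whose p-values fall below a threshold.

```r
# Stand-in data: mtcars ships with R; the student data is not used here
dat <- subset(mtcars, select = c(mpg, wt, hp, qsec))

# Analogue of removing G2 and G3 before modelling G1:
# drop qsec, then regress mpg on all remaining columns
dat1 <- subset(dat, select = -c(qsec))
fit  <- lm(mpg ~ ., data = dat1)

# Hypothetical helper: names of terms with a p-value below alpha
significant_terms <- function(model, alpha = 0.05) {
  tab  <- coef(summary(model))            # matrix with a "Pr(>|t|)" column
  keep <- tab[, "Pr(>|t|)"] < alpha
  setdiff(rownames(tab)[keep], "(Intercept)")
}

significant_terms(fit, alpha = 0.01)
```

Reading the names off the coefficient matrix avoids transcription mistakes when a model has dozens of terms, as the full grade models do.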
G2 Full Model
We remove G1 and G3 columns for the linear model of G2. We then treat the rest as predictor variables.
## 'data.frame': 395 obs. of 44 variables:
## $ school_Arneo : num 0 0 0 0 0 0 0 0 0 0 ...
## $ sex_female : num 1 1 1 1 1 0 0 1 0 0 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address_R : num 0 0 0 0 0 0 0 0 0 0 ...
## $ famsize_GT3 : num 1 1 0 1 1 0 0 1 0 1 ...
## $ Pstatus_A : num 1 0 0 0 0 0 0 1 1 0 ...
## $ Medu : int 4 1 1 4 3 4 2 4 3 3 ...
## $ Fedu : int 4 1 1 2 3 3 2 4 2 4 ...
## $ traveltime : int 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : int 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : int 0 0 3 0 0 0 0 0 0 0 ...
## $ schoolsup : num 1 0 1 0 0 0 0 1 0 0 ...
## $ famsup : num 0 1 0 1 1 1 0 1 1 1 ...
## $ paid : num 0 0 1 1 1 1 0 0 1 1 ...
## $ activities : num 0 0 0 1 0 1 0 0 0 1 ...
## $ nursery : num 1 0 1 1 1 1 1 1 1 1 ...
## $ higher : num 1 1 1 1 1 1 1 1 1 1 ...
## $ internet : num 0 1 1 1 0 1 1 0 1 1 ...
## $ romantic : num 0 0 0 1 0 0 0 0 0 0 ...
## $ famrel : int 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : int 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : int 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : int 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : int 1 1 3 1 2 2 1 1 1 1 ...
## $ health : int 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ G2 : int 6 5 8 14 10 15 12 5 18 15 ...
## $ Mjob_other : num 0 0 0 0 1 0 1 1 0 1 ...
## $ Mjob_services : num 0 0 0 0 0 1 0 0 1 0 ...
## $ Mjob_at_home : num 1 1 1 0 0 0 0 0 0 0 ...
## $ Mjob_teacher : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Mjob_health : num 0 0 0 1 0 0 0 0 0 0 ...
## $ Fjob_other : num 0 1 1 0 1 1 1 0 1 1 ...
## $ Fjob_services : num 0 0 0 1 0 0 0 0 0 0 ...
## $ Fjob_teacher : num 1 0 0 0 0 0 0 1 0 0 ...
## $ Fjob_at_home : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Fjob_health : num 0 0 0 0 0 0 0 0 0 0 ...
## $ reason_course : num 1 1 0 0 0 0 0 0 0 0 ...
## $ reason_home : num 0 0 0 1 1 0 1 1 1 1 ...
## $ reason_reputation: num 0 0 0 0 0 1 0 0 0 0 ...
## $ reason_other : num 0 0 1 0 0 0 0 0 0 0 ...
## $ guardian_mother : num 1 0 1 1 0 1 1 1 1 1 ...
## $ guardian_father : num 0 1 0 0 1 0 0 0 0 0 ...
## $ guardian_other : num 0 0 0 0 0 0 0 0 0 0 ...
##
## Call:
## lm(formula = G2 ~ ., data = df2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -11.381 -1.872 0.092 2.024 7.646
##
## Coefficients: (5 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.298147 3.709308 4.663 4.41e-06 ***
## school_Arneo NA NA NA NA
## sex_female -0.951493 0.404244 -2.354 0.01913 *
## age -0.172679 0.164752 -1.048 0.29530
## address_R -0.398862 0.462943 -0.862 0.38950
## famsize_GT3 -0.621330 0.394469 -1.575 0.11612
## Pstatus_A 0.228515 0.586016 0.390 0.69681
## Medu 0.317827 0.261630 1.215 0.22525
## Fedu 0.006509 0.224509 0.029 0.97689
## traveltime -0.326927 0.269537 -1.213 0.22597
## studytime 0.556826 0.232149 2.399 0.01697 *
## failures -1.376552 0.269434 -5.109 5.29e-07 ***
## schoolsup -1.468419 0.539727 -2.721 0.00683 **
## famsup -0.908141 0.385375 -2.357 0.01899 *
## paid 0.302538 0.386424 0.783 0.43420
## activities 0.012523 0.358190 0.035 0.97213
## nursery 0.028888 0.443559 0.065 0.94811
## higher 1.008248 0.867366 1.162 0.24584
## internet 0.613606 0.501254 1.224 0.22171
## romantic -0.813367 0.379759 -2.142 0.03289 *
## famrel -0.142657 0.198386 -0.719 0.47256
## freetime 0.222654 0.191538 1.162 0.24583
## goout -0.552966 0.181697 -3.043 0.00251 **
## Dalc -0.076683 0.267829 -0.286 0.77481
## Walc 0.091159 0.200492 0.455 0.64962
## health -0.219217 0.130275 -1.683 0.09330 .
## absences 0.007108 0.023203 0.306 0.75952
## Mjob_other -1.306330 0.748774 -1.745 0.08191 .
## Mjob_services -0.470345 0.717462 -0.656 0.51253
## Mjob_at_home -1.020420 0.905073 -1.127 0.26031
## Mjob_teacher -2.097446 0.774304 -2.709 0.00708 **
## Mjob_health NA NA NA NA
## Fjob_other -0.467092 0.903900 -0.517 0.60565
## Fjob_services -0.063045 0.907406 -0.069 0.94465
## Fjob_teacher 1.127161 1.058753 1.065 0.28777
## Fjob_at_home -0.112199 1.163593 -0.096 0.92324
## Fjob_health NA NA NA NA
## reason_course -0.550519 0.659043 -0.835 0.40409
## reason_home -0.282107 0.681766 -0.414 0.67928
## reason_reputation -0.147244 0.689293 -0.214 0.83097
## reason_other NA NA NA NA
## guardian_mother -0.723643 0.742680 -0.974 0.33054
## guardian_father -0.575239 0.805476 -0.714 0.47560
## guardian_other NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.326 on 356 degrees of freedom
## Multiple R-squared: 0.2936, Adjusted R-squared: 0.2182
## F-statistic: 3.895 on 38 and 356 DF, p-value: 5.88e-12
Based on the full linear model of G2, the p-value of the overall F-test (5.88e-12) is far below 0.05, which indicates that the predictors are jointly significant: at least one of them has a nonzero effect on G2. Looking at the individual predictors, failures has a very small p-value (below 0.001), which marks it as the strongest predictor of G2. In addition, schoolsup, goout, and Mjob_teacher have p-values less than 0.01, while sex_female, studytime, famsup, and romantic have p-values less than 0.05, so these are also important in the analysis.
G3 Full Model
We remove G1 and G2 columns for the linear model of G3. We then treat the rest as predictor variables.
## 'data.frame': 395 obs. of 44 variables:
## $ school_Arneo : num 0 0 0 0 0 0 0 0 0 0 ...
## $ sex_female : num 1 1 1 1 1 0 0 1 0 0 ...
## $ age : int 18 17 15 15 16 16 16 17 15 15 ...
## $ address_R : num 0 0 0 0 0 0 0 0 0 0 ...
## $ famsize_GT3 : num 1 1 0 1 1 0 0 1 0 1 ...
## $ Pstatus_A : num 1 0 0 0 0 0 0 1 1 0 ...
## $ Medu : int 4 1 1 4 3 4 2 4 3 3 ...
## $ Fedu : int 4 1 1 2 3 3 2 4 2 4 ...
## $ traveltime : int 2 1 1 1 1 1 1 2 1 1 ...
## $ studytime : int 2 2 2 3 2 2 2 2 2 2 ...
## $ failures : int 0 0 3 0 0 0 0 0 0 0 ...
## $ schoolsup : num 1 0 1 0 0 0 0 1 0 0 ...
## $ famsup : num 0 1 0 1 1 1 0 1 1 1 ...
## $ paid : num 0 0 1 1 1 1 0 0 1 1 ...
## $ activities : num 0 0 0 1 0 1 0 0 0 1 ...
## $ nursery : num 1 0 1 1 1 1 1 1 1 1 ...
## $ higher : num 1 1 1 1 1 1 1 1 1 1 ...
## $ internet : num 0 1 1 1 0 1 1 0 1 1 ...
## $ romantic : num 0 0 0 1 0 0 0 0 0 0 ...
## $ famrel : int 4 5 4 3 4 5 4 4 4 5 ...
## $ freetime : int 3 3 3 2 3 4 4 1 2 5 ...
## $ goout : int 4 3 2 2 2 2 4 4 2 1 ...
## $ Dalc : int 1 1 2 1 1 1 1 1 1 1 ...
## $ Walc : int 1 1 3 1 2 2 1 1 1 1 ...
## $ health : int 3 3 3 5 5 5 3 1 1 5 ...
## $ absences : int 6 4 10 2 4 10 0 6 0 0 ...
## $ G3 : int 6 6 10 15 10 15 11 6 19 15 ...
## $ Mjob_other : num 0 0 0 0 1 0 1 1 0 1 ...
## $ Mjob_services : num 0 0 0 0 0 1 0 0 1 0 ...
## $ Mjob_at_home : num 1 1 1 0 0 0 0 0 0 0 ...
## $ Mjob_teacher : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Mjob_health : num 0 0 0 1 0 0 0 0 0 0 ...
## $ Fjob_other : num 0 1 1 0 1 1 1 0 1 1 ...
## $ Fjob_services : num 0 0 0 1 0 0 0 0 0 0 ...
## $ Fjob_teacher : num 1 0 0 0 0 0 0 1 0 0 ...
## $ Fjob_at_home : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Fjob_health : num 0 0 0 0 0 0 0 0 0 0 ...
## $ reason_course : num 1 1 0 0 0 0 0 0 0 0 ...
## $ reason_home : num 0 0 0 1 1 0 1 1 1 1 ...
## $ reason_reputation: num 0 0 0 0 0 1 0 0 0 0 ...
## $ reason_other : num 0 0 1 0 0 0 0 0 0 0 ...
## $ guardian_mother : num 1 0 1 1 0 1 1 1 1 1 ...
## $ guardian_father : num 0 1 0 0 1 0 0 0 0 0 ...
## $ guardian_other : num 0 0 0 0 0 0 0 0 0 0 ...
##
## Call:
## lm(formula = G3 ~ ., data = df3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -13.2544 -1.8733 0.4061 2.6915 8.6700
##
## Coefficients: (5 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 18.06973 4.58066 3.945 9.62e-05 ***
## school_Arneo NA NA NA NA
## sex_female -1.23790 0.49920 -2.480 0.01361 *
## age -0.30554 0.20345 -1.502 0.13405
## address_R -0.44206 0.57169 -0.773 0.43989
## famsize_GT3 -0.73139 0.48713 -1.501 0.13413
## Pstatus_A 0.31174 0.72368 0.431 0.66689
## Medu 0.45495 0.32309 1.408 0.15997
## Fedu -0.09257 0.27725 -0.334 0.73865
## traveltime -0.18190 0.33285 -0.546 0.58508
## studytime 0.52860 0.28668 1.844 0.06604 .
## failures -1.73161 0.33273 -5.204 3.30e-07 ***
## schoolsup -1.36781 0.66651 -2.052 0.04088 *
## famsup -0.90818 0.47590 -1.908 0.05715 .
## paid 0.35859 0.47720 0.751 0.45288
## activities -0.37279 0.44233 -0.843 0.39991
## nursery -0.21365 0.54776 -0.390 0.69674
## higher 1.47826 1.07112 1.380 0.16842
## internet 0.47741 0.61900 0.771 0.44107
## romantic -1.08274 0.46897 -2.309 0.02153 *
## famrel 0.21248 0.24499 0.867 0.38636
## freetime 0.31991 0.23653 1.353 0.17707
## goout -0.59904 0.22438 -2.670 0.00794 **
## Dalc -0.26663 0.33074 -0.806 0.42069
## Walc 0.25105 0.24759 1.014 0.31128
## health -0.18193 0.16088 -1.131 0.25887
## absences 0.05244 0.02865 1.830 0.06806 .
## Mjob_other -1.34401 0.92467 -1.454 0.14697
## Mjob_services -0.35149 0.88600 -0.397 0.69181
## Mjob_at_home -1.01979 1.11768 -0.912 0.36217
## Mjob_teacher -2.23178 0.95620 -2.334 0.02015 *
## Mjob_health NA NA NA NA
## Fjob_other -0.97976 1.11624 -0.878 0.38068
## Fjob_services -0.76330 1.12056 -0.681 0.49620
## Fjob_teacher 0.93643 1.30746 0.716 0.47433
## Fjob_at_home -0.30640 1.43693 -0.213 0.83127
## Fjob_health NA NA NA NA
## reason_course -0.84663 0.81386 -1.040 0.29892
## reason_home -0.76652 0.84192 -0.910 0.36321
## reason_reputation -0.27546 0.85121 -0.324 0.74642
## reason_other NA NA NA NA
## guardian_mother -0.61281 0.91714 -0.668 0.50446
## guardian_father -0.66279 0.99469 -0.666 0.50563
## guardian_other NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.107 on 356 degrees of freedom
## Multiple R-squared: 0.2739, Adjusted R-squared: 0.1964
## F-statistic: 3.533 on 38 and 356 DF, p-value: 2.357e-10
Based on the full linear model of G3, the p-value of the overall F-test (2.357e-10) is far below 0.05, which indicates that the predictors are jointly significant: at least one of them has a nonzero effect on G3. Across the three full linear models, failures, marked with three stars (***), is consistently the most significant predictor. Other important predictors that are significant at the 5% level in all three models include:
schoolsup
goout
Mjob_teacher
Checking for outliers: studentized residuals
To check for outliers, we used studentized residuals with a cutoff of ±3. No observation has a studentized residual above 3, so we only need to look at the lower end, below -3. There, the full linear model of G3 has one outlier.
To identify the specific row of the outlier, we looked up its row index. Row 260 was identified as the outlier, so we delete row 260 from our dataframe, df3.
## 260
## 260
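The code for this check is not shown in the report. A minimal sketch of a studentized residual check, assuming simulated toy data with one planted outlier (the data, the seed and the variable names are illustrative, not the student data):

```r
set.seed(1)                      # reproducible toy data, invented for illustration
x <- 1:30
y <- 2 * x + rnorm(30)
y[15] <- y[15] - 20              # plant one unusually low response
toy <- data.frame(x = x, y = y)

fit  <- lm(y ~ x, data = toy)
stud <- rstudent(fit)            # externally studentized residuals

outlier_rows <- which(abs(stud) > 3)   # rows beyond the +/- 3 cutoff
outlier_rows

# Refit without the flagged rows
toy_clean <- toy[-outlier_rows, ]
fit_clean <- lm(y ~ x, data = toy_clean)
```

`which()` on a named vector prints both the row name and the position, which is one plausible reason the output above shows 260 twice.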
We now refit the linear model on the student dataset without the outlier.
G3_new <- lm(G3 ~ ., data= df3)
summary(G3_new)
##
## Call:
## lm(formula = G3 ~ ., data = df3)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.7797 -1.8629 0.3995 2.6576 9.2260
##
## Coefficients: (5 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.95534 4.51232 3.979 8.39e-05 ***
## school_Arneo NA NA NA NA
## sex_female -1.18388 0.49199 -2.406 0.01663 *
## age -0.29149 0.20045 -1.454 0.14679
## address_R -0.46193 0.56318 -0.820 0.41264
## famsize_GT3 -0.84758 0.48104 -1.762 0.07893 .
## Pstatus_A 0.29846 0.71287 0.419 0.67570
## Medu 0.41270 0.31850 1.296 0.19590
## Fedu -0.08864 0.27311 -0.325 0.74571
## traveltime -0.20834 0.32797 -0.635 0.52569
## studytime 0.63677 0.28414 2.241 0.02564 *
## failures -1.73305 0.32775 -5.288 2.17e-07 ***
## schoolsup -1.40282 0.65663 -2.136 0.03333 *
## famsup -1.04547 0.47048 -2.222 0.02690 *
## paid 0.48686 0.47154 1.032 0.30254
## activities -0.29690 0.43628 -0.681 0.49661
## nursery -0.18293 0.53964 -0.339 0.73482
## higher 1.46268 1.05512 1.386 0.16654
## internet 0.46950 0.60976 0.770 0.44183
## romantic -0.96966 0.46312 -2.094 0.03699 *
## famrel 0.15784 0.24185 0.653 0.51440
## freetime 0.40206 0.23421 1.717 0.08692 .
## goout -0.67856 0.22223 -3.053 0.00243 **
## Dalc -0.26306 0.32580 -0.807 0.41996
## Walc 0.24477 0.24390 1.004 0.31627
## health -0.21871 0.15883 -1.377 0.16939
## absences 0.05191 0.02823 1.839 0.06674 .
## Mjob_other -1.40726 0.91103 -1.545 0.12331
## Mjob_services -0.25689 0.87319 -0.294 0.76878
## Mjob_at_home -1.12966 1.10144 -1.026 0.30577
## Mjob_teacher -2.28914 0.94205 -2.430 0.01560 *
## Mjob_health NA NA NA NA
## Fjob_other -0.94977 1.09959 -0.864 0.38831
## Fjob_services -0.67589 1.10411 -0.612 0.54082
## Fjob_teacher 1.00437 1.28807 0.780 0.43606
## Fjob_at_home -0.43394 1.41594 -0.306 0.75943
## Fjob_health NA NA NA NA
## reason_course -0.67235 0.80329 -0.837 0.40316
## reason_home -0.71445 0.82947 -0.861 0.38964
## reason_reputation -0.28130 0.83849 -0.335 0.73746
## reason_other NA NA NA NA
## guardian_mother -0.54427 0.90365 -0.602 0.54736
## guardian_father -0.43856 0.98198 -0.447 0.65543
## guardian_other NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.046 on 355 degrees of freedom
## Multiple R-squared: 0.288, Adjusted R-squared: 0.2118
## F-statistic: 3.779 on 38 and 355 DF, p-value: 1.937e-11
Identify significant variables
To identify the significant variables, we use the Stepwise Regression Method.
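`step()` compares candidate models by AIC. For linear models the printed criterion is \(n \log(RSS/n) + 2p\), where p is the number of estimated coefficients. A minimal sketch, assuming a hypothetical helper `step_aic` and plugging in the RSS values that appear in the stepwise trace for this model:

```r
# Hypothetical helper reproducing the AIC that step() prints for lm fits
step_aic <- function(rss, n, n_coef) {
  n * log(rss / n) + 2 * n_coef
}

n <- 394   # 395 students minus the removed outlier

# Starting model: 38 estimable slopes plus the intercept
aic_full <- step_aic(rss = 5810.5, n = n, n_coef = 39)
# Candidate model after dropping Mjob_services
aic_drop <- step_aic(rss = 5811.9, n = n, n_coef = 38)

round(c(full = aic_full, drop_Mjob_services = aic_drop), 2)
aic_drop < aic_full   # TRUE means the smaller model is preferred
```

Dropping a weak predictor raises the RSS only slightly while saving 2 units of penalty, which is why each step of the search removes the term whose deletion gives the lowest AIC.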
Stepwise = step(G3_new, scope = list(lower = ~1, upper = ~.), direction = "both", trace = 1)
## Start: AIC=1138.28
## G3 ~ school_Arneo + sex_female + age + address_R + famsize_GT3 +
## Pstatus_A + Medu + Fedu + traveltime + studytime + failures +
## schoolsup + famsup + paid + activities + nursery + higher +
## internet + romantic + famrel + freetime + goout + Dalc +
## Walc + health + absences + Mjob_other + Mjob_services + Mjob_at_home +
## Mjob_teacher + Mjob_health + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + Fjob_health + reason_course +
## reason_home + reason_reputation + reason_other + guardian_mother +
## guardian_father + guardian_other
##
##
## Step: AIC=1138.28
## G3 ~ school_Arneo + sex_female + age + address_R + famsize_GT3 +
## Pstatus_A + Medu + Fedu + traveltime + studytime + failures +
## schoolsup + famsup + paid + activities + nursery + higher +
## internet + romantic + famrel + freetime + goout + Dalc +
## Walc + health + absences + Mjob_other + Mjob_services + Mjob_at_home +
## Mjob_teacher + Mjob_health + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + Fjob_health + reason_course +
## reason_home + reason_reputation + reason_other + guardian_mother +
## guardian_father
##
##
## Step: AIC=1138.28
## G3 ~ school_Arneo + sex_female + age + address_R + famsize_GT3 +
## Pstatus_A + Medu + Fedu + traveltime + studytime + failures +
## schoolsup + famsup + paid + activities + nursery + higher +
## internet + romantic + famrel + freetime + goout + Dalc +
## Walc + health + absences + Mjob_other + Mjob_services + Mjob_at_home +
## Mjob_teacher + Mjob_health + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + Fjob_health + reason_course +
## reason_home + reason_reputation + guardian_mother + guardian_father
##
##
## Step: AIC=1138.28
## G3 ~ school_Arneo + sex_female + age + address_R + famsize_GT3 +
## Pstatus_A + Medu + Fedu + traveltime + studytime + failures +
## schoolsup + famsup + paid + activities + nursery + higher +
## internet + romantic + famrel + freetime + goout + Dalc +
## Walc + health + absences + Mjob_other + Mjob_services + Mjob_at_home +
## Mjob_teacher + Mjob_health + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + reason_course + reason_home +
## reason_reputation + guardian_mother + guardian_father
##
##
## Step: AIC=1138.28
## G3 ~ school_Arneo + sex_female + age + address_R + famsize_GT3 +
## Pstatus_A + Medu + Fedu + traveltime + studytime + failures +
## schoolsup + famsup + paid + activities + nursery + higher +
## internet + romantic + famrel + freetime + goout + Dalc +
## Walc + health + absences + Mjob_other + Mjob_services + Mjob_at_home +
## Mjob_teacher + Fjob_other + Fjob_services + Fjob_teacher +
## Fjob_at_home + reason_course + reason_home + reason_reputation +
## guardian_mother + guardian_father
##
##
## Step: AIC=1138.28
## G3 ~ sex_female + age + address_R + famsize_GT3 + Pstatus_A +
## Medu + Fedu + traveltime + studytime + failures + schoolsup +
## famsup + paid + activities + nursery + higher + internet +
## romantic + famrel + freetime + goout + Dalc + Walc + health +
## absences + Mjob_other + Mjob_services + Mjob_at_home + Mjob_teacher +
## Fjob_other + Fjob_services + Fjob_teacher + Fjob_at_home +
## reason_course + reason_home + reason_reputation + guardian_mother +
## guardian_father
##
## Df Sum of Sq RSS AIC
## - Mjob_services 1 1.42 5811.9 1136.4
## - Fjob_at_home 1 1.54 5812.0 1136.4
## - Fedu 1 1.72 5812.2 1136.4
## - reason_reputation 1 1.84 5812.3 1136.4
## - nursery 1 1.88 5812.4 1136.4
## - Pstatus_A 1 2.87 5813.4 1136.5
## - guardian_father 1 3.26 5813.7 1136.5
## - guardian_mother 1 5.94 5816.4 1136.7
## - Fjob_services 1 6.13 5816.6 1136.7
## - traveltime 1 6.60 5817.1 1136.7
## - famrel 1 6.97 5817.5 1136.8
## - activities 1 7.58 5818.1 1136.8
## - internet 1 9.70 5820.2 1136.9
## - Fjob_teacher 1 9.95 5820.4 1137.0
## - Dalc 1 10.67 5821.2 1137.0
## - address_R 1 11.01 5821.5 1137.0
## - reason_course 1 11.47 5822.0 1137.1
## - reason_home 1 12.14 5822.6 1137.1
## - Fjob_other 1 12.21 5822.7 1137.1
## - Walc 1 16.48 5827.0 1137.4
## - Mjob_at_home 1 17.22 5827.7 1137.5
## - paid 1 17.45 5827.9 1137.5
## - Medu 1 27.48 5838.0 1138.1
## <none> 5810.5 1138.3
## - health 1 31.03 5841.5 1138.4
## - higher 1 31.45 5841.9 1138.4
## - age 1 34.61 5845.1 1138.6
## - Mjob_other 1 39.05 5849.5 1138.9
## - freetime 1 48.23 5858.7 1139.5
## - famsize_GT3 1 50.81 5861.3 1139.7
## - absences 1 55.36 5865.8 1140.0
## - romantic 1 71.75 5882.2 1141.1
## - schoolsup 1 74.70 5885.2 1141.3
## - famsup 1 80.82 5891.3 1141.7
## - studytime 1 82.20 5892.7 1141.8
## - sex_female 1 94.77 5905.3 1142.7
## - Mjob_teacher 1 96.64 5907.1 1142.8
## - goout 1 152.61 5963.1 1146.5
## - failures 1 457.63 6268.1 1166.2
##
## Step: AIC=1136.38
## G3 ~ sex_female + age + address_R + famsize_GT3 + Pstatus_A +
## Medu + Fedu + traveltime + studytime + failures + schoolsup +
## famsup + paid + activities + nursery + higher + internet +
## romantic + famrel + freetime + goout + Dalc + Walc + health +
## absences + Mjob_other + Mjob_at_home + Mjob_teacher + Fjob_other +
## Fjob_services + Fjob_teacher + Fjob_at_home + reason_course +
## reason_home + reason_reputation + guardian_mother + guardian_father
##
## Df Sum of Sq RSS AIC
## - nursery 1 1.79 5813.7 1134.5
## - reason_reputation 1 2.00 5813.9 1134.5
## - Fedu 1 2.02 5813.9 1134.5
## - Fjob_at_home 1 2.12 5814.0 1134.5
## - Pstatus_A 1 2.65 5814.6 1134.6
## - guardian_father 1 3.30 5815.2 1134.6
## - guardian_mother 1 6.24 5818.1 1134.8
## - famrel 1 6.47 5818.4 1134.8
## - traveltime 1 6.88 5818.8 1134.8
## - Fjob_services 1 7.34 5819.2 1134.9
## - activities 1 7.70 5819.6 1134.9
## - Fjob_teacher 1 8.94 5820.8 1135.0
## - internet 1 10.04 5821.9 1135.1
## - address_R 1 10.91 5822.8 1135.1
## - Dalc 1 11.87 5823.8 1135.2
## - reason_course 1 12.20 5824.1 1135.2
## - reason_home 1 12.51 5824.4 1135.2
## - Fjob_other 1 13.89 5825.8 1135.3
## - Walc 1 17.37 5829.3 1135.5
## - paid 1 17.47 5829.4 1135.6
## - Mjob_at_home 1 21.76 5833.7 1135.8
## <none> 5811.9 1136.4
## - health 1 30.56 5842.5 1136.4
## - higher 1 31.91 5843.8 1136.5
## - Medu 1 32.57 5844.5 1136.6
## - age 1 34.80 5846.7 1136.7
## - freetime 1 48.48 5860.4 1137.7
## - famsize_GT3 1 51.35 5863.2 1137.8
## - absences 1 54.19 5866.1 1138.0
## + Mjob_services 1 1.42 5810.5 1138.3
## + Mjob_health 1 1.42 5810.5 1138.3
## - romantic 1 71.20 5883.1 1139.2
## - Mjob_other 1 74.16 5886.1 1139.4
## - schoolsup 1 76.71 5888.6 1139.5
## - famsup 1 80.24 5892.1 1139.8
## - studytime 1 80.98 5892.9 1139.8
## - sex_female 1 94.39 5906.3 1140.7
## - Mjob_teacher 1 142.53 5954.4 1143.9
## - goout 1 151.76 5963.7 1144.5
## - failures 1 460.35 6272.2 1164.4
##
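The AIC column in this trace can be checked against the RSS column. The sketch below assumes the trace comes from `step()` on an `lm` fit, where `extractAIC()` computes `n * log(RSS / n) + 2 * edf`. The sample size `n = 394` is an assumption: it is not printed anywhere in the output and is used here only because it reproduces the printed values. The helper name `aic_lm` is hypothetical.

```r
# Assumption: n = 394 observations (inferred, not stated in the trace).
n <- 394

# AIC as step() reports it for a linear model:
# n * log(RSS / n) + 2 * (number of estimated coefficients).
aic_lm <- function(rss, n_coef, n) {
  n * log(rss / n) + 2 * n_coef
}

# Model at "Step: AIC=1136.38": 37 predictors + intercept, RSS = 5811.9.
aic_before <- aic_lm(5811.9, 38, n)

# After dropping nursery: 36 predictors + intercept, RSS = 5813.7.
aic_after <- aic_lm(5813.7, 37, n)

round(c(before = aic_before, after = aic_after), 2)
```

Under that assumption the two values agree with the `<none>` row (1136.4) and the `- nursery` row (1134.5) to the printed precision, which is why `nursery` is the term removed at this step.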
## Step: AIC=1134.5
## G3 ~ sex_female + age + address_R + famsize_GT3 + Pstatus_A +
## Medu + Fedu + traveltime + studytime + failures + schoolsup +
## famsup + paid + activities + higher + internet + romantic +
## famrel + freetime + goout + Dalc + Walc + health + absences +
## Mjob_other + Mjob_at_home + Mjob_teacher + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + reason_course + reason_home +
## reason_reputation + guardian_mother + guardian_father
##
## Df Sum of Sq RSS AIC
## - reason_reputation 1 2.04 5815.7 1132.6
## - Fedu 1 2.30 5816.0 1132.7
## - Fjob_at_home 1 2.42 5816.1 1132.7
## - Pstatus_A 1 2.43 5816.1 1132.7
## - guardian_father 1 4.07 5817.8 1132.8
## - famrel 1 6.51 5820.2 1132.9
## - traveltime 1 7.29 5821.0 1133.0
## - activities 1 7.36 5821.1 1133.0
## - Fjob_services 1 7.47 5821.2 1133.0
## - guardian_mother 1 7.59 5821.3 1133.0
## - Fjob_teacher 1 9.04 5822.7 1133.1
## - address_R 1 10.60 5824.3 1133.2
## - internet 1 10.66 5824.3 1133.2
## - Dalc 1 11.45 5825.1 1133.3
## - reason_course 1 12.01 5825.7 1133.3
## - reason_home 1 12.58 5826.3 1133.3
## - Fjob_other 1 13.90 5827.6 1133.4
## - paid 1 16.70 5830.4 1133.6
## - Walc 1 18.23 5831.9 1133.7
## - Mjob_at_home 1 21.42 5835.1 1134.0
## <none> 5813.7 1134.5
## - health 1 30.62 5844.3 1134.6
## - Medu 1 31.85 5845.5 1134.7
## - higher 1 31.87 5845.6 1134.7
## - age 1 34.90 5848.6 1134.9
## - freetime 1 48.35 5862.0 1135.8
## - famsize_GT3 1 50.00 5863.7 1135.9
## - absences 1 53.51 5867.2 1136.1
## + nursery 1 1.79 5811.9 1136.4
## + Mjob_services 1 1.33 5812.4 1136.4
## + Mjob_health 1 1.33 5812.4 1136.4
## - romantic 1 72.09 5885.8 1137.3
## - Mjob_other 1 73.37 5887.1 1137.4
## - schoolsup 1 77.95 5891.6 1137.8
## - studytime 1 79.79 5893.5 1137.9
## - famsup 1 80.17 5893.9 1137.9
## - sex_female 1 93.31 5907.0 1138.8
## - Mjob_teacher 1 142.94 5956.6 1142.1
## - goout 1 153.55 5967.2 1142.8
## - failures 1 461.52 6275.2 1162.6
##
## Step: AIC=1132.64
## G3 ~ sex_female + age + address_R + famsize_GT3 + Pstatus_A +
## Medu + Fedu + traveltime + studytime + failures + schoolsup +
## famsup + paid + activities + higher + internet + romantic +
## famrel + freetime + goout + Dalc + Walc + health + absences +
## Mjob_other + Mjob_at_home + Mjob_teacher + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + reason_course + reason_home +
## guardian_mother + guardian_father
##
## Df Sum of Sq RSS AIC
## - Pstatus_A 1 2.38 5818.1 1130.8
## - Fedu 1 2.52 5818.3 1130.8
## - Fjob_at_home 1 2.58 5818.3 1130.8
## - guardian_father 1 4.06 5819.8 1130.9
## - famrel 1 6.52 5822.2 1131.1
## - guardian_mother 1 7.42 5823.2 1131.1
## - traveltime 1 7.45 5823.2 1131.1
## - Fjob_services 1 7.52 5823.3 1131.2
## - activities 1 7.76 5823.5 1131.2
## - Fjob_teacher 1 9.38 5825.1 1131.3
## - Dalc 1 10.39 5826.1 1131.3
## - internet 1 10.50 5826.2 1131.3
## - address_R 1 10.71 5826.4 1131.4
## - reason_course 1 13.46 5829.2 1131.5
## - reason_home 1 14.13 5829.9 1131.6
## - Fjob_other 1 14.40 5830.1 1131.6
## - paid 1 17.68 5833.4 1131.8
## - Walc 1 18.05 5833.8 1131.9
## - Mjob_at_home 1 20.89 5836.6 1132.0
## <none> 5815.7 1132.6
## - health 1 29.80 5845.5 1132.7
## - higher 1 30.59 5846.3 1132.7
## - Medu 1 32.02 5847.7 1132.8
## - age 1 34.87 5850.6 1133.0
## - famsize_GT3 1 49.11 5864.8 1134.0
## - freetime 1 49.16 5864.9 1134.0
## - absences 1 52.47 5868.2 1134.2
## + reason_reputation 1 2.04 5813.7 1134.5
## + reason_other 1 2.04 5813.7 1134.5
## + nursery 1 1.83 5813.9 1134.5
## + Mjob_services 1 1.48 5814.3 1134.5
## + Mjob_health 1 1.48 5814.3 1134.5
## - romantic 1 70.62 5886.4 1135.4
## - Mjob_other 1 75.17 5890.9 1135.7
## - studytime 1 77.80 5893.5 1135.9
## - schoolsup 1 77.93 5893.7 1135.9
## - famsup 1 83.98 5899.7 1136.3
## - sex_female 1 92.70 5908.4 1136.9
## - Mjob_teacher 1 144.55 5960.3 1140.3
## - goout 1 154.90 5970.6 1141.0
## - failures 1 467.27 6283.0 1161.1
##
## Step: AIC=1130.8
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + Fedu +
## traveltime + studytime + failures + schoolsup + famsup +
## paid + activities + higher + internet + romantic + famrel +
## freetime + goout + Dalc + Walc + health + absences + Mjob_other +
## Mjob_at_home + Mjob_teacher + Fjob_other + Fjob_services +
## Fjob_teacher + Fjob_at_home + reason_course + reason_home +
## guardian_mother + guardian_father
##
## Df Sum of Sq RSS AIC
## - Fedu 1 2.33 5820.4 1129.0
## - Fjob_at_home 1 2.52 5820.6 1129.0
## - guardian_father 1 4.78 5822.9 1129.1
## - famrel 1 6.56 5824.7 1129.2
## - traveltime 1 7.48 5825.6 1129.3
## - guardian_mother 1 7.88 5826.0 1129.3
## - Fjob_services 1 7.90 5826.0 1129.3
## - activities 1 8.71 5826.8 1129.4
## - Fjob_teacher 1 9.41 5827.5 1129.4
## - internet 1 9.64 5827.8 1129.5
## - Dalc 1 10.10 5828.2 1129.5
## - address_R 1 10.90 5829.0 1129.5
## - reason_course 1 13.21 5831.3 1129.7
## - reason_home 1 13.90 5832.0 1129.7
## - Fjob_other 1 14.74 5832.9 1129.8
## - paid 1 17.11 5835.2 1130.0
## - Walc 1 17.99 5836.1 1130.0
## - Mjob_at_home 1 21.52 5839.6 1130.2
## <none> 5818.1 1130.8
## - health 1 30.24 5848.4 1130.8
## - higher 1 30.95 5849.1 1130.9
## - Medu 1 33.78 5851.9 1131.1
## - age 1 36.39 5854.5 1131.2
## - freetime 1 48.59 5866.7 1132.1
## - famsize_GT3 1 53.86 5872.0 1132.4
## - absences 1 55.57 5873.7 1132.5
## + Pstatus_A 1 2.38 5815.7 1132.6
## + reason_reputation 1 1.99 5816.1 1132.7
## + reason_other 1 1.99 5816.1 1132.7
## + nursery 1 1.61 5816.5 1132.7
## + Mjob_services 1 1.27 5816.8 1132.7
## + Mjob_health 1 1.27 5816.8 1132.7
## - romantic 1 70.15 5888.3 1133.5
## - Mjob_other 1 74.97 5893.1 1133.8
## - schoolsup 1 77.44 5895.6 1134.0
## - studytime 1 77.99 5896.1 1134.0
## - famsup 1 84.88 5903.0 1134.5
## - sex_female 1 92.46 5910.6 1135.0
## - Mjob_teacher 1 148.98 5967.1 1138.8
## - goout 1 155.29 5973.4 1139.2
## - failures 1 466.27 6284.4 1159.2
##
## Step: AIC=1128.96
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## higher + internet + romantic + famrel + freetime + goout +
## Dalc + Walc + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_other + Fjob_services + Fjob_teacher +
## Fjob_at_home + reason_course + reason_home + guardian_mother +
## guardian_father
##
## Df Sum of Sq RSS AIC
## - Fjob_at_home 1 2.02 5822.5 1127.1
## - guardian_father 1 4.71 5825.1 1127.3
## - famrel 1 6.39 5826.8 1127.4
## - traveltime 1 6.65 5827.1 1127.4
## - Fjob_services 1 6.97 5827.4 1127.4
## - guardian_mother 1 7.14 5827.6 1127.4
## - Fjob_teacher 1 9.08 5829.5 1127.6
## - internet 1 9.15 5829.6 1127.6
## - activities 1 9.38 5829.8 1127.6
## - Dalc 1 9.56 5830.0 1127.6
## - address_R 1 11.84 5832.3 1127.8
## - Fjob_other 1 13.04 5833.5 1127.8
## - reason_course 1 13.96 5834.4 1127.9
## - reason_home 1 14.40 5834.8 1127.9
## - Walc 1 17.03 5837.5 1128.1
## - paid 1 17.49 5837.9 1128.1
## - Mjob_at_home 1 24.16 5844.6 1128.6
## <none> 5820.4 1129.0
## - higher 1 29.91 5850.4 1129.0
## - health 1 31.01 5851.4 1129.0
## - Medu 1 34.57 5855.0 1129.3
## - age 1 36.10 5856.5 1129.4
## - freetime 1 49.55 5870.0 1130.3
## - famsize_GT3 1 53.87 5874.3 1130.6
## - absences 1 56.49 5876.9 1130.8
## + Fedu 1 2.33 5818.1 1130.8
## + reason_reputation 1 2.21 5818.2 1130.8
## + reason_other 1 2.21 5818.2 1130.8
## + Pstatus_A 1 2.19 5818.3 1130.8
## + nursery 1 1.89 5818.6 1130.8
## + Mjob_services 1 1.58 5818.9 1130.8
## + Mjob_health 1 1.58 5818.9 1130.8
## - romantic 1 70.13 5890.6 1131.7
## - Mjob_other 1 76.67 5897.1 1132.1
## - schoolsup 1 78.27 5898.7 1132.2
## - studytime 1 83.36 5903.8 1132.6
## - famsup 1 88.43 5908.9 1132.9
## - sex_female 1 93.05 5913.5 1133.2
## - Mjob_teacher 1 148.59 5969.0 1136.9
## - goout 1 157.08 5977.5 1137.5
## - failures 1 467.17 6287.6 1157.4
##
## Step: AIC=1127.09
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## higher + internet + romantic + famrel + freetime + goout +
## Dalc + Walc + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_other + Fjob_services + Fjob_teacher +
## reason_course + reason_home + guardian_mother + guardian_father
##
## Df Sum of Sq RSS AIC
## - guardian_father 1 4.78 5827.2 1125.4
## - Fjob_services 1 5.18 5827.6 1125.4
## - traveltime 1 6.49 5828.9 1125.5
## - famrel 1 6.76 5829.2 1125.5
## - guardian_mother 1 7.52 5830.0 1125.6
## - internet 1 8.45 5830.9 1125.7
## - activities 1 9.75 5832.2 1125.8
## - Dalc 1 9.81 5832.3 1125.8
## - address_R 1 12.20 5834.7 1125.9
## - Fjob_other 1 13.34 5835.8 1126.0
## - reason_course 1 15.29 5837.7 1126.1
## - reason_home 1 16.03 5838.5 1126.2
## - Walc 1 17.71 5840.2 1126.3
## - paid 1 17.85 5840.3 1126.3
## - Fjob_teacher 1 21.04 5843.5 1126.5
## - Mjob_at_home 1 25.54 5848.0 1126.8
## <none> 5822.5 1127.1
## - higher 1 29.63 5852.1 1127.1
## - health 1 30.15 5852.6 1127.1
## - Medu 1 35.00 5857.5 1127.5
## - age 1 38.82 5861.3 1127.7
## - freetime 1 47.86 5870.3 1128.3
## - famsize_GT3 1 53.34 5875.8 1128.7
## + reason_other 1 2.34 5820.1 1128.9
## + reason_reputation 1 2.34 5820.1 1128.9
## + Pstatus_A 1 2.15 5820.3 1129.0
## + Mjob_services 1 2.14 5820.3 1129.0
## + Mjob_health 1 2.14 5820.3 1129.0
## + nursery 1 2.13 5820.3 1129.0
## - absences 1 57.27 5879.7 1129.0
## + Fjob_at_home 1 2.02 5820.4 1129.0
## + Fjob_health 1 2.02 5820.4 1129.0
## + Fedu 1 1.83 5820.6 1129.0
## - romantic 1 70.35 5892.8 1129.8
## - schoolsup 1 77.38 5899.8 1130.3
## - Mjob_other 1 78.27 5900.7 1130.3
## - studytime 1 85.40 5907.9 1130.8
## - famsup 1 88.21 5910.7 1131.0
## - sex_female 1 94.03 5916.5 1131.4
## - Mjob_teacher 1 147.35 5969.8 1134.9
## - goout 1 155.76 5978.2 1135.5
## - failures 1 465.95 6288.4 1155.4
##
## Step: AIC=1125.42
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## higher + internet + romantic + famrel + freetime + goout +
## Dalc + Walc + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_other + Fjob_services + Fjob_teacher +
## reason_course + reason_home + guardian_mother
##
## Df Sum of Sq RSS AIC
## - guardian_mother 1 2.79 5830.0 1123.6
## - Fjob_services 1 4.88 5832.1 1123.8
## - traveltime 1 6.24 5833.5 1123.8
## - famrel 1 7.07 5834.3 1123.9
## - internet 1 7.85 5835.1 1124.0
## - Dalc 1 9.20 5836.4 1124.0
## - activities 1 9.74 5837.0 1124.1
## - Fjob_other 1 11.93 5839.2 1124.2
## - address_R 1 13.39 5840.6 1124.3
## - reason_course 1 15.64 5842.9 1124.5
## - reason_home 1 15.70 5842.9 1124.5
## - Walc 1 15.86 5843.1 1124.5
## - paid 1 18.52 5845.8 1124.7
## - Fjob_teacher 1 20.74 5848.0 1124.8
## - Mjob_at_home 1 25.48 5852.7 1125.1
## <none> 5827.2 1125.4
## - health 1 31.49 5858.7 1125.5
## - higher 1 32.66 5859.9 1125.6
## - Medu 1 34.04 5861.3 1125.7
## - age 1 34.10 5861.3 1125.7
## - freetime 1 50.77 5878.0 1126.8
## - famsize_GT3 1 52.32 5879.6 1126.9
## + guardian_father 1 4.78 5822.5 1127.1
## + guardian_other 1 4.78 5822.5 1127.1
## + nursery 1 3.01 5824.2 1127.2
## + Pstatus_A 1 2.84 5824.4 1127.2
## + reason_reputation 1 2.31 5824.9 1127.3
## + reason_other 1 2.31 5824.9 1127.3
## + Mjob_services 1 2.14 5825.1 1127.3
## + Mjob_health 1 2.14 5825.1 1127.3
## + Fjob_at_home 1 2.09 5825.1 1127.3
## + Fjob_health 1 2.09 5825.1 1127.3
## + Fedu 1 1.75 5825.5 1127.3
## - absences 1 62.32 5889.6 1127.6
## - romantic 1 68.23 5895.5 1128.0
## - schoolsup 1 76.93 5904.2 1128.6
## - Mjob_other 1 80.77 5908.0 1128.8
## - studytime 1 86.94 5914.2 1129.2
## - famsup 1 87.52 5914.8 1129.3
## - sex_female 1 93.37 5920.6 1129.7
## - Mjob_teacher 1 144.67 5971.9 1133.1
## - goout 1 157.69 5984.9 1133.9
## - failures 1 468.96 6296.2 1153.9
##
## Step: AIC=1123.6
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## higher + internet + romantic + famrel + freetime + goout +
## Dalc + Walc + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_other + Fjob_services + Fjob_teacher +
## reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - Fjob_services 1 4.89 5834.9 1121.9
## - traveltime 1 5.44 5835.5 1122.0
## - famrel 1 7.08 5837.1 1122.1
## - internet 1 8.24 5838.3 1122.2
## - Dalc 1 9.07 5839.1 1122.2
## - activities 1 9.74 5839.8 1122.3
## - Fjob_other 1 13.49 5843.5 1122.5
## - address_R 1 15.44 5845.5 1122.7
## - reason_home 1 15.47 5845.5 1122.7
## - Walc 1 16.26 5846.3 1122.7
## - reason_course 1 16.36 5846.4 1122.7
## - paid 1 17.77 5847.8 1122.8
## - Fjob_teacher 1 20.59 5850.6 1123.0
## - Mjob_at_home 1 25.72 5855.7 1123.3
## <none> 5830.0 1123.6
## - health 1 31.01 5861.0 1123.7
## - age 1 31.91 5861.9 1123.8
## - Medu 1 33.66 5863.7 1123.9
## - higher 1 34.54 5864.6 1123.9
## - famsize_GT3 1 51.60 5881.6 1125.1
## + guardian_other 1 7.23 5822.8 1125.1
## - freetime 1 52.34 5882.4 1125.1
## + nursery 1 3.51 5826.5 1125.4
## + guardian_mother 1 2.79 5827.2 1125.4
## + Pstatus_A 1 2.51 5827.5 1125.4
## + Mjob_services 1 2.50 5827.5 1125.4
## + Mjob_health 1 2.50 5827.5 1125.4
## + Fjob_at_home 1 2.43 5827.6 1125.4
## + Fjob_health 1 2.43 5827.6 1125.4
## + reason_other 1 2.11 5827.9 1125.5
## + reason_reputation 1 2.11 5827.9 1125.5
## + Fedu 1 1.05 5829.0 1125.5
## + guardian_father 1 0.05 5830.0 1125.6
## - absences 1 61.15 5891.2 1125.7
## - romantic 1 68.58 5898.6 1126.2
## - schoolsup 1 76.57 5906.6 1126.7
## - Mjob_other 1 78.92 5908.9 1126.9
## - famsup 1 85.91 5915.9 1127.4
## - studytime 1 88.46 5918.5 1127.5
## - sex_female 1 94.98 5925.0 1128.0
## - Mjob_teacher 1 151.59 5981.6 1131.7
## - goout 1 165.96 5996.0 1132.7
## - failures 1 466.25 6296.3 1151.9
##
## Step: AIC=1121.93
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## higher + internet + romantic + famrel + freetime + goout +
## Dalc + Walc + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_other + Fjob_teacher + reason_course +
## reason_home
##
## Df Sum of Sq RSS AIC
## - famrel 1 5.99 5840.9 1120.3
## - traveltime 1 6.27 5841.2 1120.4
## - internet 1 7.74 5842.6 1120.5
## - Dalc 1 8.96 5843.9 1120.5
## - Fjob_other 1 9.79 5844.7 1120.6
## - activities 1 10.61 5845.5 1120.7
## - Walc 1 14.27 5849.2 1120.9
## - address_R 1 15.49 5850.4 1121.0
## - reason_home 1 16.10 5851.0 1121.0
## - reason_course 1 17.69 5852.6 1121.1
## - paid 1 17.88 5852.8 1121.1
## - Mjob_at_home 1 23.54 5858.5 1121.5
## - health 1 28.87 5863.8 1121.9
## <none> 5834.9 1121.9
## - age 1 31.51 5866.4 1122.1
## - Medu 1 34.88 5869.8 1122.3
## - higher 1 35.97 5870.9 1122.4
## - Fjob_teacher 1 46.49 5881.4 1123.1
## + guardian_other 1 6.90 5828.0 1123.5
## + Fjob_health 1 6.80 5828.1 1123.5
## + Fjob_services 1 4.89 5830.0 1123.6
## - freetime 1 55.28 5890.2 1123.7
## - famsize_GT3 1 55.39 5890.3 1123.7
## + nursery 1 3.09 5831.8 1123.7
## + Pstatus_A 1 2.99 5831.9 1123.7
## + Mjob_services 1 2.94 5832.0 1123.7
## + Mjob_health 1 2.94 5832.0 1123.7
## + guardian_mother 1 2.79 5832.1 1123.8
## + reason_reputation 1 1.93 5833.0 1123.8
## + reason_other 1 1.93 5833.0 1123.8
## + Fedu 1 0.78 5834.1 1123.9
## + Fjob_at_home 1 0.11 5834.8 1123.9
## + guardian_father 1 0.07 5834.8 1123.9
## - absences 1 60.58 5895.5 1124.0
## - romantic 1 67.90 5902.8 1124.5
## - schoolsup 1 74.30 5909.2 1124.9
## - Mjob_other 1 77.51 5912.4 1125.1
## - famsup 1 83.67 5918.6 1125.5
## - studytime 1 90.03 5924.9 1126.0
## - sex_female 1 94.83 5929.7 1126.3
## - Mjob_teacher 1 151.26 5986.2 1130.0
## - goout 1 164.29 5999.2 1130.9
## - failures 1 466.14 6301.1 1150.2
##
## Step: AIC=1120.34
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + traveltime +
## studytime + failures + schoolsup + famsup + paid + activities +
## higher + internet + romantic + freetime + goout + Dalc +
## Walc + health + absences + Mjob_other + Mjob_at_home + Mjob_teacher +
## Fjob_other + Fjob_teacher + reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - traveltime 1 6.04 5846.9 1118.8
## - internet 1 8.35 5849.3 1118.9
## - Dalc 1 9.60 5850.5 1119.0
## - Fjob_other 1 10.35 5851.3 1119.0
## - activities 1 10.50 5851.4 1119.0
## - Walc 1 12.23 5853.1 1119.2
## - address_R 1 15.42 5856.3 1119.4
## - reason_home 1 16.14 5857.0 1119.4
## - reason_course 1 17.64 5858.5 1119.5
## - paid 1 18.65 5859.6 1119.6
## - Mjob_at_home 1 23.24 5864.1 1119.9
## - health 1 26.27 5867.2 1120.1
## - age 1 29.07 5870.0 1120.3
## <none> 5840.9 1120.3
## - Medu 1 35.43 5876.3 1120.7
## - higher 1 36.61 5877.5 1120.8
## - Fjob_teacher 1 44.11 5885.0 1121.3
## + guardian_other 1 7.26 5833.6 1121.8
## + Fjob_health 1 6.45 5834.5 1121.9
## + famrel 1 5.99 5834.9 1121.9
## - famsize_GT3 1 55.15 5896.1 1122.0
## + Fjob_services 1 3.80 5837.1 1122.1
## + nursery 1 3.22 5837.7 1122.1
## + Pstatus_A 1 3.00 5837.9 1122.1
## + guardian_mother 1 2.80 5838.1 1122.2
## + Mjob_services 1 2.22 5838.7 1122.2
## + Mjob_health 1 2.22 5838.7 1122.2
## + reason_reputation 1 1.97 5838.9 1122.2
## + reason_other 1 1.97 5838.9 1122.2
## + Fedu 1 0.69 5840.2 1122.3
## + guardian_father 1 0.05 5840.9 1122.3
## + Fjob_at_home 1 0.00 5840.9 1122.3
## - absences 1 60.16 5901.1 1122.4
## - freetime 1 61.82 5902.7 1122.5
## - romantic 1 71.07 5912.0 1123.1
## - schoolsup 1 73.07 5914.0 1123.2
## - Mjob_other 1 76.73 5917.6 1123.5
## - famsup 1 85.41 5926.3 1124.1
## - studytime 1 91.30 5932.2 1124.5
## - sex_female 1 98.89 5939.8 1125.0
## - Mjob_teacher 1 153.89 5994.8 1128.6
## - goout 1 160.29 6001.2 1129.0
## - failures 1 473.79 6314.7 1149.1
##
## Step: AIC=1118.75
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + activities + higher +
## internet + romantic + freetime + goout + Dalc + Walc + health +
## absences + Mjob_other + Mjob_at_home + Mjob_teacher + Fjob_other +
## Fjob_teacher + reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - internet 1 8.45 5855.4 1117.3
## - activities 1 10.84 5857.8 1117.5
## - Dalc 1 10.94 5857.9 1117.5
## - Fjob_other 1 11.91 5858.8 1117.5
## - Walc 1 12.08 5859.0 1117.6
## - reason_home 1 15.96 5862.9 1117.8
## - reason_course 1 18.93 5865.9 1118.0
## - paid 1 19.13 5866.1 1118.0
## - address_R 1 23.14 5870.1 1118.3
## - Mjob_at_home 1 24.55 5871.5 1118.4
## - health 1 25.45 5872.4 1118.5
## - age 1 28.66 5875.6 1118.7
## <none> 5846.9 1118.8
## - higher 1 37.34 5884.3 1119.2
## - Medu 1 38.34 5885.3 1119.3
## - Fjob_teacher 1 42.01 5889.0 1119.6
## - famsize_GT3 1 52.67 5899.6 1120.3
## + Fjob_health 1 6.76 5840.2 1120.3
## + guardian_other 1 6.34 5840.6 1120.3
## + traveltime 1 6.04 5840.9 1120.3
## + famrel 1 5.75 5841.2 1120.4
## + Fjob_services 1 4.54 5842.4 1120.4
## + nursery 1 3.46 5843.5 1120.5
## + Pstatus_A 1 3.15 5843.8 1120.5
## + Mjob_services 1 2.42 5844.5 1120.6
## + Mjob_health 1 2.42 5844.5 1120.6
## + reason_reputation 1 2.08 5844.9 1120.6
## + reason_other 1 2.08 5844.9 1120.6
## + guardian_mother 1 1.96 5845.0 1120.6
## + Fedu 1 0.35 5846.6 1120.7
## + Fjob_at_home 1 0.06 5846.9 1120.7
## + guardian_father 1 0.00 5846.9 1120.8
## - absences 1 61.21 5908.2 1120.8
## - freetime 1 64.43 5911.4 1121.1
## - schoolsup 1 72.77 5919.7 1121.6
## - romantic 1 72.78 5919.7 1121.6
## - Mjob_other 1 77.43 5924.4 1121.9
## - famsup 1 88.71 5935.6 1122.7
## - studytime 1 94.65 5941.6 1123.1
## - sex_female 1 98.07 5945.0 1123.3
## - Mjob_teacher 1 154.85 6001.8 1127.0
## - goout 1 162.05 6009.0 1127.5
## - failures 1 476.33 6323.3 1147.6
##
## Step: AIC=1117.31
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + activities + higher +
## romantic + freetime + goout + Dalc + Walc + health + absences +
## Mjob_other + Mjob_at_home + Mjob_teacher + Fjob_other + Fjob_teacher +
## reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - Dalc 1 10.35 5865.7 1116.0
## - activities 1 10.53 5865.9 1116.0
## - Walc 1 11.81 5867.2 1116.1
## - Fjob_other 1 12.72 5868.1 1116.2
## - reason_home 1 15.70 5871.1 1116.4
## - reason_course 1 18.45 5873.8 1116.5
## - paid 1 21.57 5877.0 1116.8
## - address_R 1 28.55 5883.9 1117.2
## - health 1 29.06 5884.5 1117.3
## <none> 5855.4 1117.3
## - Mjob_at_home 1 30.97 5886.4 1117.4
## - age 1 32.63 5888.0 1117.5
## - higher 1 35.73 5891.1 1117.7
## - Medu 1 38.22 5893.6 1117.9
## - Fjob_teacher 1 39.44 5894.8 1118.0
## - famsize_GT3 1 51.13 5906.5 1118.7
## + internet 1 8.45 5846.9 1118.8
## + famrel 1 6.36 5849.0 1118.9
## + traveltime 1 6.13 5849.3 1118.9
## + guardian_other 1 5.97 5849.4 1118.9
## + Fjob_health 1 5.31 5850.1 1119.0
## + nursery 1 4.15 5851.2 1119.0
## + Fjob_services 1 3.98 5851.4 1119.0
## + Mjob_services 1 2.63 5852.8 1119.1
## + Mjob_health 1 2.63 5852.8 1119.1
## + guardian_mother 1 2.29 5853.1 1119.2
## + Pstatus_A 1 2.14 5853.3 1119.2
## + reason_reputation 1 1.90 5853.5 1119.2
## + reason_other 1 1.90 5853.5 1119.2
## + Fedu 1 0.20 5855.2 1119.3
## + Fjob_at_home 1 0.12 5855.3 1119.3
## + guardian_father 1 0.04 5855.4 1119.3
## - freetime 1 65.19 5920.6 1119.7
## - absences 1 66.92 5922.3 1119.8
## - romantic 1 68.49 5923.9 1119.9
## - schoolsup 1 73.70 5929.1 1120.2
## - Mjob_other 1 81.80 5937.2 1120.8
## - famsup 1 86.54 5941.9 1121.1
## - studytime 1 98.70 5954.1 1121.9
## - sex_female 1 101.46 5956.9 1122.1
## - Mjob_teacher 1 152.52 6007.9 1125.5
## - goout 1 158.70 6014.1 1125.8
## - failures 1 479.01 6334.4 1146.3
##
## Step: AIC=1116.01
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + activities + higher +
## romantic + freetime + goout + Walc + health + absences +
## Mjob_other + Mjob_at_home + Mjob_teacher + Fjob_other + Fjob_teacher +
## reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - Walc 1 3.85 5869.6 1114.3
## - activities 1 8.91 5874.7 1114.6
## - Fjob_other 1 9.46 5875.2 1114.7
## - reason_home 1 16.01 5881.8 1115.1
## - reason_course 1 18.25 5884.0 1115.2
## - paid 1 19.83 5885.6 1115.3
## - health 1 29.82 5895.6 1116.0
## <none> 5865.7 1116.0
## - address_R 1 30.55 5896.3 1116.1
## - Mjob_at_home 1 32.14 5897.9 1116.2
## - higher 1 34.29 5900.0 1116.3
## - Medu 1 35.42 5901.2 1116.4
## - age 1 36.27 5902.0 1116.4
## - Fjob_teacher 1 39.02 5904.8 1116.6
## - famsize_GT3 1 48.45 5914.2 1117.2
## + Dalc 1 10.35 5855.4 1117.3
## + internet 1 7.87 5857.9 1117.5
## + traveltime 1 7.44 5858.3 1117.5
## + famrel 1 7.00 5858.7 1117.5
## + Fjob_health 1 5.58 5860.2 1117.6
## + guardian_other 1 5.18 5860.6 1117.7
## + Mjob_services 1 3.95 5861.8 1117.7
## + Mjob_health 1 3.95 5861.8 1117.7
## + Fjob_services 1 3.92 5861.8 1117.8
## + nursery 1 3.43 5862.3 1117.8
## + guardian_mother 1 2.06 5863.7 1117.9
## + Pstatus_A 1 1.89 5863.9 1117.9
## - freetime 1 58.93 5924.7 1118.0
## + reason_reputation 1 0.89 5864.9 1118.0
## + reason_other 1 0.89 5864.9 1118.0
## + Fjob_at_home 1 0.08 5865.7 1118.0
## + Fedu 1 0.06 5865.7 1118.0
## + guardian_father 1 0.05 5865.7 1118.0
## - absences 1 66.03 5931.8 1118.4
## - romantic 1 68.87 5934.6 1118.6
## - schoolsup 1 78.68 5944.4 1119.3
## - famsup 1 88.47 5954.2 1119.9
## - Mjob_other 1 89.01 5954.8 1119.9
## - sex_female 1 95.41 5961.2 1120.4
## - studytime 1 99.49 5965.2 1120.6
## - Mjob_teacher 1 152.40 6018.1 1124.1
## - goout 1 155.79 6021.5 1124.3
## - failures 1 487.36 6353.1 1145.5
##
## Step: AIC=1114.27
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + activities + higher +
## romantic + freetime + goout + health + absences + Mjob_other +
## Mjob_at_home + Mjob_teacher + Fjob_other + Fjob_teacher +
## reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - Fjob_other 1 9.70 5879.3 1112.9
## - activities 1 9.85 5879.5 1112.9
## - reason_home 1 16.62 5886.2 1113.4
## - reason_course 1 18.86 5888.5 1113.5
## - paid 1 23.05 5892.6 1113.8
## - health 1 28.20 5897.8 1114.2
## - address_R 1 28.26 5897.9 1114.2
## <none> 5869.6 1114.3
## - Mjob_at_home 1 31.19 5900.8 1114.4
## - higher 1 34.32 5903.9 1114.6
## - Medu 1 34.62 5904.2 1114.6
## - age 1 36.27 5905.9 1114.7
## - Fjob_teacher 1 37.20 5906.8 1114.8
## - famsize_GT3 1 50.58 5920.2 1115.7
## + internet 1 7.92 5861.7 1115.7
## + traveltime 1 6.72 5862.9 1115.8
## + famrel 1 5.08 5864.5 1115.9
## + Fjob_health 1 5.05 5864.5 1115.9
## + guardian_other 1 4.48 5865.1 1116.0
## + nursery 1 4.22 5865.4 1116.0
## + Mjob_services 1 4.11 5865.5 1116.0
## + Mjob_health 1 4.11 5865.5 1116.0
## + Walc 1 3.85 5865.7 1116.0
## + Fjob_services 1 2.86 5866.7 1116.1
## + Dalc 1 2.40 5867.2 1116.1
## + guardian_mother 1 2.36 5867.2 1116.1
## + Pstatus_A 1 1.84 5867.8 1116.2
## - freetime 1 58.20 5927.8 1116.2
## + reason_reputation 1 1.16 5868.4 1116.2
## + reason_other 1 1.16 5868.4 1116.2
## + guardian_father 1 0.17 5869.4 1116.3
## + Fedu 1 0.02 5869.6 1116.3
## + Fjob_at_home 1 0.00 5869.6 1116.3
## - romantic 1 69.17 5938.8 1116.9
## - absences 1 71.89 5941.5 1117.1
## - schoolsup 1 79.50 5949.1 1117.6
## - Mjob_other 1 89.50 5959.1 1118.2
## - famsup 1 90.59 5960.2 1118.3
## - studytime 1 95.68 5965.3 1118.6
## - sex_female 1 109.74 5979.3 1119.6
## - Mjob_teacher 1 151.73 6021.3 1122.3
## - goout 1 164.72 6034.3 1123.2
## - failures 1 484.44 6354.0 1143.5
##
## Step: AIC=1112.92
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + activities + higher +
## romantic + freetime + goout + health + absences + Mjob_other +
## Mjob_at_home + Mjob_teacher + Fjob_teacher + reason_course +
## reason_home
##
## Df Sum of Sq RSS AIC
## - activities 1 9.10 5888.4 1111.5
## - reason_home 1 17.10 5896.4 1112.1
## - reason_course 1 19.98 5899.3 1112.3
## - paid 1 22.20 5901.5 1112.4
## - health 1 28.78 5908.1 1112.8
## <none> 5879.3 1112.9
## - address_R 1 30.15 5909.4 1112.9
## - Medu 1 32.04 5911.3 1113.1
## - higher 1 32.48 5911.8 1113.1
## - Mjob_at_home 1 35.75 5915.0 1113.3
## - age 1 35.86 5915.2 1113.3
## - famsize_GT3 1 49.30 5928.6 1114.2
## + Fjob_other 1 9.70 5869.6 1114.3
## + Fjob_health 1 8.76 5870.5 1114.3
## + internet 1 8.71 5870.6 1114.3
## + traveltime 1 7.96 5871.3 1114.4
## - Fjob_teacher 1 53.85 5933.1 1114.5
## + famrel 1 5.44 5873.9 1114.6
## + guardian_mother 1 4.33 5875.0 1114.6
## + Walc 1 4.10 5875.2 1114.7
## + guardian_other 1 4.09 5875.2 1114.7
## + Mjob_services 1 3.98 5875.3 1114.7
## + Mjob_health 1 3.98 5875.3 1114.7
## - freetime 1 56.27 5935.6 1114.7
## + nursery 1 3.57 5875.7 1114.7
## + Fjob_services 1 1.96 5877.3 1114.8
## + reason_reputation 1 1.77 5877.5 1114.8
## + reason_other 1 1.77 5877.5 1114.8
## + Pstatus_A 1 1.71 5877.6 1114.8
## + guardian_father 1 1.17 5878.1 1114.8
## + Dalc 1 1.15 5878.2 1114.8
## + Fjob_at_home 1 0.95 5878.3 1114.9
## + Fedu 1 0.28 5879.0 1114.9
## - romantic 1 66.30 5945.6 1115.3
## - absences 1 71.23 5950.5 1115.7
## - schoolsup 1 77.08 5956.4 1116.0
## - famsup 1 92.64 5971.9 1117.1
## - studytime 1 99.49 5978.8 1117.5
## - sex_female 1 107.49 5986.8 1118.1
## - Mjob_other 1 114.51 5993.8 1118.5
## - Mjob_teacher 1 148.42 6027.7 1120.7
## - goout 1 167.42 6046.7 1122.0
## - failures 1 478.65 6358.0 1141.8
##
## Step: AIC=1111.53
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + higher + romantic +
## freetime + goout + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_teacher + reason_course + reason_home
##
## Df Sum of Sq RSS AIC
## - reason_home 1 15.32 5903.7 1110.5
## - reason_course 1 17.33 5905.7 1110.7
## - paid 1 24.46 5912.9 1111.2
## - health 1 29.39 5917.8 1111.5
## - higher 1 29.88 5918.3 1111.5
## <none> 5888.4 1111.5
## - Medu 1 31.47 5919.9 1111.6
## - address_R 1 32.61 5921.0 1111.7
## - age 1 33.33 5921.7 1111.8
## - Mjob_at_home 1 35.37 5923.8 1111.9
## - famsize_GT3 1 50.09 5938.5 1112.9
## + Fjob_health 1 9.71 5878.7 1112.9
## + activities 1 9.10 5879.3 1112.9
## + Fjob_other 1 8.95 5879.5 1112.9
## + internet 1 8.43 5880.0 1113.0
## + traveltime 1 8.06 5880.3 1113.0
## - freetime 1 53.81 5942.2 1113.1
## + famrel 1 5.08 5883.3 1113.2
## + Walc 1 5.02 5883.4 1113.2
## + Mjob_services 1 4.29 5884.1 1113.2
## + Mjob_health 1 4.29 5884.1 1113.2
## + guardian_mother 1 4.28 5884.1 1113.2
## + guardian_other 1 4.04 5884.4 1113.3
## - Fjob_teacher 1 56.86 5945.3 1113.3
## + nursery 1 3.17 5885.2 1113.3
## + Pstatus_A 1 2.58 5885.8 1113.4
## + reason_reputation 1 2.32 5886.1 1113.4
## + reason_other 1 2.32 5886.1 1113.4
## + Fjob_services 1 1.46 5886.9 1113.4
## + guardian_father 1 1.15 5887.2 1113.5
## + Fjob_at_home 1 0.91 5887.5 1113.5
## + Dalc 1 0.56 5887.8 1113.5
## + Fedu 1 0.11 5888.3 1113.5
## - romantic 1 69.49 5957.9 1114.2
## - absences 1 71.07 5959.5 1114.3
## - schoolsup 1 80.63 5969.0 1114.9
## - famsup 1 91.25 5979.7 1115.6
## - studytime 1 94.52 5982.9 1115.8
## - sex_female 1 101.92 5990.3 1116.3
## - Mjob_other 1 110.39 5998.8 1116.8
## - Mjob_teacher 1 153.87 6042.3 1119.7
## - goout 1 172.60 6061.0 1120.9
## - failures 1 475.49 6363.9 1140.1
##
## Step: AIC=1110.55
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + higher + romantic +
## freetime + goout + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_teacher + reason_course
##
## Df Sum of Sq RSS AIC
## - reason_course 1 7.00 5910.7 1109.0
## - paid 1 23.76 5927.5 1110.1
## - address_R 1 25.86 5929.6 1110.3
## - higher 1 26.25 5930.0 1110.3
## <none> 5903.7 1110.5
## - health 1 32.73 5936.4 1110.7
## - Medu 1 34.07 5937.8 1110.8
## - age 1 34.55 5938.3 1110.8
## - Mjob_at_home 1 38.40 5942.1 1111.1
## + reason_home 1 15.32 5888.4 1111.5
## + Fjob_health 1 12.51 5891.2 1111.7
## + Fjob_other 1 9.46 5894.3 1111.9
## - famsize_GT3 1 51.26 5955.0 1112.0
## + internet 1 8.21 5895.5 1112.0
## + reason_other 1 8.10 5895.6 1112.0
## + traveltime 1 7.85 5895.9 1112.0
## + activities 1 7.32 5896.4 1112.1
## + Walc 1 5.59 5898.1 1112.2
## + famrel 1 5.04 5898.7 1112.2
## + Mjob_services 1 4.86 5898.9 1112.2
## + Mjob_health 1 4.86 5898.9 1112.2
## + guardian_mother 1 4.09 5899.6 1112.3
## + reason_reputation 1 3.73 5900.0 1112.3
## + guardian_other 1 3.60 5900.1 1112.3
## + nursery 1 3.30 5900.4 1112.3
## - Fjob_teacher 1 57.70 5961.4 1112.4
## + Pstatus_A 1 2.23 5901.5 1112.4
## + Fjob_services 1 1.47 5902.2 1112.5
## + guardian_father 1 1.19 5902.5 1112.5
## - freetime 1 59.27 5963.0 1112.5
## + Dalc 1 0.53 5903.2 1112.5
## + Fjob_at_home 1 0.49 5903.2 1112.5
## + Fedu 1 0.07 5903.6 1112.5
## - absences 1 68.55 5972.3 1113.1
## - romantic 1 70.22 5973.9 1113.2
## - schoolsup 1 81.09 5984.8 1113.9
## - famsup 1 91.44 5995.2 1114.6
## - sex_female 1 95.76 5999.5 1114.9
## - studytime 1 99.49 6003.2 1115.1
## - Mjob_other 1 119.86 6023.6 1116.5
## - Mjob_teacher 1 164.66 6068.4 1119.4
## - goout 1 175.13 6078.8 1120.1
## - failures 1 491.27 6395.0 1140.0
##
## Step: AIC=1109.02
## G3 ~ sex_female + age + address_R + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + higher + romantic +
## freetime + goout + health + absences + Mjob_other + Mjob_at_home +
## Mjob_teacher + Fjob_teacher
##
## Df Sum of Sq RSS AIC
## - address_R 1 26.63 5937.4 1108.8
## - higher 1 26.83 5937.5 1108.8
## - paid 1 28.54 5939.3 1108.9
## <none> 5910.7 1109.0
## - age 1 34.65 5945.4 1109.3
## - Medu 1 34.82 5945.5 1109.3
## - health 1 37.52 5948.2 1109.5
## - Mjob_at_home 1 42.95 5953.7 1109.9
## + Fjob_health 1 14.09 5896.6 1110.1
## + reason_other 1 11.87 5898.9 1110.2
## + Fjob_other 1 10.09 5900.6 1110.3
## - famsize_GT3 1 50.56 5961.3 1110.4
## + traveltime 1 8.88 5901.8 1110.4
## + reason_reputation 1 8.33 5902.4 1110.5
## + internet 1 8.02 5902.7 1110.5
## + reason_course 1 7.00 5903.7 1110.5
## + activities 1 6.12 5904.6 1110.6
## + Mjob_services 1 5.97 5904.8 1110.6
## + Mjob_health 1 5.97 5904.8 1110.6
## + Walc 1 5.77 5905.0 1110.6
## + reason_home 1 4.99 5905.7 1110.7
## + famrel 1 4.96 5905.8 1110.7
## + guardian_mother 1 4.82 5905.9 1110.7
## + guardian_other 1 4.12 5906.6 1110.8
## + nursery 1 3.03 5907.7 1110.8
## - freetime 1 57.33 5968.1 1110.8
## + Pstatus_A 1 2.07 5908.7 1110.9
## - Fjob_teacher 1 58.36 5969.1 1110.9
## + guardian_father 1 1.44 5909.3 1110.9
## + Fjob_services 1 1.44 5909.3 1110.9
## + Fjob_at_home 1 0.53 5910.2 1111.0
## + Dalc 1 0.46 5910.3 1111.0
## + Fedu 1 0.06 5910.7 1111.0
## - romantic 1 68.84 5979.6 1111.6
## - absences 1 77.06 5987.8 1112.1
## - schoolsup 1 81.30 5992.0 1112.4
## - famsup 1 91.53 6002.3 1113.1
## - sex_female 1 97.68 6008.4 1113.5
## - studytime 1 102.88 6013.6 1113.8
## - Mjob_other 1 120.38 6031.1 1115.0
## - Mjob_teacher 1 174.27 6085.0 1118.5
## - goout 1 180.28 6091.0 1118.9
## - failures 1 489.37 6400.1 1138.4
##
## Step: AIC=1108.79
## G3 ~ sex_female + age + famsize_GT3 + Medu + studytime + failures +
## schoolsup + famsup + paid + higher + romantic + freetime +
## goout + health + absences + Mjob_other + Mjob_at_home + Mjob_teacher +
## Fjob_teacher
##
## Df Sum of Sq RSS AIC
## - higher 1 25.42 5962.8 1108.5
## - paid 1 29.69 5967.0 1108.8
## <none> 5937.4 1108.8
## + address_R 1 26.63 5910.7 1109.0
## - Medu 1 37.82 5975.2 1109.3
## + traveltime 1 19.63 5917.7 1109.5
## - health 1 40.78 5978.1 1109.5
## - age 1 43.34 5980.7 1109.7
## + Fjob_health 1 15.56 5921.8 1109.8
## + internet 1 13.28 5924.1 1109.9
## + Fjob_other 1 11.72 5925.6 1110.0
## + reason_other 1 10.02 5927.3 1110.1
## + activities 1 8.31 5929.0 1110.2
## + guardian_mother 1 8.03 5929.3 1110.3
## + reason_course 1 7.78 5929.6 1110.3
## - Mjob_at_home 1 52.73 5990.1 1110.3
## + guardian_other 1 6.98 5930.4 1110.3
## + Mjob_services 1 6.62 5930.7 1110.3
## + Mjob_health 1 6.62 5930.7 1110.3
## + famrel 1 5.50 5931.9 1110.4
## + reason_reputation 1 5.17 5932.2 1110.5
## - famsize_GT3 1 56.04 5993.4 1110.5
## - Fjob_teacher 1 57.23 5994.6 1110.6
## + Walc 1 2.89 5934.5 1110.6
## + nursery 1 2.76 5934.6 1110.6
## + guardian_father 1 2.40 5935.0 1110.6
## + Pstatus_A 1 2.29 5935.1 1110.6
## - freetime 1 58.73 5996.1 1110.7
## + reason_home 1 1.70 5935.7 1110.7
## + Fjob_services 1 1.63 5935.7 1110.7
## + Dalc 1 1.47 5935.9 1110.7
## + Fjob_at_home 1 0.84 5936.5 1110.7
## + Fedu 1 0.03 5937.3 1110.8
## - romantic 1 66.56 6003.9 1111.2
## - absences 1 74.32 6011.7 1111.7
## - schoolsup 1 81.43 6018.8 1112.2
## - sex_female 1 91.13 6028.5 1112.8
## - famsup 1 93.03 6030.4 1112.9
## - studytime 1 98.49 6035.9 1113.3
## - Mjob_other 1 128.02 6065.4 1115.2
## - goout 1 171.06 6108.4 1118.0
## - Mjob_teacher 1 180.39 6117.7 1118.6
## - failures 1 500.46 6437.8 1138.7
##
## Step: AIC=1108.47
## G3 ~ sex_female + age + famsize_GT3 + Medu + studytime + failures +
## schoolsup + famsup + paid + romantic + freetime + goout +
## health + absences + Mjob_other + Mjob_at_home + Mjob_teacher +
## Fjob_teacher
##
## Df Sum of Sq RSS AIC
## <none> 5962.8 1108.5
## + higher 1 25.42 5937.4 1108.8
## + address_R 1 25.23 5937.5 1108.8
## - paid 1 36.04 5998.8 1108.8
## - health 1 39.34 6002.1 1109.1
## + traveltime 1 20.02 5942.8 1109.2
## - Medu 1 41.32 6004.1 1109.2
## + Fjob_health 1 15.50 5947.3 1109.5
## + internet 1 11.41 5951.4 1109.7
## + guardian_other 1 10.92 5951.9 1109.8
## + guardian_mother 1 10.08 5952.7 1109.8
## + Fjob_other 1 9.93 5952.8 1109.8
## - age 1 52.33 6015.1 1109.9
## + reason_course 1 8.34 5954.4 1109.9
## + Mjob_services 1 7.03 5955.7 1110.0
## + Mjob_health 1 7.03 5955.7 1110.0
## + reason_other 1 6.27 5956.5 1110.1
## + activities 1 6.10 5956.7 1110.1
## + famrel 1 5.90 5956.9 1110.1
## + reason_reputation 1 5.54 5957.2 1110.1
## - Fjob_teacher 1 57.06 6019.8 1110.2
## + nursery 1 2.99 5959.8 1110.3
## - famsize_GT3 1 57.85 6020.6 1110.3
## + Pstatus_A 1 2.82 5960.0 1110.3
## + Walc 1 2.80 5960.0 1110.3
## - freetime 1 58.36 6021.1 1110.3
## + guardian_father 1 2.32 5960.5 1110.3
## - Mjob_at_home 1 59.56 6022.3 1110.4
## + Dalc 1 1.25 5961.5 1110.4
## + Fjob_at_home 1 1.12 5961.7 1110.4
## + Fjob_services 1 0.85 5961.9 1110.4
## + reason_home 1 0.75 5962.0 1110.4
## + Fedu 1 0.32 5962.5 1110.5
## - absences 1 71.94 6034.7 1111.2
## - romantic 1 74.59 6037.4 1111.4
## - sex_female 1 80.11 6042.9 1111.7
## - schoolsup 1 81.23 6044.0 1111.8
## - famsup 1 93.22 6056.0 1112.6
## - studytime 1 107.94 6070.7 1113.5
## - Mjob_other 1 127.29 6090.1 1114.8
## - goout 1 169.77 6132.5 1117.5
## - Mjob_teacher 1 182.46 6145.2 1118.3
## - failures 1 568.93 6531.7 1142.4
After 21 iterations, the stepwise regression stops at an AIC of 1108.47, the lowest value it found. In the last table, <none> sits at the top, which means that no single addition or removal of a variable lowers the AIC any further. This reduces the predictor variables to 18. The variables removed along the way were mostly those with high p-values.
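The AIC values in the trace can be checked by hand. For linear models, step() reports n * log(RSS / n) + 2 * k, where k is the number of estimated coefficients. The sketch below assumes n = 394 (the 375 residual degrees of freedom plus the 19 coefficients, including the intercept, of the final model); the helper name step_aic is ours, for illustration only.

```r
# AIC as reported by step() for lm fits: n * log(RSS / n) + 2 * k
# Assumption: n = 394 (375 residual df + 19 coefficients incl. intercept)
step_aic <- function(rss, n, k) n * log(rss / n) + 2 * k

n <- 394
aic_before <- step_aic(rss = 5937.4, n = n, k = 20)  # model still containing `higher`
aic_after  <- step_aic(rss = 5962.8, n = n, k = 19)  # final model, `higher` removed

round(c(before = aic_before, after = aic_after), 1)
```

Both numbers should agree with the printed trace (1108.79 and 1108.47) up to the rounding of the RSS values. Removing higher raises the RSS slightly, but the penalty saved by dropping one coefficient outweighs it.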
Final Reduced Linear Model
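We then fit a reduced linear model using these 18 predictor variables. Before fitting, we standardize the variables (center to mean 0, scale to standard deviation 1) to remove the units, so that the coefficients can be compared directly with each other. Note that scale() is applied to the whole data frame, so the response G3 is standardized as well.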
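As a reminder of what scale() does to each column, here is a minimal sketch on a made-up vector (the values of x are hypothetical and not taken from the student data):

```r
# Hypothetical toy values, not taken from the student data set
x <- c(2, 4, 4, 6, 9)

z_manual <- (x - mean(x)) / sd(x)   # center, then divide by the sample sd
z_scale  <- as.numeric(scale(x, center = TRUE, scale = TRUE))

all.equal(z_manual, z_scale)        # the two versions should coincide
c(mean = mean(z_scale), sd = sd(z_scale))
```

After standardizing, every column has mean 0 and standard deviation 1, which is why the intercept of the model fitted on scaled data is essentially zero.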
options(scipen=100)
df_scale = data.frame(scale(df3, center = TRUE, scale = TRUE))
fit_scale <- lm(G3 ~ sex_female + age + famsize_GT3 + Medu + studytime + failures +
                  schoolsup + famsup + paid + romantic + freetime + goout +
                  health + absences + Mjob_other + Mjob_at_home + Mjob_teacher +
                  Fjob_teacher, data = df_scale)
summary(fit_scale)
##
## Call:
## lm(formula = G3 ~ sex_female + age + famsize_GT3 + Medu + studytime +
## failures + schoolsup + famsup + paid + romantic + freetime +
## goout + health + absences + Mjob_other + Mjob_at_home + Mjob_teacher +
## Fjob_teacher, data = df_scale)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.46201 -0.39955 0.07165 0.60959 2.05454
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) -0.0000000000000004092 0.0440839847349563108 0.000
## sex_female -0.1120971573628584544 0.0499405898054334635 -2.245
## age -0.0901526734395672880 0.0496938287436624743 -1.814
## famsize_GT3 -0.0866169143316576351 0.0454121139180680769 -1.907
## Medu 0.0983911235671245527 0.0610329835270018370 1.612
## studytime 0.1253650227881584756 0.0481168246599422908 2.605
## failures -0.2925424299202841749 0.0489066420764457180 -5.982
## schoolsup -0.1062911534787372436 0.0470282904539210608 -2.260
## famsup -0.1174778471721356771 0.0485180713392184271 -2.421
## paid 0.0729387111137137778 0.0484456634719963031 1.506
## romantic -0.0999060708828780342 0.0461273005402431727 -2.166
## freetime 0.0922749803178389699 0.0481666281264517063 1.916
## goout -0.1541718489048080065 0.0471824218926186478 -3.268
## health -0.0719486900316635980 0.0457441237695690589 -1.573
## absences 0.0987090519660819704 0.0464059776718994607 2.127
## Mjob_other -0.1542981400543800841 0.0545342819733972278 -2.829
## Mjob_at_home -0.1090795021835916090 0.0563604725016343538 -1.935
## Mjob_teacher -0.1800358290216341473 0.0531472574487146590 -3.387
## Fjob_teacher 0.0890607867086985333 0.0470144105610392424 1.894
## Pr(>|t|)
## (Intercept) 1.00000
## sex_female 0.02538 *
## age 0.07045 .
## famsize_GT3 0.05724 .
## Medu 0.10778
## studytime 0.00954 **
## failures 0.00000000516 ***
## schoolsup 0.02438 *
## famsup 0.01594 *
## paid 0.13302
## romantic 0.03095 *
## freetime 0.05616 .
## goout 0.00118 **
## health 0.11660
## absences 0.03407 *
## Mjob_other 0.00491 **
## Mjob_at_home 0.05369 .
## Mjob_teacher 0.00078 ***
## Fjob_teacher 0.05895 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.875 on 375 degrees of freedom
## Multiple R-squared: 0.2694, Adjusted R-squared: 0.2343
## F-statistic: 7.681 on 18 and 375 DF, p-value: < 0.00000000000000022
In the final reduced linear model with scaled values, failures and Mjob_teacher are the most significant predictor variables (three stars, p < 0.001). failures has the most negative coefficient (-0.29), which is also the largest in absolute value: holding the other predictors fixed, an increase of one standard deviation in failures is associated with a decrease of about 0.29 standard deviations in G3. studytime has the largest positive coefficient (0.13). The p-value of the overall F-test is below 2.2e-16, so the model as a whole is significant, although with an adjusted R-squared of 0.2343 it explains only about a quarter of the variance in G3.
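Because all variables are standardized, the estimates can be ranked by absolute size. A small sketch, with the estimates copied from the summary and rounded to four decimals:

```r
# Standardized estimates copied (rounded) from the model summary
est <- c(sex_female = -0.1121, age = -0.0902, famsize_GT3 = -0.0866,
         Medu = 0.0984, studytime = 0.1254, failures = -0.2925,
         schoolsup = -0.1063, famsup = -0.1175, paid = 0.0729,
         romantic = -0.0999, freetime = 0.0923, goout = -0.1542,
         health = -0.0719, absences = 0.0987, Mjob_other = -0.1543,
         Mjob_at_home = -0.1091, Mjob_teacher = -0.1800,
         Fjob_teacher = 0.0891)

# Order by magnitude, keeping the sign for interpretation
ranked <- est[order(abs(est), decreasing = TRUE)]
head(ranked, 5)
```

The five largest effects are failures, Mjob_teacher, Mjob_other, goout and studytime; only the last of these is positive.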
Here are the diagnostic plots of the reduced model.
plot(fit_scale)
Here are a few observations. In the Residuals vs Fitted plot, the line is almost horizontal, which indicates no systematic pattern in the residuals, so the linearity assumption looks reasonable. In the Normal Q-Q plot, the points follow the reference line for theoretical quantiles between roughly -1 and 1.5, so the residuals are approximately normal in that range. Below -1 the points curve away from the reference line, most clearly around -2, which indicates that the lower tail of the residuals is not normally distributed.
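The visual reading of the Q-Q plot can be backed up numerically. The sketch below uses a stand-in model on the built-in mtcars data, since it is only an illustration; in practice fit_demo would be replaced by fit_scale. The Shapiro-Wilk test is one possible check, not something the analysis above relies on.

```r
# Illustrative fit on built-in data; substitute fit_scale in practice
fit_demo <- lm(mpg ~ wt + hp, data = mtcars)
res <- rstandard(fit_demo)            # standardized residuals

# Same coordinates that the Normal Q-Q plot draws, without plotting
qq <- qqnorm(res, plot.it = FALSE)

# Gap between sample and theoretical quantiles in the lower tail
lower <- qq$x < -1
tail_gap <- qq$y[lower] - qq$x[lower]

# Formal test of normality of the residuals
sw <- shapiro.test(res)
sw$p.value
```

Large gaps in tail_gap or a small Shapiro-Wilk p-value would point in the same direction as a Q-Q plot whose lower tail bends away from the reference line.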