library(dplyr)
library(ggplot2)
library(haven)
library(polycor)
library(corrplot)
library(tidyverse)
library(psych)
library(kableExtra)
library(sjPlot)
library(cowplot)
library(ggcorrplot)
library(car)
path = file.path("C:/Users/Zver/Documents/R projects", "data analysis 2021", "BSGCANM6.sav")
In this paper, I conduct a factor analysis of the variables describing students’ perception of mathematics (as a school subject that has to be learned and studied in class) in order to use the resulting factors to predict the students’ math achievement. The underlying assumption of this workflow is that the factors derived from the 24 closely connected variables form a set of interpretable composite measures. The research questions I am concerned with are formulated as:
canada = read_sav(path)
canada <- canada %>%
select("BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I", "BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F", "BSBM18G", "BSBM18H", "BSBM18I", "BSBM18J", "BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E", "BSMMAT01")
The data used for this purpose are taken from the Trends in International Mathematics and Science Study (TIMSS) for Canada, 2015 (8th grade). In total, the dataset consists of 8,757 observations of 25 variables (24 are used for the factor analysis, and 1 is the dependent variable in the later regression model). However, each variable used in the factor analysis contains missing values (from 219 to 366 observations). To deal with this, observations with missing values were removed; the new dataset considered below has 7,662 observations of 25 variables (note that the dependent variable has no missing values).
canada <- na.omit(canada) %>%
as.data.frame()
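As a minimal sketch of this listwise deletion step, the toy data frame below (made-up values, not the TIMSS data) shows how the number of missing values per variable and the effect of na.omit() can be checked. The names `toy`, `missing_per_column`, and `toy_complete` are illustrative assumptions.

```r
# Toy data frame standing in for the survey items (hypothetical values)
toy <- data.frame(
  item_a = c(1, 2, NA, 4, 3),
  item_b = c(2, NA, 3, 1, 4),
  score  = c(510, 540, 495, 560, 530)
)

# Number of missing values in each column
missing_per_column <- colSums(is.na(toy))
print(missing_per_column)

# Listwise deletion: drop every row with at least one missing value
toy_complete <- na.omit(toy)

# Number of rows before and after the deletion
print(c(before = nrow(toy), after = nrow(toy_complete)))
```

The same two checks, applied to the real data before and after the deletion, would show how many observations each variable loses.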
The table below presents the original names of the 25 selected variables, their interpretable names (the actual renaming is done later), their meanings, and their mean values. To interpret the last column, note that the possible values of each item vary from 1 (agree a lot) to 4 (disagree a lot). The student achievement variable is described in more detail below.
variables = c("BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I", "BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F", "BSBM18G", "BSBM18H", "BSBM18I", "BSBM18J", "BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E", "BSMMAT01")
statements = c("I enjoy learning mathematics", "I wish I did not have to study mathematics", "Mathematics is boring", "I learn many interesting things in mathematics", "I like mathematics", "I like any schoolwork that involves numbers", "I like to solve mathematics problems", "I look forward to mathematics class", "Mathematics is one of my favorite subjects", "I know what my teacher expects me to do", "My teacher is easy to understand", "I am interested in what my teacher says", "My teacher gives me interesting things to do", "My teacher has clear answers to my questions", "My teacher is good at explaining mathematics", "My teacher lets me show what I have learned", "My teacher does a variety of things to help us learn", "My teacher tells me how to do better when I make a mistake", "My teacher listens to what I have to say", "I usually do well in mathematics", "Mathematics is more difficult for me than for many of my classmates", "Mathematics is not one of my strengths", "I learn things quickly in mathematics", "Mathematics makes me nervous", "Student's math achievement")
## mean of every selected variable, rounded to 2 digits
mean <- unname(round(sapply(variables, function(v) mean(canada[[v]])), digits = 2))
renamed <- c("enjoy_learning_math", "wish_not_to_study", "math_is_boring", "learning_interesting_things", "like_math", "like_schoolwork_with_numbers", "like_solving_maths", "look_forward_for_classes", "one_of_favorite_classes", "know_teachers_expectations", "teacher_is_understandable", "interested_in_teachers_words", "teacher_gives_interesting_things", "teacher_has_clear_answers", "teacher_is_good_at_explaning", "teacher_ask_to_show_knowledge", "teacher_does_a_lot_to_learn", "teacher_tells_how_to_do_better", "teacher_listens_to_me", "i_usually_do_well", "maths_is_harder_for_me_than_others", "maths_is_not_my_strength", "i_learn_quickly", "maths_makes_me_nervous", "math_achievement")
data.frame(variables, renamed, statements, mean, stringsAsFactors = FALSE) %>% kable() %>% kable_styling(bootstrap_options=c("bordered", "responsive","striped"), full_width = FALSE)
| variables | renamed | statements | mean |
|---|---|---|---|
| BSBM17A | enjoy_learning_math | I enjoy learning mathematics | 2.03 |
| BSBM17B | wish_not_to_study | I wish I did not have to study mathematics | 2.74 |
| BSBM17C | math_is_boring | Mathematics is boring | 2.59 |
| BSBM17D | learning_interesting_things | I learn many interesting things in mathematics | 2.02 |
| BSBM17E | like_math | I like mathematics | 2.14 |
| BSBM17F | like_schoolwork_with_numbers | I like any schoolwork that involves numbers | 2.42 |
| BSBM17G | like_solving_maths | I like to solve mathematics problems | 2.33 |
| BSBM17H | look_forward_for_classes | I look forward to mathematics class | 2.58 |
| BSBM17I | one_of_favorite_classes | Mathematics is one of my favorite subjects | 2.50 |
| BSBM18A | know_teachers_expectations | I know what my teacher expects me to do | 1.56 |
| BSBM18B | teacher_is_understandable | My teacher is easy to understand | 1.74 |
| BSBM18C | interested_in_teachers_words | I am interested in what my teacher says | 1.91 |
| BSBM18D | teacher_gives_interesting_things | My teacher gives me interesting things to do | 2.09 |
| BSBM18E | teacher_has_clear_answers | My teacher has clear answers to my questions | 1.78 |
| BSBM18F | teacher_is_good_at_explaning | My teacher is good at explaining mathematics | 1.65 |
| BSBM18G | teacher_ask_to_show_knowledge | My teacher lets me show what I have learned | 1.80 |
| BSBM18H | teacher_does_a_lot_to_learn | My teacher does a variety of things to help us learn | 1.67 |
| BSBM18I | teacher_tells_how_to_do_better | My teacher tells me how to do better when I make a mistake | 1.67 |
| BSBM18J | teacher_listens_to_me | My teacher listens to what I have to say | 1.62 |
| BSBM19A | i_usually_do_well | I usually do well in mathematics | 1.78 |
| BSBM19B | maths_is_harder_for_me_than_others | Mathematics is more difficult for me than for many of my classmates | 2.90 |
| BSBM19C | maths_is_not_my_strength | Mathematics is not one of my strengths | 2.76 |
| BSBM19D | i_learn_quickly | I learn things quickly in mathematics | 2.04 |
| BSBM19E | maths_makes_me_nervous | Mathematics makes me nervous | 2.81 |
| BSMMAT01 | math_achievement | Student’s math achievement | 537.14 |
Next, I examine each set of questions (BSBM17, BSBM18, and BSBM19) separately in terms of the distributions of its variables. First, let’s look at the 17th block, which asks “How much do you agree with these statements about learning mathematics?”:
one <- ggplot(canada,
aes(as.numeric(BSBM17A))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I enjoy learning mathematics
",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
two <- ggplot(canada,
aes(as.numeric(BSBM17B))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I wish I did not
have to study mathematics",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
three <- ggplot(canada,
aes(as.numeric(BSBM17C))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "Mathematics is boring
",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
four <- ggplot(canada,
aes(as.numeric(BSBM17D))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I learn many interesting things
in mathematics",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
five <- ggplot(canada,
aes(as.numeric(BSBM17E))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I like mathematics
",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
six <- ggplot(canada,
aes(as.numeric(BSBM17F))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I like any schoolwork
that involves numbers",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
seven <- ggplot(canada,
aes(as.numeric(BSBM17G))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I like to solve
mathematics problems",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
eight <- ggplot(canada,
aes(as.numeric(BSBM17H))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I look forward to
mathematics class",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
nine <- ggplot(canada,
aes(as.numeric(BSBM17I))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "Mathematics is one of
my favorite subjects",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black"))
plot_grid(one, two, three, four, five, six, seven, eight, nine)
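The nine plots above differ only in the column and the subtitle, so the same result could be produced with a small helper. The sketch below is one possible way to do it, not the code used for the figures; the function name `likert_bar`, the toy data, and the labels are assumptions made for illustration.

```r
library(ggplot2)

# Hypothetical helper: one bar chart for a single Likert item.
# `data` is a data frame, `item` a column name, `title` the statement text.
likert_bar <- function(data, item, title) {
  ggplot(data, aes(x = as.numeric(.data[[item]]))) +
    geom_bar(fill = "#60AB9A") +
    theme_bw() +
    labs(subtitle = title, x = "", y = "") +
    theme(plot.subtitle = element_text(size = 8, hjust = 0.5,
                                       face = "italic", color = "black"))
}

# Toy data with two items on the 1-4 agreement scale (made-up values)
toy <- data.frame(q1 = c(1, 1, 2, 3, 4, 2), q2 = c(4, 3, 3, 2, 1, 1))
labels <- c(q1 = "First statement", q2 = "Second statement")

# One plot per item, collected in a list
plots <- lapply(names(labels), function(v) likert_bar(toy, v, labels[[v]]))
```

The resulting list can be handed to cowplot's plot_grid() through its `plotlist` argument, which keeps each block of questions to a few lines of code.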
The 18th block (A-J) corresponds to the question “How much do you agree with these statements about your mathematics lessons?”:
one <- ggplot(canada,
aes(as.numeric(BSBM18A))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I know what my teacher expects me to do",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
two <- ggplot(canada,
aes(as.numeric(BSBM18B))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher is easy to understand",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
three <- ggplot(canada,
aes(as.numeric(BSBM18C))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I am interested in what my teacher says",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
four <- ggplot(canada,
aes(as.numeric(BSBM18D))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher gives me interesting things to do",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
five <- ggplot(canada,
aes(as.numeric(BSBM18E))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher has clear answers to my questions",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
six <- ggplot(canada,
aes(as.numeric(BSBM18F))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher is good at explaining mathematics",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
seven <- ggplot(canada,
aes(as.numeric(BSBM18G))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher lets me show what I have learned",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
eight <- ggplot(canada,
aes(as.numeric(BSBM18H))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher does a variety of things to help us learn",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
nine <- ggplot(canada,
aes(as.numeric(BSBM18I))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher tells me how to do better when I make a mistake",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
ten <- ggplot(canada,
aes(as.numeric(BSBM18J))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "My teacher listens to what I have to say",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
plot_grid(one, two, three, four, five, six, seven, eight, nine, ten, ncol = 2)
The last (19th) block is grouped under the question “How much do you agree with these statements about mathematics?”:
one <- ggplot(canada,
aes(as.numeric(BSBM19A))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I usually
do well in mathematics",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
two <- ggplot(canada,
aes(as.numeric(BSBM19B))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "Maths is more difficult for me
than for many of my classmates",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
three <- ggplot(canada,
aes(as.numeric(BSBM19C))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "Mathematics is not
one of my strengths",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
four <- ggplot(canada,
aes(as.numeric(BSBM19D))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "I learn things
quickly in mathematics",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
five <- ggplot(canada,
aes(as.numeric(BSBM19E))) +
geom_bar(fill = "#60AB9A") +
theme_bw() +
labs(subtitle = "Mathematics makes me nervous
",
x = "",
y = "") +
theme(plot.subtitle=element_text(size=8, hjust=0.5, face="italic", color="black")) +
theme(axis.text=element_text(size=6))
plot_grid(one, two, three, four, five, nrow = 2)
The distribution of the dependent variable, which interests us most:
canada %>%
ggplot(aes(as.numeric(BSMMAT01))) +
geom_histogram(bins = 40, fill = "#60AB9A") +
theme_bw() +
labs(x = "
student's math achievements",
y = "",
title = "The distribution of the math achievements across the data,",
subtitle = "dashed line shows the variable mean = 537.14
") +
geom_vline(aes(xintercept = 537.14), color = "red", linetype = "longdash", size = 1.1)
Finally, I create 2 new datasets, one for the factor analysis and one for the regression model, after first renaming the variables:
canada1 <- rename(canada,
##first block of variables
enjoy_learning_math = BSBM17A,
wish_not_to_study = BSBM17B,
math_is_boring = BSBM17C,
learning_interesting_things = BSBM17D,
like_math = BSBM17E,
like_schoolwork_with_numbers = BSBM17F,
like_solving_maths = BSBM17G,
look_forward_for_classes = BSBM17H,
one_of_favorite_classes = BSBM17I,
##second block of variables
know_teachers_expectations = BSBM18A,
teacher_is_understandable = BSBM18B,
interested_in_teachers_words = BSBM18C,
teacher_gives_interesting_things = BSBM18D,
teacher_has_clear_answers = BSBM18E,
teacher_is_good_at_explaning = BSBM18F,
teacher_ask_to_show_knowledge = BSBM18G,
teacher_does_a_lot_to_learn = BSBM18H,
teacher_tells_how_to_do_better = BSBM18I,
teacher_listens_to_me = BSBM18J,
##third block of variables
i_usually_do_well = BSBM19A,
maths_is_harder_for_me_than_others = BSBM19B,
maths_is_not_my_strength = BSBM19C,
i_learn_quickly = BSBM19D,
maths_makes_me_nervous = BSBM19E,
##dependent variable
math_achievement = BSMMAT01)
canada_reggression <- canada1
canada_factors <- canada1[,1:24]
Before running the exploratory factor analysis, we need to consider the correlation matrix: in this way, it is possible to make a first guess about the existing latent factors. To make the picture clearer, I changed the variables’ names (back again!) so that they contain only the question number (17-19) and the letter of the respective item.
canada <- as.data.frame(lapply(canada, as.numeric))
names(canada) = gsub(pattern = "BSBM", replacement = " ", x = names(canada))
names(canada) = gsub(pattern = "X.", replacement = "", x = names(canada))
hetcor <- hetcor(canada)
cor <- round(hetcor$correlations, 1)
ggcorrplot(cor, hc.order = TRUE,
type = "lower",
outline.color = "black",
ggtheme = ggplot2::theme_bw,
colors =c("#E74D4D", "white", "#9ECE9A"),
lab = TRUE,
lab_col = "black",
lab_size = 2,
tl.cex = 8,
title = "Correlation matrix of the selected variables",
legend.title = "Coefficients:
")
The observations based on this matrix are:
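Parallel analysis is a technique for deciding how many latent factors to retain: the eigenvalues of the observed data are compared with those obtained from random data of the same size. Note that the second positional argument of fa.parallel() is n.obs, not the number of iterations, so the call below treats the correlation matrix as based on 500 observations and leaves the number of simulations at its default.
fa.parallel(hetcor$correlations, n.obs = 500, fa = "fa")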
## Parallel analysis suggests that the number of factors = 3 and the number of components = NA
The parallel analysis suggests that 3 factors should be extracted. In the plot, the blue line crosses the horizontal line at the third factor and then approaches the dashed line. I therefore use 3 factors when choosing the best factor analysis parameters: the rotation (none vs. oblimin vs. varimax) and the factoring method (minres vs. wls vs. ml).
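To make the logic of parallel analysis concrete, here is a minimal base R sketch on simulated data (not the TIMSS data). It is a simplification: it compares eigenvalues of the full correlation matrix, whereas the factor-based variant requested with fa = "fa" works with factor eigenvalues. All object names, the two-factor structure, and the 95th percentile threshold are assumptions chosen for the illustration.

```r
set.seed(1)

# Simulated data: 6 items driven by 2 latent factors (3 items each)
n <- 500
f1 <- rnorm(n)
f2 <- rnorm(n)
items <- cbind(
  0.8 * f1 + rnorm(n, sd = 0.6),
  0.8 * f1 + rnorm(n, sd = 0.6),
  0.8 * f1 + rnorm(n, sd = 0.6),
  0.8 * f2 + rnorm(n, sd = 0.6),
  0.8 * f2 + rnorm(n, sd = 0.6),
  0.8 * f2 + rnorm(n, sd = 0.6)
)

# Eigenvalues of the observed correlation matrix
observed <- eigen(cor(items), symmetric = TRUE, only.values = TRUE)$values

# Eigenvalues of correlation matrices of pure-noise data with the same shape
n_sim <- 200
random <- replicate(n_sim, {
  noise <- matrix(rnorm(n * ncol(items)), nrow = n)
  eigen(cor(noise), symmetric = TRUE, only.values = TRUE)$values
})

# 95th percentile of the random eigenvalues, position by position
threshold <- apply(random, 1, quantile, probs = 0.95)

# Keep the leading eigenvalues that exceed their random counterparts
n_retained <- sum(cumprod(observed > threshold))
print(n_retained)
```

Only factors whose eigenvalues are larger than what random data of the same size would produce are retained, which is the comparison the scree plot of the parallel analysis visualizes.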
Before comparing the results of the different analysis setups, I make sure that the variables are numeric. Next, I run the 9 analyses and summarize their results in the table below:
canada_factors <-as.data.frame(lapply(canada_factors, as.numeric))
## creating FAs
none_minres <- fa(canada_factors,
nfactors = 3,
rotate = "none",
fm = "minres",
cor = "mixed")
none_wls <- fa(canada_factors,
nfactors = 3,
rotate = "none",
fm = "wls",
cor = "mixed")
none_ml <- fa(canada_factors,
nfactors = 3,
rotate = "none",
fm = "ml",
cor = "mixed")
oblimin_minres <- fa(canada_factors,
nfactors = 3,
rotate = "oblimin",
fm = "minres",
cor = "mixed")
oblimin_wls <- fa(canada_factors,
nfactors = 3,
rotate = "oblimin",
fm = "wls",
cor = "mixed")
oblimin_ml <- fa(canada_factors,
nfactors = 3,
rotate = "oblimin",
fm = "ml",
cor = "mixed")
varimax_minres <- fa(canada_factors,
nfactors = 3,
rotate = "varimax",
fm = "minres",
cor = "mixed")
varimax_wls <- fa(canada_factors,
nfactors = 3,
rotate = "varimax",
fm = "wls",
cor = "mixed")
varimax_ml <- fa(canada_factors,
nfactors = 3,
rotate = "varimax",
fm = "ml",
cor = "mixed")
## making a table through data.frame
parameters <- c("none_minres", "none_wls", "none_ml", "oblimin_minres", "oblimin_wls", "oblimin_ml", "varimax_minres", "varimax_wls", "varimax_ml")
models <- list(none_minres, none_wls, none_ml, oblimin_minres, oblimin_wls, oblimin_ml, varimax_minres, varimax_wls, varimax_ml)
RMSR <- round(sapply(models, function(m) m$rms), digits = 2)
## $RMSEA may also hold the confidence bounds, so only its first element (the point estimate) is kept
RMSEA <- round(sapply(models, function(m) unname(m$RMSEA[1])), digits = 2)
Tucker_Lewis_index <- round(sapply(models, function(m) m$TLI), digits = 2)
BIC <- sapply(models, function(m) m$BIC)
data.frame(parameters,
RMSR,
RMSEA,
Tucker_Lewis_index,
BIC,
stringsAsFactors = FALSE) %>%
filter(RMSEA < 0.1) %>%
count(parameters, RMSR, RMSEA, Tucker_Lewis_index, BIC) %>%
select(!n) %>%
kable() %>%
kable_styling(bootstrap_options=c("bordered",
"responsive",
"striped"),
full_width = FALSE)
| parameters | RMSR | RMSEA | Tucker_Lewis_index | BIC |
|---|---|---|---|---|
| none_minres | 0.02 | 0.08 | 0.93 | 8120.733 |
| none_ml | 0.02 | 0.08 | 0.93 | 7919.623 |
| none_wls | 0.02 | 0.08 | 0.93 | 8045.583 |
| oblimin_minres | 0.02 | 0.08 | 0.93 | 8120.733 |
| oblimin_ml | 0.02 | 0.08 | 0.93 | 7919.623 |
| oblimin_wls | 0.02 | 0.08 | 0.93 | 8045.583 |
| varimax_minres | 0.02 | 0.08 | 0.93 | 8120.733 |
| varimax_ml | 0.02 | 0.08 | 0.93 | 7919.623 |
| varimax_wls | 0.02 | 0.08 | 0.93 | 8045.583 |
As seen, the values of RMSR = 0.02 (< 0.05), RMSEA = 0.08 (> 0.05), and TLI = 0.93 (> 0.9) are identical across the variations. For the rotations this is expected, because a rotation changes only the loadings and not the fit of the model. The only difference lies in the BIC values: the models with the ml factoring method have the lowest ones. Overall, the RMSEA is the only index that indicates a merely sufficient (not “good”) fit, so the derived factors can still be considered acceptable.
The next step is to compare the loadings. First, the loadings without rotation:
print(none_ml$loadings,cutoff = 0.4)
##
## Loadings:
## ML1 ML2 ML3
## enjoy_learning_math 0.901
## wish_not_to_study -0.679
## math_is_boring -0.738
## learning_interesting_things 0.757
## like_math 0.909
## like_schoolwork_with_numbers 0.783
## like_solving_maths 0.825
## look_forward_for_classes 0.861
## one_of_favorite_classes 0.881
## know_teachers_expectations 0.551 0.418
## teacher_is_understandable 0.630 0.591
## interested_in_teachers_words 0.712 0.433
## teacher_gives_interesting_things 0.690 0.440
## teacher_has_clear_answers 0.590 0.643
## teacher_is_good_at_explaning 0.635 0.634
## teacher_ask_to_show_knowledge 0.555 0.532
## teacher_does_a_lot_to_learn 0.557 0.614
## teacher_tells_how_to_do_better 0.522 0.630
## teacher_listens_to_me 0.513 0.627
## i_usually_do_well 0.730
## maths_is_harder_for_me_than_others -0.599 0.525
## maths_is_not_my_strength -0.694 0.415 0.429
## i_learn_quickly 0.731
## maths_makes_me_nervous -0.475
##
## ML1 ML2 ML3
## SS loadings 11.752 4.011 1.253
## Proportion Var 0.490 0.167 0.052
## Cumulative Var 0.490 0.657 0.709
…oblimin rotation loadings:
print(oblimin_ml$loadings,cutoff = 0.4)
##
## Loadings:
## ML1 ML2 ML3
## enjoy_learning_math 0.901
## wish_not_to_study -0.601
## math_is_boring -0.753
## learning_interesting_things 0.784
## like_math 0.913
## like_schoolwork_with_numbers 0.836
## like_solving_maths 0.822
## look_forward_for_classes 0.867
## one_of_favorite_classes 0.820
## know_teachers_expectations 0.662
## teacher_is_understandable 0.879
## interested_in_teachers_words 0.403 0.613
## teacher_gives_interesting_things 0.406 0.606
## teacher_has_clear_answers 0.907
## teacher_is_good_at_explaning 0.913
## teacher_ask_to_show_knowledge 0.748
## teacher_does_a_lot_to_learn 0.815
## teacher_tells_how_to_do_better 0.827
## teacher_listens_to_me 0.830
## i_usually_do_well -0.731
## maths_is_harder_for_me_than_others 0.906
## maths_is_not_my_strength 0.826
## i_learn_quickly -0.638
## maths_makes_me_nervous 0.623
##
## ML1 ML2 ML3
## SS loadings 6.431 6.287 3.047
## Proportion Var 0.268 0.262 0.127
## Cumulative Var 0.268 0.530 0.657
…varimax rotation loadings:
print(varimax_ml$loadings,cutoff = 0.4)
##
## Loadings:
## ML2 ML1 ML3
## enjoy_learning_math 0.834
## wish_not_to_study -0.592
## math_is_boring -0.692
## learning_interesting_things 0.693
## like_math 0.849
## like_schoolwork_with_numbers 0.754
## like_solving_maths 0.768
## look_forward_for_classes 0.789
## one_of_favorite_classes 0.791 0.426
## know_teachers_expectations 0.652
## teacher_is_understandable 0.845
## interested_in_teachers_words 0.708 0.449
## teacher_gives_interesting_things 0.701 0.441
## teacher_has_clear_answers 0.865
## teacher_is_good_at_explaning 0.880
## teacher_ask_to_show_knowledge 0.742
## teacher_does_a_lot_to_learn 0.806
## teacher_tells_how_to_do_better 0.805
## teacher_listens_to_me 0.801
## i_usually_do_well 0.751
## maths_is_harder_for_me_than_others -0.847
## maths_is_not_my_strength -0.827
## i_learn_quickly 0.430 0.687
## maths_makes_me_nervous -0.602
##
## ML2 ML1 ML3
## SS loadings 6.810 6.361 3.845
## Proportion Var 0.284 0.265 0.160
## Cumulative Var 0.284 0.549 0.709
To start with, the baseline model without rotation should not be considered: all the variables load on its first factor, which makes the SS loadings highly unequal. Next, comparing the varimax and oblimin rotations, I choose the former; its mean item complexity is 1.4. On the other hand, it should be said that the oblimin model has fewer cross-loadings above the 0.4 cutoff (two items versus four under varimax).
Still, given that the SS loadings and the proportions of explained variance are roughly similar, the varimax solution was chosen. To finish this discussion, according to Kaiser’s rule it is legitimate to keep all three factors: the SS loadings (eigenvalues) are > 1 for each factor.
The full output of the chosen model:
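The quantities used in this comparison can be recomputed by hand. The sketch below assumes that the reported item complexity is Hofmann's index, which is consistent with the values in the full varimax output, and uses rounded loadings and SS loadings copied from that output; the function and object names are illustrative.

```r
# Hofmann's item complexity: (sum of squared loadings)^2 / sum of loadings^4.
# A value of 1 means that the item loads on a single factor only.
item_complexity <- function(loadings) {
  sum(loadings^2)^2 / sum(loadings^4)
}

# Rounded varimax loadings (ML2, ML1, ML3) of two items from the full output
enjoy_learning_math       <- c(0.26, 0.83, 0.34)
teacher_has_clear_answers <- c(0.86, 0.14, 0.12)

print(round(item_complexity(enjoy_learning_math), 1))
print(round(item_complexity(teacher_has_clear_answers), 1))

# SS loadings of the varimax solution, as printed with the loadings
ss_loadings <- c(ML2 = 6.810, ML1 = 6.361, ML3 = 3.845)
n_items <- 24

# Kaiser's rule as applied in the text: keep factors with SS loading > 1
kaiser_keep <- ss_loadings > 1

# Proportion of variance = SS loading / number of items
proportion_var <- ss_loadings / n_items
cumulative_var <- cumsum(proportion_var)
print(round(rbind(proportion_var, cumulative_var), 3))
```

Because the loadings are rounded to two digits, the recomputed complexities agree with the reported ones only to the first decimal place.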
varimax_ml
## Factor Analysis using method = ml
## Call: fa(r = canada_factors, nfactors = 3, rotate = "varimax", fm = "ml",
## cor = "mixed")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 h2 u2 com
## enjoy_learning_math 0.26 0.83 0.34 0.88 0.118 1.5
## wish_not_to_study -0.18 -0.59 -0.33 0.49 0.505 1.8
## math_is_boring -0.20 -0.69 -0.28 0.60 0.401 1.5
## learning_interesting_things 0.39 0.69 0.11 0.65 0.352 1.6
## like_math 0.22 0.85 0.39 0.92 0.083 1.5
## like_schoolwork_with_numbers 0.20 0.75 0.28 0.69 0.313 1.4
## like_solving_maths 0.19 0.77 0.36 0.76 0.243 1.6
## look_forward_for_classes 0.34 0.79 0.24 0.80 0.202 1.6
## one_of_favorite_classes 0.21 0.79 0.43 0.85 0.150 1.7
## know_teachers_expectations 0.65 0.19 0.19 0.50 0.503 1.3
## teacher_is_understandable 0.84 0.17 0.19 0.78 0.224 1.2
## interested_in_teachers_words 0.71 0.45 0.06 0.71 0.292 1.7
## teacher_gives_interesting_things 0.70 0.44 0.04 0.69 0.313 1.7
## teacher_has_clear_answers 0.86 0.14 0.12 0.78 0.217 1.1
## teacher_is_good_at_explaning 0.88 0.17 0.15 0.83 0.174 1.1
## teacher_ask_to_show_knowledge 0.74 0.19 0.09 0.60 0.404 1.2
## teacher_does_a_lot_to_learn 0.81 0.19 0.03 0.69 0.312 1.1
## teacher_tells_how_to_do_better 0.80 0.15 0.02 0.67 0.329 1.1
## teacher_listens_to_me 0.80 0.14 0.03 0.66 0.339 1.1
## i_usually_do_well 0.19 0.39 0.75 0.75 0.245 1.7
## maths_is_harder_for_me_than_others -0.05 -0.25 -0.85 0.78 0.217 1.2
## maths_is_not_my_strength -0.06 -0.39 -0.83 0.84 0.162 1.4
## i_learn_quickly 0.20 0.43 0.69 0.70 0.304 1.9
## maths_makes_me_nervous -0.06 -0.23 -0.60 0.42 0.582 1.3
##
## ML2 ML1 ML3
## SS loadings 6.81 6.36 3.84
## Proportion Var 0.28 0.27 0.16
## Cumulative Var 0.28 0.55 0.71
## Proportion Explained 0.40 0.37 0.23
## Cumulative Proportion 0.40 0.77 1.00
##
## Mean item complexity = 1.4
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 24.46 with Chi Square of 187201.9
## The degrees of freedom for the model are 207 and the objective function was 1.28
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic number of observations is 7662 with the empirical chi square 1815.99 with prob < 8.5e-255
## The total number of observations was 7662 with Likelihood Chi Square = 9771.04 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.932
## RMSEA index = 0.078 and the 90 % confidence intervals are 0.076 0.079
## BIC = 7919.62
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML2 ML1 ML3
## Correlation of (regression) scores with factors 0.97 0.96 0.94
## Multiple R square of scores with factors 0.95 0.93 0.89
## Minimum correlation of possible factor scores 0.90 0.86 0.78
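The fit indices in this output can be approximately reproduced from the two reported chi-square values. The sketch below uses common textbook formulas for the degrees of freedom, BIC, RMSEA, and TLI; the package's internal computation may differ in small details, and the inputs are rounded, so agreement is expected only up to rounding. Object names are illustrative.

```r
# Values copied from the output above
n_obs     <- 7662
n_items   <- 24
n_factors <- 3
chi_model <- 9771.04    # likelihood chi-square of the 3-factor model
chi_null  <- 187201.9   # chi-square of the null model

# Degrees of freedom of the null model and of the factor model
df_null  <- n_items * (n_items - 1) / 2
df_model <- ((n_items - n_factors)^2 - (n_items + n_factors)) / 2

# BIC as chi-square minus df * log(N)
bic <- chi_model - df_model * log(n_obs)

# RMSEA (textbook form)
rmsea <- sqrt(max(chi_model - df_model, 0) / (df_model * (n_obs - 1)))

# Tucker-Lewis index from the chi-square / df ratios
ratio_null  <- chi_null / df_null
ratio_model <- chi_model / df_model
tli <- (ratio_null - ratio_model) / (ratio_null - 1)

print(c(df_null = df_null, df_model = df_model))
print(round(c(BIC = bic, RMSEA = rmsea, TLI = tli), 3))
```

None of these formulas involves the rotation, which is why the rotated and unrotated solutions of the same factoring method share the same fit indices.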
The large number of variables results in the awkward diagram presented below:
fa.diagram(varimax_ml, simple = T)
Though the graph is hard to read, it is clear that the factors correspond strictly to the blocks of questions discussed above. Thus, ML2 is related to the 18th block (10 variables, or 10 questions), ML1 is related to the 17th block (9 variables, or 9 questions), and ML3 is related to the 19th block (5 variables, or 5 questions). I name these factors according to the blocks’ names:
1. learning maths (ML1) stands for the student’s attitude towards the process of learning mathematics. It includes enjoyment and the view of mathematics as a school subject.
2. teacher’s role (ML2) stands for the student’s attitude towards the teacher’s performance. It covers both how the teacher teaches and how he or she interacts with the pupils.
3. maths itself (ML3) stands for the student’s perception of mathematics as a personal strength or weakness.
In addition, a good indicator of the factors’ quality is that each of them includes more than 3 variables.
In this part, I check the reliability of the factors using Cronbach’s alpha: if the value for a factor is > 0.7, then its reliability is good.
ML1 <- canada_factors[, 1:9]
alpha(ML1, check.keys = T)
##
## Reliability analysis
## Call: alpha(x = ML1, check.keys = T)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.94 0.94 0.94 0.64 16 0.001 2.3 0.82 0.64
##
## lower alpha upper 95% confidence boundaries
## 0.94 0.94 0.94
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se
## enjoy_learning_math 0.93 0.93 0.93 0.62 13 0.0012
## wish_not_to_study- 0.94 0.94 0.94 0.66 16 0.0010
## math_is_boring- 0.93 0.94 0.93 0.65 15 0.0011
## learning_interesting_things 0.94 0.94 0.94 0.65 15 0.0011
## like_math 0.93 0.93 0.92 0.61 13 0.0013
## like_schoolwork_with_numbers 0.93 0.93 0.93 0.64 14 0.0011
## like_solving_maths 0.93 0.93 0.93 0.63 14 0.0012
## look_forward_for_classes 0.93 0.93 0.93 0.63 13 0.0012
## one_of_favorite_classes 0.93 0.93 0.93 0.62 13 0.0012
## var.r med.r
## enjoy_learning_math 0.0082 0.61
## wish_not_to_study- 0.0065 0.66
## math_is_boring- 0.0100 0.66
## learning_interesting_things 0.0083 0.66
## like_math 0.0072 0.61
## like_schoolwork_with_numbers 0.0092 0.63
## like_solving_maths 0.0091 0.63
## look_forward_for_classes 0.0094 0.63
## one_of_favorite_classes 0.0083 0.62
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## enjoy_learning_math 7662 0.88 0.89 0.88 0.85 2.0 0.93
## wish_not_to_study- 7662 0.72 0.71 0.65 0.63 2.3 1.07
## math_is_boring- 7662 0.78 0.78 0.74 0.72 2.4 0.99
## learning_interesting_things 7662 0.74 0.75 0.70 0.68 2.0 0.87
## like_math 7662 0.90 0.90 0.90 0.87 2.1 1.00
## like_schoolwork_with_numbers 7662 0.81 0.81 0.78 0.76 2.4 0.93
## like_solving_maths 7662 0.85 0.85 0.83 0.80 2.3 1.00
## look_forward_for_classes 7662 0.85 0.85 0.83 0.81 2.6 0.99
## one_of_favorite_classes 7662 0.87 0.86 0.85 0.82 2.5 1.14
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## enjoy_learning_math 0.33 0.42 0.16 0.09 0
## wish_not_to_study 0.17 0.23 0.29 0.31 0
## math_is_boring 0.16 0.30 0.33 0.21 0
## learning_interesting_things 0.30 0.44 0.19 0.06 0
## like_math 0.31 0.38 0.18 0.13 0
## like_schoolwork_with_numbers 0.17 0.37 0.32 0.14 0
## like_solving_maths 0.24 0.35 0.26 0.16 0
## look_forward_for_classes 0.16 0.30 0.34 0.20 0
## one_of_favorite_classes 0.27 0.23 0.24 0.26 0
The alpha of the learning maths factor is 0.94, which indicates excellent reliability (> 0.9).
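In the output above, two items carry a trailing "-" (wish_not_to_study- and math_is_boring-): with check.keys enabled, alpha() reverse-keys items that correlate negatively with the rest of the scale. On a 1 to 4 scale, reversal amounts to (minimum + maximum) - x, that is 5 - x. A minimal sketch on toy responses, where `reverse_key` is a hypothetical helper and the values are not taken from the dataset:

```r
# Toy responses on the 1 (agree a lot) .. 4 (disagree a lot) scale; not from the dataset.
responses <- c(1, 2, 3, 4, 4, 1)

# Hypothetical helper: reverse-key an item given the end points of the scale.
reverse_key <- function(x, scale_min = 1, scale_max = 4) {
  (scale_min + scale_max) - x
}

reversed <- reverse_key(responses)
print(reversed)
# 4 3 2 1 1 4
```

Applying the helper twice returns the original responses, so the transformation loses no information.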
ML2 <- canada_factors[, 10:19]
alpha(ML2, check.keys = T)
##
## Reliability analysis
## Call: alpha(x = ML2, check.keys = T)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.93 0.93 0.93 0.57 13 0.0012 1.7 0.64 0.57
##
## lower alpha upper 95% confidence boundaries
## 0.93 0.93 0.93
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N
## know_teachers_expectations 0.93 0.93 0.93 0.59 13
## teacher_is_understandable 0.92 0.92 0.92 0.56 11
## interested_in_teachers_words 0.92 0.92 0.92 0.57 12
## teacher_gives_interesting_things 0.92 0.92 0.92 0.57 12
## teacher_has_clear_answers 0.92 0.92 0.92 0.56 11
## teacher_is_good_at_explaning 0.92 0.92 0.92 0.55 11
## teacher_ask_to_show_knowledge 0.92 0.92 0.92 0.57 12
## teacher_does_a_lot_to_learn 0.92 0.92 0.92 0.56 12
## teacher_tells_how_to_do_better 0.92 0.92 0.92 0.57 12
## teacher_listens_to_me 0.92 0.92 0.92 0.57 12
## alpha se var.r med.r
## know_teachers_expectations 0.0012 0.0036 0.58
## teacher_is_understandable 0.0014 0.0048 0.57
## interested_in_teachers_words 0.0013 0.0054 0.57
## teacher_gives_interesting_things 0.0013 0.0054 0.57
## teacher_has_clear_answers 0.0014 0.0047 0.56
## teacher_is_good_at_explaning 0.0014 0.0040 0.56
## teacher_ask_to_show_knowledge 0.0013 0.0061 0.57
## teacher_does_a_lot_to_learn 0.0013 0.0057 0.56
## teacher_tells_how_to_do_better 0.0013 0.0057 0.57
## teacher_listens_to_me 0.0013 0.0057 0.57
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## know_teachers_expectations 7662 0.66 0.68 0.62 0.60 1.6 0.68
## teacher_is_understandable 7662 0.83 0.82 0.81 0.78 1.7 0.85
## interested_in_teachers_words 7662 0.78 0.77 0.75 0.71 1.9 0.83
## teacher_gives_interesting_things 7662 0.77 0.77 0.74 0.71 2.1 0.89
## teacher_has_clear_answers 7662 0.83 0.83 0.81 0.78 1.8 0.86
## teacher_is_good_at_explaning 7662 0.84 0.84 0.83 0.79 1.6 0.83
## teacher_ask_to_show_knowledge 7662 0.75 0.76 0.72 0.69 1.8 0.80
## teacher_does_a_lot_to_learn 7662 0.79 0.79 0.76 0.74 1.7 0.81
## teacher_tells_how_to_do_better 7662 0.78 0.78 0.75 0.72 1.7 0.81
## teacher_listens_to_me 7662 0.77 0.77 0.74 0.71 1.6 0.79
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## know_teachers_expectations 0.53 0.39 0.06 0.02 0
## teacher_is_understandable 0.48 0.35 0.12 0.05 0
## interested_in_teachers_words 0.35 0.45 0.16 0.05 0
## teacher_gives_interesting_things 0.28 0.42 0.23 0.07 0
## teacher_has_clear_answers 0.46 0.36 0.13 0.05 0
## teacher_is_good_at_explaning 0.54 0.32 0.10 0.05 0
## teacher_ask_to_show_knowledge 0.40 0.43 0.13 0.04 0
## teacher_does_a_lot_to_learn 0.51 0.34 0.10 0.04 0
## teacher_tells_how_to_do_better 0.51 0.35 0.10 0.04 0
## teacher_listens_to_me 0.54 0.34 0.08 0.04 0
The same holds for the second factor, the teacher’s role: its alpha is 0.93, still > 0.9.
ML3 <- canada_factors[, 20:24]
alpha(ML3, check.keys = T)
##
## Reliability analysis
## Call: alpha(x = ML3, check.keys = T)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.88 0.88 0.86 0.59 7.2 0.0022 2.9 0.81 0.61
##
## lower alpha upper 95% confidence boundaries
## 0.87 0.88 0.88
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N
## i_usually_do_well- 0.84 0.84 0.81 0.57 5.4
## maths_is_harder_for_me_than_others 0.83 0.84 0.81 0.57 5.2
## maths_is_not_my_strength 0.83 0.83 0.80 0.55 4.9
## i_learn_quickly- 0.85 0.85 0.82 0.59 5.7
## maths_makes_me_nervous 0.88 0.89 0.86 0.66 7.9
## alpha se var.r med.r
## i_usually_do_well- 0.0029 0.0117 0.56
## maths_is_harder_for_me_than_others 0.0031 0.0144 0.58
## maths_is_not_my_strength 0.0032 0.0121 0.56
## i_learn_quickly- 0.0028 0.0128 0.58
## maths_makes_me_nervous 0.0021 0.0026 0.65
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## i_usually_do_well- 7662 0.83 0.84 0.80 0.74 3.2 0.86
## maths_is_harder_for_me_than_others 7662 0.86 0.85 0.81 0.76 2.9 1.03
## maths_is_not_my_strength 7662 0.88 0.87 0.84 0.79 2.8 1.11
## i_learn_quickly- 7662 0.81 0.82 0.77 0.71 3.0 0.93
## maths_makes_me_nervous 7662 0.72 0.71 0.58 0.55 2.8 1.01
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## i_usually_do_well 0.45 0.36 0.14 0.05 0
## maths_is_harder_for_me_than_others 0.13 0.20 0.32 0.35 0
## maths_is_not_my_strength 0.19 0.20 0.26 0.34 0
## i_learn_quickly 0.34 0.37 0.22 0.08 0
## maths_makes_me_nervous 0.12 0.25 0.32 0.31 0
For the last factor, maths itself, the alpha is a bit lower (0.88), but it still indicates good reliability (> 0.8).
In the last part, I build 2 multiple linear regression models to predict the students’ maths achievement (math_achievement, or BSMMAT01). The first model uses the previously derived factors as the only independent variables, while the second one additionally controls for (1) gender (BSBG01), (2) parental education, operationalized here as the father’s highest level of education (BSBG07B), and (3) whether the student was born in the country (BSBG10A).
# 1st model
scores <- (varimax_ml$scores)
canada_factors1 <- cbind(canada_reggression,scores)
canada_factors1 <- canada_factors1 %>%
rename(
learning_maths = ML1,
teachers_role = ML2,
maths_itself = ML3
)
model1 <- lm(math_achievement ~ teachers_role + learning_maths + maths_itself, canada_factors1)
# 2nd model
canada = read_sav(path)
canada <- canada %>% select(
"BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I", "BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F", "BSBM18G", "BSBM18H", "BSBM18I", "BSBM18J", "BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E", "BSMMAT01", "BSBG01", "BSBG07B", "BSBG10A"
)
canada <- rename(canada,
##first block of variables
enjoy_learning_math = BSBM17A,
wish_not_to_study = BSBM17B,
math_is_boring = BSBM17C,
learning_interesting_things = BSBM17D,
like_math = BSBM17E,
like_schoolwork_with_numbers = BSBM17F,
like_solving_maths = BSBM17G,
look_forward_for_classes = BSBM17H,
one_of_favorite_classes = BSBM17I,
##second block of variables
know_teachers_expectations = BSBM18A,
teacher_is_understandable = BSBM18B,
interested_in_teachers_words = BSBM18C,
teacher_gives_interesting_things = BSBM18D,
teacher_has_clear_answers = BSBM18E,
teacher_is_good_at_explaning = BSBM18F,
teacher_ask_to_show_knowledge = BSBM18G,
teacher_does_a_lot_to_learn = BSBM18H,
teacher_tells_how_to_do_better = BSBM18I,
teacher_listens_to_me = BSBM18J,
##third block of variables
i_usually_do_well = BSBM19A,
maths_is_harder_for_me_than_others = BSBM19B,
maths_is_not_my_strength = BSBM19C,
i_learn_quickly = BSBM19D,
maths_makes_me_nervous = BSBM19E,
##dependent variable
math_achievement = BSMMAT01,
## control variables
gender = BSBG01,
fathers_education = BSBG07B,
place_of_birth = BSBG10A)
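# Drop rows with missing values, then attach the factor scores.
# left_join() is called without `by`, so it matches on every column the two tables share.
canada <- na.omit(canada) %>%
as.data.frame() %>%
left_join(canada_factors1)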
canada$math_achievement = as.numeric(canada$math_achievement)
canada$gender = as.character(canada$gender)
canada$fathers_education = as.factor(canada$fathers_education)
canada$place_of_birth = as.character(canada$place_of_birth)
model2 <- lm(math_achievement ~ teachers_role + learning_maths + maths_itself + gender + fathers_education + place_of_birth, canada)
# table creation
tab_model(model1, model2, dv.labels = c("Factors only", "Factors with control for ..."), CSS = list(css.depvarhead = '+color: blue;'))
| | Factors only | | | Factors with control for … | | |
|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 537.14 | 535.87 – 538.41 | <0.001 | 519.61 | 504.90 – 534.33 | <0.001 |
| teachers_role | -0.39 | -1.62 – 0.85 | 0.542 | -0.54 | -1.75 – 0.66 | 0.378 |
| learning_maths | -10.45 | -11.76 – -9.13 | <0.001 | -9.10 | -10.39 – -7.80 | <0.001 |
| maths_itself | -36.60 | -37.92 – -35.28 | <0.001 | -34.13 | -35.44 – -32.81 | <0.001 |
| gender [2] | | | | -2.43 | -4.92 – 0.06 | 0.056 |
| fathers_education [2] | | | | 2.04 | -15.35 – 19.42 | 0.818 |
| fathers_education [3] | | | | 9.78 | -5.31 – 24.86 | 0.204 |
| fathers_education [4] | | | | 18.76 | 3.65 – 33.87 | 0.015 |
| fathers_education [5] | | | | 20.23 | 4.89 – 35.57 | 0.010 |
| fathers_education [6] | | | | 40.48 | 25.33 – 55.62 | <0.001 |
| fathers_education [7] | | | | 43.13 | 28.06 – 58.21 | <0.001 |
| fathers_education [8] | | | | 10.03 | -4.77 – 24.82 | 0.184 |
| place_of_birth [2] | | | | -5.06 | -8.73 – -1.40 | 0.007 |
| Observations | 7662 | | | 7565 | | |
| R2 / R2 adjusted | 0.313 / 0.312 | | | 0.349 / 0.348 | | |
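The two models in the table are fitted on different numbers of observations (7662 vs. 7565), so their R^2 values are not computed on exactly the same sample. One possible way to make the comparison stricter is to refit both models on the rows that are complete for the larger model and then compare them with anova(). The sketch below shows the idea on simulated data; all variable names in it are illustrative assumptions, not the TIMSS variables.

```r
# Simulated data (NOT TIMSS): an outcome, one factor score and a control with missing values.
set.seed(2)
n <- 300
d <- data.frame(
  score = rnorm(n),
  control = factor(sample(c("1", "2"), n, replace = TRUE))
)
d$achievement <- 500 - 30 * d$score + 5 * (d$control == "2") + rnorm(n, sd = 60)
d$control[sample(n, 20)] <- NA  # some respondents skip the control question

# Keep only rows that are complete for the larger model, then fit both models on them.
common <- d[complete.cases(d), ]
m_small <- lm(achievement ~ score, data = common)
m_large <- lm(achievement ~ score + control, data = common)

# Both models now use the same observations, so adjusted R^2 and the F test are comparable.
print(c(nobs(m_small), nobs(m_large)))
print(c(summary(m_small)$adj.r.squared, summary(m_large)$adj.r.squared))
print(anova(m_small, m_large))
```

With nested models on a common sample, the F test from anova() indicates whether the added controls improve the fit beyond what the smaller model already explains.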
To start, the adjusted R^2 rises only slightly when controlling for the 3 mentioned variables: it increases by 0.036 (from 0.312 to 0.348). The factors learning maths (ML1) and maths itself (ML3) are statistically significant in both models, while the teacher’s role (ML2) is not (p = 0.542 and p = 0.378). Next, I examine the second model in more detail.
The residuals are approximately normally distributed:
hist(resid(model2),
xlab = "Residuals",
ylab = "",
main = "Second model: residuals' distribution",
col = "#60AB9A",
border = "black",
breaks = 20)
A check of the variance inflation factors (VIF) shows no multicollinearity (all values are < 5):
vif(model2)
## GVIF Df GVIF^(1/(2*Df))
## teachers_role 1.008483 1 1.004232
## learning_maths 1.050537 1 1.024957
## maths_itself 1.057675 1 1.028433
## gender 1.012260 1 1.006111
## fathers_education 1.070114 7 1.004852
## place_of_birth 1.036728 1 1.018198
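For a numeric predictor, the VIF can be reproduced by hand as 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. The sketch below uses simulated predictors rather than the model above and covers only predictors with one degree of freedom; for factors with several levels, vif() reports the generalized version shown in the GVIF columns.

```r
# Simulated predictors (NOT the TIMSS variables); x3 is built to correlate with x1.
set.seed(3)
n <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- 0.6 * x1 + rnorm(n, sd = 0.8)
X <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 taken from the regression of
# predictor j on all the remaining predictors.
vif_manual <- sapply(names(X), function(v) {
  f <- reformulate(setdiff(names(X), v), response = v)
  r2 <- summary(lm(f, data = X))$r.squared
  1 / (1 - r2)
})
print(round(vif_manual, 3))
```

Values close to 1, as in the output above, mean that a predictor is almost uncorrelated with the others.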
Diagnostic plots of the residuals:
Residuals vs. Fitted: no non-linear patterns are detected (the red line is close to horizontal). Normal Q-Q: the residuals are approximately normally distributed (the points follow the line). Scale-Location: no serious problems with heteroscedasticity (the red line is not strictly horizontal, but the deviation is small). Residuals vs. Leverage: there are no influential outliers (the red line is close to horizontal).
par(mfrow = c(2,2))
plot(model2)
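The visual check for influential observations can be complemented with numeric summaries. The sketch below is a minimal example on a simulated regression, not on model2; the 4 / n cut-off for Cook's distance and the |3| cut-off for standardized residuals are common rules of thumb assumed here, not thresholds taken from this analysis.

```r
# Simulated regression (NOT model2): used only to show the calls.
set.seed(4)
n <- 200
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)
fit <- lm(y ~ x)

# Cook's distance for every observation; 4 / n is one common rule-of-thumb cut-off.
cooks <- cooks.distance(fit)
flagged <- which(cooks > 4 / n)
print(length(flagged))

# Standardized residuals beyond |3| are another conventional flag for outliers.
std_res <- rstandard(fit)
print(sum(abs(std_res) > 3))
```

Flagged rows are candidates for inspection rather than automatic removal.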
The most important result of this paper is that the variables from 3 blocks of questions (BSBM17-BSBM19) used in TIMSS 2015 may serve as the basis for factor construction. The factors I obtained do consist of the variables from the blocks of questions dedicated to (1) learning maths, (2) the teacher’s role, and (3) attitudes towards maths as such. As the regression analysis has shown, the second factor is not a statistically significant predictor of the students’ maths achievement. The factors alone explain approximately 1/3 of the variation in maths achievement (R^2 = 0.313), and controlling for gender, parental educational level, and place of birth adds only a little to this (R^2 = 0.349).