As in our previous project, both the country selection and the writing of the core code were done collectively. Because of that, the general structure of the report and of the factor analysis is the same in the work of all teammates. However, the individually submitted files with detailed descriptions and analysis were written separately.
Team members:
For this project we chose Singapore as our country of interest. Since TIMSS focuses on trends in mathematics and science education across the world, we thought it would be interesting to look at the data from Singapore, which is believed to have one of the most successful education systems in the world and has consistently ranked at the top of both the OECD’s Programme for International Student Assessment (PISA) (in students’ math, reading and science performance) and TIMSS itself. Since students in Singapore demonstrate such excellent math skills, it would be interesting to investigate the variables that might be connected with their math achievement.
It is well known that in a very short period of time, after becoming a sovereign city-state, Singapore has reached an extremely high level of development. As a relatively small country that lacks natural resources, Singapore had to invest in human capital in order to move forward. Education of such a good quality can be seen as a result of such emphasis.
One of the main reasons behind the high quality of education received by students in Singapore is believed to be the training and, later on, the professional development of teachers. Education is highly centralized in Singapore and is regulated by the National Institute of Education (NIE) and the Ministry of Education. The former, in collaboration with the latter, is responsible for identifying the number of openings for elementary and secondary school teaching positions and carrying out the selection process. Competition for those positions is intense for the following reasons: teaching is culturally seen as an honored and desired profession (it was even ranked higher than the law and medical professions in a survey conducted in 2004), and teachers are also generously supported by the government with opportunities for part-time work, career development in different spheres, high wages etc. Even for those candidates who have passed the preliminary stages of selection, it is clear that the government (here, the Ministry of Education) is ready to fully support them only as long as they demonstrate that those investments pay off and keep developing as professionals. For instance, during the induction period, which carries an 80% workload (compared to full-time teachers), new teachers are constantly observed and coached by higher-level professionals and are allowed to try working in a different school if their performance in their current one is not good enough; if their performance still does not improve, however, they have to leave the profession. The same applies to the candidate training that comes before that: candidates are allowed to transfer to a different school, but if they fail to meet expectations they have to spend another semester in the program and repay the tuition and stipends they have received from the government.
Very few people actually fail, but the way this system works shows that the Ministry of Education is willing to help those who want to pursue a career in teaching but also wants these investments to pay off: the field is very competitive at the beginning and continues to be very demanding even once you get in. In the end, the candidates who actually become teachers are highly qualified. Teachers are also entitled to 100 hours of training in order to advance their professional development and are constantly encouraged to become more proficient in their core competency while also obtaining new skills and knowledge to provide students with the most relevant, up-to-date information.
Additionally, students actively participate in the professional development of teachers by taking part in modes of assessment such as journal writing, classroom observation and conferencing. Through these, students give teachers the feedback they need to tailor the study program to the needs and skills of their students. This benefits both students, who in the end are taught in a way that is supposed to suit them better, and teachers, who learn about the efficiency of particular classroom practices and about students’ preferences. Such guidelines were developed by the Ministry of Education in Singapore (“Mathematics Assessment Guides”). Consistent feedback and collaboration are yet another factor behind the success of Singapore’s education system.
The importance of the students’ perspective was also highlighted in Kaur’s (2009) study, whose main research question concerned the characteristics of a good mathematics teacher as identified by both teachers themselves and their students. Three main segments were identified: Whole-class demonstration, Seatwork, and Review & feedback. The items included in each of those are listed below:
Whole-class demonstration:
The teacher…
Seatwork:
The teacher…
Review and feedback:
The teacher…
Since one of the subsets of TIMSS items concerns students’ evaluation of their teachers, I personally think it would be interesting to see whether any of these characteristics resemble the TIMSS variables and whether students in Singapore rated their teachers highly on them, which would indicate the teachers’ proficiency and the students’ satisfaction with it.
# Attaching the packages to work with
library(foreign)
library(ggplot2)
library(psych)
library(polycor)
library(sjPlot)
library(corrplot)
library(dplyr)
library(ggpubr)
library(car)
library(e1071)
# Loading the dataset
data1 <- read.spss("BSGSGPM6.sav", to.data.frame = TRUE, use.value.labels = TRUE)
# Choosing the particular variables we are interested in
data2<-data1[c("BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I", "BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F", "BSBM18G", "BSBM18H", "BSBM18I", "BSBM18J", "BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E", "BSMMAT01", "BSBG01", "BSBG07A", "BSBG07B", "BSBG10A")]
data3<-na.omit(data2)
# Subsetting variables for the factor analysis
save1<-c("BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I", "BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F", "BSBM18G", "BSBM18H", "BSBM18I", "BSBM18J", "BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E")
data_fa <- data3[save1]
# Subsetting variables for the linear regression analysis
save2<-c("BSMMAT01", "BSBG01", "BSBG07A", "BSBG07B", "BSBG10A")
data_reg <- data3[save2]
The initial dataset contained 6116 observations of 436 variables, but only 29 variables were chosen for this particular project. The number of observations decreased to 5875 after the rows with NAs were deleted from the dataset. The variables selected for the factor analysis were divided into three groups in the TIMSS questionnaire. Though these groups were not named, their general themes seem to be: one’s attitude towards learning mathematics (variables with 17 in their names), one’s assessment of class activities and the instructor’s teaching abilities (variables with 18 in their names), and one’s self-assessment of their math skills (variables with 19 in their names). These groups of variables will be discussed later in the factor analysis.
For all of the variables a scale from 1 (“Agree a lot”) to 4 (“Disagree a lot”) was used which makes them ordinal.
str(data_fa)
## 'data.frame': 5875 obs. of 24 variables:
## $ BSBM17A: Factor w/ 4 levels "Agree a lot",..: 2 3 2 2 1 2 3 2 1 3 ...
## $ BSBM17B: Factor w/ 4 levels "Agree a lot",..: 3 2 2 3 3 3 2 2 4 1 ...
## $ BSBM17C: Factor w/ 4 levels "Agree a lot",..: 3 2 2 3 4 3 3 2 4 2 ...
## $ BSBM17D: Factor w/ 4 levels "Agree a lot",..: 2 1 2 2 2 3 2 2 1 3 ...
## $ BSBM17E: Factor w/ 4 levels "Agree a lot",..: 2 3 2 2 1 3 2 3 1 4 ...
## $ BSBM17F: Factor w/ 4 levels "Agree a lot",..: 2 4 3 2 1 3 3 3 2 4 ...
## $ BSBM17G: Factor w/ 4 levels "Agree a lot",..: 3 2 3 2 1 3 3 3 1 4 ...
## $ BSBM17H: Factor w/ 4 levels "Agree a lot",..: 2 4 4 2 1 3 2 3 4 4 ...
## $ BSBM17I: Factor w/ 4 levels "Agree a lot",..: 3 3 4 2 1 3 3 3 1 4 ...
## $ BSBM18A: Factor w/ 4 levels "Agree a lot",..: 1 4 2 2 1 3 1 2 1 2 ...
## $ BSBM18B: Factor w/ 4 levels "Agree a lot",..: 1 4 3 1 1 4 1 3 4 3 ...
## $ BSBM18C: Factor w/ 4 levels "Agree a lot",..: 2 4 3 2 1 4 2 2 4 2 ...
## $ BSBM18D: Factor w/ 4 levels "Agree a lot",..: 2 4 3 2 1 4 1 2 4 1 ...
## $ BSBM18E: Factor w/ 4 levels "Agree a lot",..: 1 4 3 1 1 3 1 3 4 1 ...
## $ BSBM18F: Factor w/ 4 levels "Agree a lot",..: 1 4 1 1 1 4 2 3 4 3 ...
## $ BSBM18G: Factor w/ 4 levels "Agree a lot",..: 2 4 3 2 1 3 1 3 4 1 ...
## $ BSBM18H: Factor w/ 4 levels "Agree a lot",..: 2 4 1 2 1 4 2 3 4 1 ...
## $ BSBM18I: Factor w/ 4 levels "Agree a lot",..: 2 4 2 1 1 3 1 2 4 2 ...
## $ BSBM18J: Factor w/ 4 levels "Agree a lot",..: 2 4 2 2 1 4 1 3 4 2 ...
## $ BSBM19A: Factor w/ 4 levels "Agree a lot",..: 3 2 3 3 2 2 3 3 3 4 ...
## $ BSBM19B: Factor w/ 4 levels "Agree a lot",..: 2 2 2 2 4 3 2 2 4 3 ...
## $ BSBM19C: Factor w/ 4 levels "Agree a lot",..: 1 2 1 2 4 3 3 1 3 1 ...
## $ BSBM19D: Factor w/ 4 levels "Agree a lot",..: 3 1 1 2 2 3 4 3 1 4 ...
## $ BSBM19E: Factor w/ 4 levels "Agree a lot",..: 1 2 1 3 3 2 1 1 4 2 ...
summary(data_fa)
## BSBM17A BSBM17B BSBM17C
## Agree a lot :2232 Agree a lot : 964 Agree a lot : 610
## Agree a little :2433 Agree a little :1325 Agree a little :1672
## Disagree a little: 766 Disagree a little:1751 Disagree a little:2074
## Disagree a lot : 444 Disagree a lot :1835 Disagree a lot :1519
## BSBM17D BSBM17E BSBM17F
## Agree a lot :1947 Agree a lot :2021 Agree a lot : 947
## Agree a little :2718 Agree a little :2296 Agree a little :2081
## Disagree a little: 938 Disagree a little: 980 Disagree a little:2088
## Disagree a lot : 272 Disagree a lot : 578 Disagree a lot : 759
## BSBM17G BSBM17H BSBM17I
## Agree a lot :1487 Agree a lot :1178 Agree a lot :1924
## Agree a little :2185 Agree a little :2169 Agree a little :1566
## Disagree a little:1457 Disagree a little:1747 Disagree a little:1297
## Disagree a lot : 746 Disagree a lot : 781 Disagree a lot :1088
## BSBM18A BSBM18B BSBM18C
## Agree a lot :2413 Agree a lot :2265 Agree a lot :1551
## Agree a little :3018 Agree a little :2592 Agree a little :2819
## Disagree a little: 365 Disagree a little: 792 Disagree a little:1190
## Disagree a lot : 79 Disagree a lot : 226 Disagree a lot : 315
## BSBM18D BSBM18E BSBM18F
## Agree a lot :1186 Agree a lot :2332 Agree a lot :2688
## Agree a little :2600 Agree a little :2608 Agree a little :2348
## Disagree a little:1647 Disagree a little: 739 Disagree a little: 652
## Disagree a lot : 442 Disagree a lot : 196 Disagree a lot : 187
## BSBM18G BSBM18H BSBM18I
## Agree a lot :1702 Agree a lot :1929 Agree a lot :2376
## Agree a little :2957 Agree a little :2774 Agree a little :2725
## Disagree a little: 987 Disagree a little: 917 Disagree a little: 593
## Disagree a lot : 229 Disagree a lot : 255 Disagree a lot : 181
## BSBM18J BSBM19A BSBM19B
## Agree a lot :2137 Agree a lot :1452 Agree a lot : 749
## Agree a little :2824 Agree a little :2311 Agree a little :1659
## Disagree a little: 693 Disagree a little:1305 Disagree a little:2380
## Disagree a lot : 221 Disagree a lot : 807 Disagree a lot :1087
## BSBM19C BSBM19D BSBM19E
## Agree a lot :1366 Agree a lot :1222 Agree a lot :1115
## Agree a little :1486 Agree a little :2403 Agree a little :1968
## Disagree a little:1664 Disagree a little:1664 Disagree a little:1823
## Disagree a lot :1359 Disagree a lot : 586 Disagree a lot : 969
describe(data_fa)
## vars n mean sd median trimmed mad min max range skew kurtosis
## BSBM17A* 1 5875 1.90 0.90 2 1.78 1.48 1 4 3 0.82 -0.04
## BSBM17B* 2 5875 2.76 1.07 3 2.82 1.48 1 4 3 -0.32 -1.15
## BSBM17C* 3 5875 2.77 0.95 3 2.83 1.48 1 4 3 -0.24 -0.91
## BSBM17D* 4 5875 1.92 0.82 2 1.84 1.48 1 4 3 0.65 -0.06
## BSBM17E* 5 5875 2.02 0.95 2 1.90 1.48 1 4 3 0.65 -0.51
## BSBM17F* 6 5875 2.45 0.91 2 2.44 1.48 1 4 3 0.01 -0.81
## BSBM17G* 7 5875 2.25 0.97 2 2.19 1.48 1 4 3 0.31 -0.89
## BSBM17H* 8 5875 2.36 0.95 2 2.33 1.48 1 4 3 0.16 -0.89
## BSBM17I* 9 5875 2.26 1.10 2 2.20 1.48 1 4 3 0.29 -1.26
## BSBM18A* 10 5875 1.68 0.65 2 1.61 0.00 1 4 3 0.73 0.76
## BSBM18B* 11 5875 1.83 0.80 2 1.73 1.48 1 4 3 0.77 0.12
## BSBM18C* 12 5875 2.05 0.82 2 1.99 1.48 1 4 3 0.49 -0.26
## BSBM18D* 13 5875 2.23 0.85 2 2.19 1.48 1 4 3 0.27 -0.56
## BSBM18E* 14 5875 1.80 0.78 2 1.70 1.48 1 4 3 0.79 0.21
## BSBM18F* 15 5875 1.72 0.78 2 1.61 1.48 1 4 3 0.94 0.40
## BSBM18G* 16 5875 1.96 0.78 2 1.90 0.00 1 4 3 0.57 0.00
## BSBM18H* 17 5875 1.91 0.81 2 1.84 1.48 1 4 3 0.65 -0.01
## BSBM18I* 18 5875 1.76 0.76 2 1.66 1.48 1 4 3 0.86 0.55
## BSBM18J* 19 5875 1.83 0.78 2 1.74 1.48 1 4 3 0.79 0.38
## BSBM19A* 20 5875 2.25 0.98 2 2.19 1.48 1 4 3 0.37 -0.85
## BSBM19B* 21 5875 2.65 0.92 3 2.68 1.48 1 4 3 -0.22 -0.78
## BSBM19C* 22 5875 2.51 1.09 3 2.52 1.48 1 4 3 -0.04 -1.28
## BSBM19D* 23 5875 2.27 0.90 2 2.22 1.48 1 4 3 0.25 -0.72
## BSBM19E* 24 5875 2.45 0.98 2 2.44 1.48 1 4 3 0.06 -1.00
## se
## BSBM17A* 0.01
## BSBM17B* 0.01
## BSBM17C* 0.01
## BSBM17D* 0.01
## BSBM17E* 0.01
## BSBM17F* 0.01
## BSBM17G* 0.01
## BSBM17H* 0.01
## BSBM17I* 0.01
## BSBM18A* 0.01
## BSBM18B* 0.01
## BSBM18C* 0.01
## BSBM18D* 0.01
## BSBM18E* 0.01
## BSBM18F* 0.01
## BSBM18G* 0.01
## BSBM18H* 0.01
## BSBM18I* 0.01
## BSBM18J* 0.01
## BSBM19A* 0.01
## BSBM19B* 0.01
## BSBM19C* 0.01
## BSBM19D* 0.01
## BSBM19E* 0.01
As can be seen, all variables are coded as factors and don’t have any NA values in them (since we deleted them beforehand). As we learned in the previous project, it is not recommended to treat a scale with such a small number of points as a numeric variable. Simple descriptive statistics are still provided in the describe() output. Since the mode is not present there and it is quite hard to wrap one’s head around this information, we can visualize the distribution of responses for each level within each of the variables by using stacked barplots.
Before we continue with the analysis, there are some changes I’d like to make to the scales that were used for the questions. First of all, I’d like to reverse the scales of the negatively framed questions in order to make the interpretation easier. To illustrate with an example, let’s suppose we have a positively framed question, “I like math”, and a negatively framed one, “I don’t like math”. For the former, a positive attitude towards math would be indicated by the “Agree a lot / Agree a little” answers, but for the latter, a positive attitude would be indicated by the “Disagree a lot / Disagree a little” options (since a person disagrees with a negative proposition about math). We can reverse the scale in order to make the “Agree a lot / Agree a little” options indicators of a positive math attitude in both cases.
data_fa$BSBM17C <- ifelse(data_fa$BSBM17C=="Agree a lot", "Disagree a lot",
ifelse(data_fa$BSBM17C=="Agree a little", "Disagree a little",
ifelse(data_fa$BSBM17C=="Disagree a little", "Agree a little", ifelse(data_fa$BSBM17C=="Disagree a lot", "Agree a lot", NA))))
data_fa$BSBM17B <- ifelse(data_fa$BSBM17B=="Agree a lot", "Disagree a lot",
ifelse(data_fa$BSBM17B=="Agree a little", "Disagree a little",
ifelse(data_fa$BSBM17B=="Disagree a little", "Agree a little",
ifelse(data_fa$BSBM17B=="Disagree a lot", "Agree a lot", NA))))
data_fa$BSBM19B <- ifelse(data_fa$BSBM19B=="Agree a lot", "Disagree a lot",
ifelse(data_fa$BSBM19B=="Agree a little", "Disagree a little",
ifelse(data_fa$BSBM19B=="Disagree a little", "Agree a little", ifelse(data_fa$BSBM19B=="Disagree a lot", "Agree a lot", NA))))
data_fa$BSBM19E <- ifelse(data_fa$BSBM19E=="Agree a lot", "Disagree a lot",
ifelse(data_fa$BSBM19E=="Agree a little", "Disagree a little",
ifelse(data_fa$BSBM19E=="Disagree a little", "Agree a little",
ifelse(data_fa$BSBM19E=="Disagree a lot", "Agree a lot", NA))))
data_fa$BSBM19C <- ifelse(data_fa$BSBM19C=="Agree a lot", "Disagree a lot",
ifelse(data_fa$BSBM19C=="Agree a little", "Disagree a little",
ifelse(data_fa$BSBM19C=="Disagree a little", "Agree a little",
ifelse(data_fa$BSBM19C=="Disagree a lot", "Agree a lot", NA))))
data_fa <-as.data.frame(lapply(data_fa, as.factor))
data_fa$BSBM17C <- relevel(data_fa$BSBM17C, "Agree a lot")
data_fa$BSBM17B <- relevel(data_fa$BSBM17B, "Agree a lot")
data_fa$BSBM19B <- relevel(data_fa$BSBM19B, "Agree a lot")
data_fa$BSBM19E <- relevel(data_fa$BSBM19E, "Agree a lot")
data_fa$BSBM19C <- relevel(data_fa$BSBM19C, "Agree a lot")
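The nested ifelse() calls above could also be expressed with a small helper. The following is only a sketch in base R, not the code used for the report: the names `agree_levels` and `reverse_item` and the toy vector are assumptions introduced here for illustration.

```r
# Hypothetical helper (not part of the original script): swap each response
# with its mirror image on the 4-point agreement scale.
agree_levels <- c("Agree a lot", "Agree a little",
                  "Disagree a little", "Disagree a lot")

reverse_item <- function(x, levels = agree_levels) {
  # Look up each answer's position on the scale and take the label
  # found at the same position of the reversed scale
  mirrored <- rev(levels)[match(as.character(x), levels)]
  # Keep the original level order so "Agree a lot" stays the first level
  factor(mirrored, levels = levels)
}

# Toy responses, invented for illustration only
toy <- factor(c("Agree a lot", "Disagree a little", "Disagree a lot", NA),
              levels = agree_levels)
reversed <- reverse_item(toy)
print(reversed)
```

Under these assumptions, the five negatively framed items could be handled in one call such as `data_fa[neg] <- lapply(data_fa[neg], reverse_item)`, where `neg` would be a character vector of their names; missing values stay missing and no separate relevel() step is needed.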
Secondly, the scale initially used in the questionnaire had 4 levels: from 1 (“Agree a lot”) to 4 (“Disagree a lot”). This order seems somewhat counterintuitive, since we are used to scales where one’s satisfaction or level of agreement increases with the chosen numeric point. If a “counterintuitive” scale like that were used, problems with interpretation would arise during the regression analysis. For instance, if a factor were used as a numeric variable (which would be more defensible if the number of levels were higher, e.g. 10), the “one-point increase” usually used to interpret regression coefficients would actually mean a decrease in the concept measured by that factor, which can easily cause misinterpretations. The same would happen if factor scores were used in the regression. I’d like to simply relevel the factors so that the scale runs from 1 (“Disagree a lot”) to 4 (“Agree a lot”) and such problems are avoided.
# Create a function that reverses the order of a factor's levels
fn <- function(x){
  factor(x, levels=rev(levels(x)))
}
# Apply to the whole dataset and turn the resulting list back into a data frame
data_fa <-
  data_fa %>%
  lapply(fn) %>%
  as.data.frame()
# Double-checking
str(data_fa)
## 'data.frame': 5875 obs. of 24 variables:
## $ BSBM17A: Factor w/ 4 levels "Disagree a lot",..: 3 2 3 3 4 3 2 3 4 2 ...
## $ BSBM17B: Factor w/ 4 levels "Disagree a lot",..: 3 2 2 3 3 3 2 2 4 1 ...
## $ BSBM17C: Factor w/ 4 levels "Disagree a lot",..: 3 2 2 3 4 3 3 2 4 2 ...
## $ BSBM17D: Factor w/ 4 levels "Disagree a lot",..: 3 4 3 3 3 2 3 3 4 2 ...
## $ BSBM17E: Factor w/ 4 levels "Disagree a lot",..: 3 2 3 3 4 2 3 2 4 1 ...
## $ BSBM17F: Factor w/ 4 levels "Disagree a lot",..: 3 1 2 3 4 2 2 2 3 1 ...
## $ BSBM17G: Factor w/ 4 levels "Disagree a lot",..: 2 3 2 3 4 2 2 2 4 1 ...
## $ BSBM17H: Factor w/ 4 levels "Disagree a lot",..: 3 1 1 3 4 2 3 2 1 1 ...
## $ BSBM17I: Factor w/ 4 levels "Disagree a lot",..: 2 2 1 3 4 2 2 2 4 1 ...
## $ BSBM18A: Factor w/ 4 levels "Disagree a lot",..: 4 1 3 3 4 2 4 3 4 3 ...
## $ BSBM18B: Factor w/ 4 levels "Disagree a lot",..: 4 1 2 4 4 1 4 2 1 2 ...
## $ BSBM18C: Factor w/ 4 levels "Disagree a lot",..: 3 1 2 3 4 1 3 3 1 3 ...
## $ BSBM18D: Factor w/ 4 levels "Disagree a lot",..: 3 1 2 3 4 1 4 3 1 4 ...
## $ BSBM18E: Factor w/ 4 levels "Disagree a lot",..: 4 1 2 4 4 2 4 2 1 4 ...
## $ BSBM18F: Factor w/ 4 levels "Disagree a lot",..: 4 1 4 4 4 1 3 2 1 2 ...
## $ BSBM18G: Factor w/ 4 levels "Disagree a lot",..: 3 1 2 3 4 2 4 2 1 4 ...
## $ BSBM18H: Factor w/ 4 levels "Disagree a lot",..: 3 1 4 3 4 1 3 2 1 4 ...
## $ BSBM18I: Factor w/ 4 levels "Disagree a lot",..: 3 1 3 4 4 2 4 3 1 3 ...
## $ BSBM18J: Factor w/ 4 levels "Disagree a lot",..: 3 1 3 3 4 1 4 2 1 3 ...
## $ BSBM19A: Factor w/ 4 levels "Disagree a lot",..: 2 3 2 2 3 3 2 2 2 1 ...
## $ BSBM19B: Factor w/ 4 levels "Disagree a lot",..: 2 2 2 2 4 3 2 2 4 3 ...
## $ BSBM19C: Factor w/ 4 levels "Disagree a lot",..: 1 2 1 2 4 3 3 1 3 1 ...
## $ BSBM19D: Factor w/ 4 levels "Disagree a lot",..: 2 4 4 3 3 2 1 2 4 1 ...
## $ BSBM19E: Factor w/ 4 levels "Disagree a lot",..: 1 2 1 3 3 2 1 1 4 2 ...
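To make the effect of this releveling concrete, here is a minimal sketch on invented toy responses (the vector `x` is an assumption, not data from the report). Reversing the level order leaves every label untouched but flips the underlying integer code from k to 5 - k, which is what later matters when the factors are converted with as.numeric().

```r
agree_levels <- c("Agree a lot", "Agree a little",
                  "Disagree a little", "Disagree a lot")

# Toy responses, invented for illustration only
x <- factor(c("Agree a lot", "Agree a little", "Disagree a lot"),
            levels = agree_levels)

# Same reordering as fn() in the report: labels stay, level order flips
x_rev <- factor(x, levels = rev(levels(x)))

codes_before <- as.integer(x)      # 1 2 4
codes_after  <- as.integer(x_rev)  # 4 3 1
print(data.frame(answer = as.character(x), codes_before, codes_after))
```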
Now we can take a look at how our respondents actually answered those questions.
p1 <- data_fa %>% select(contains('17')) %>%
plot_stackfrq(geom.colors = "Paired")
p2 <- data_fa %>% select(contains('18')) %>%
plot_stackfrq(geom.colors = "Paired")
p3 <- data_fa %>% select(contains('19')) %>%
plot_stackfrq(geom.colors = "Paired")
ggarrange(p1, p2, p3 + rremove("x.text"),
labels = c("17", "18", "19"),
ncol = 1, nrow = 3)
When it comes to the variables from the first group (#17, one’s interest in math), “Disagree a lot” is the factor level with the fewest responses for every variable in the group. This indicates that few students in Singapore have a strongly negative attitude towards mathematics.
“Agree a lot” was the most common answer for the variable BSBM17I (“Mathematics is one of my favourite subjects”): the biggest proportion of respondents in Singapore (32.8%) strongly agreed that math was one of their favourite subjects, which is quite interesting, and the “Agree a little” category followed with 26.7%. The latter category was also the most popular for the variable BSBM17C (“Mathematics is boring”, reversed): 35.3% of students slightly disagreed with the statement that math is boring. Compared to the 32.8% of students who strongly agreed that math is one of their favourite subjects, even higher shares of students (34.4% and 38%) chose the “Agree a lot” option for the general questions about liking and enjoying math. This suggests that even some students who don’t consider math to be their favourite subject still enjoy learning it.
The “Agree a little” option was the most popular for the variables BSBM17H, BSBM17G, BSBM17E, BSBM17D, BSBM17C and BSBM17A, while “Disagree a little” was the most popular only for the BSBM17F variable (“I like any schoolwork that involves numbers”), and only by 0.1 percentage points. It seems that for this question fewer students were ready to give a confident answer (“Agree a lot” / “Disagree a lot”), and the rest split almost equally between the two milder options. The share of students who strongly agreed with this statement was still bigger than the share who strongly disagreed (16.1% > 12.9%). In general, as can be seen, students in Singapore tend to have a positive attitude towards learning math.
In the second group (#18, one’s assessment of class activities and the instructor’s teaching abilities) the pattern was as follows. “Disagree a lot” was the least common choice for all variables, rising to at most 7.5% for the BSBM18D variable (“My teacher gives me interesting things to do”). For that variable, “Agree a lot” was actually the second least popular option, accounting for only 20.2% of the responses, the lowest proportion among all variables in the group. “Agree a lot” was the most common option only for the BSBM18F variable (“My teacher is good at explaining mathematics”), while “Agree a little” was the most common option for all the remaining variables. In short, students in Singapore are generally satisfied with their class practices and their teachers’ abilities, especially when it comes to the teacher’s ability to explain mathematics: 45.8% of students chose “Agree a lot” and 40% chose “Agree a little” for this question. The lowest percentage of students chose “Agree a lot” for the question about their teacher giving them interesting things to do. Despite the recommendations of the Ministry of Education, it seems that activities students would consider interesting are still not implemented widely enough.
Finally, in the third group “Disagree a lot” was still the least popular option for all the variables except BSBM19C (“Mathematics is not one of my strengths”, reversed), where it was only 0.1 percentage points bigger than the “Agree a lot” option. As can be seen, for the BSBM19C variable the shares of the chosen responses are almost equal: 28.3%, 23.1%, 25.3% and 23.2%. It can be hypothesized that although many students in Singapore like math and even choose it as their favourite subject, this does not mean that the subject is actually one of their strengths. “Disagree a little” was the most popular only for the BSBM19E variable (“Mathematics makes me nervous”, reversed): that means that initially 33.5% of students chose “Agree a little” as an answer to that question. For the remaining variables “Agree a little” was the most popular option. In conclusion, even though students indicated that they like math, choose it as one of their favourite subjects, learn things quickly in mathematics and do well in it, it still makes them nervous, and some of the students also do not consider it to be one of their strengths.
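The percentages quoted in this discussion can be cross-checked against the counts printed by summary(data_fa) earlier in the report. Below is a small sketch for two of the items; the counts are copied from that output, and the object names are assumptions introduced here. Because the level reversal only reorders levels, the counts per label are unchanged for these two positively framed items.

```r
# Counts copied from the summary() output for BSBM18F
counts_18f <- c("Agree a lot" = 2688, "Agree a little" = 2348,
                "Disagree a little" = 652, "Disagree a lot" = 187)
# Counts copied from the summary() output for BSBM17F
counts_17f <- c("Agree a lot" = 947, "Agree a little" = 2081,
                "Disagree a little" = 2088, "Disagree a lot" = 759)

# Convert counts to percentage shares, rounded to one decimal place
shares_18f <- round(100 * prop.table(counts_18f), 1)
shares_17f <- round(100 * prop.table(counts_17f), 1)

print(shares_18f)
print(shares_17f)
```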
As was already mentioned, all of the variables in the subset are factors and cannot simply be transformed to the numeric type. We first use the hetcor() function, which calculates correlations between our variables while taking their type into account, and then present the result in the form of a corrplot.
check <- hetcor(data_fa)
corrplot(check$correlations)
We observe stronger correlations in the corners of the corrplot and in its center. This pattern is, I personally think, a good sign, since variables from the same group (#17, #18, #19) are correlated with each other (top left corner for #17, bottom right corner for #19, the center for #18), which suggests that there is a latent variable behind each group. We also observe that some variables from group #19 and group #17 are moderately correlated (bottom left corner, top right corner), which might seem a bit unusual but can be explained by a possible connection between one’s self-assessment of skills in math and one’s attitude towards the subject itself. For instance, we observe strong correlations between the variables of group #17 and variable BSBM19C - “I usually do well in mathematics”: the higher one’s assessment of their abilities in math, the better their attitude towards it.
Since we confirmed that the factor analysis can be carried out, we now convert all the variables into numeric type. We have to do that because we want to use the factor scores in the linear regression, and if we passed only the correlation matrix we would not be able to get back to the scores for each observation. To still account for the fact that our variables are not truly numeric, we use the cor=“mixed” specification.
datanum <-as.data.frame(lapply(data_fa, as.numeric))
# Building a screeplot
fa.parallel(datanum)
## Parallel analysis suggests that the number of factors = 4 and the number of components = 3
We build a screeplot to identify the suggested number of factors. We have to look at the points (triangles) that lie above the corresponding red line, since observed eigenvalues that are higher than the corresponding randomly generated ones are more likely to belong to actually meaningful factors. As we observe, there are four of those, which is also stated in the output. We use this information and build a model with four factors.
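The idea behind this comparison can be sketched in base R. This is a simplified illustration and not what fa.parallel() does internally: it covers only the principal-components variant, uses normally distributed random data, and runs on invented toy data with a known two-factor structure. All object names and the seed are assumptions.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Toy data (invented): 6 items driven by 2 independent latent factors
n <- 500
f <- matrix(rnorm(n * 2), ncol = 2)
loadings <- rbind(cbind(rep(0.8, 3), 0), cbind(0, rep(0.8, 3)))
items <- f %*% t(loadings) + matrix(rnorm(n * 6, sd = 0.6), ncol = 6)

# Eigenvalues of the observed correlation matrix (sorted in decreasing order)
observed <- eigen(cor(items), only.values = TRUE)$values

# Eigenvalues of random data with the same dimensions, averaged over runs
n_sim <- 50
random <- rowMeans(replicate(n_sim, {
  noise <- matrix(rnorm(n * ncol(items)), ncol = ncol(items))
  eigen(cor(noise), only.values = TRUE)$values
}))

# Keep the leading components whose eigenvalue beats the random benchmark
exceeds <- observed > random
n_suggested <- if (all(exceeds)) length(exceeds) else which(!exceeds)[1] - 1

print(round(rbind(observed, random), 2))
print(n_suggested)
```

In this toy setup only the first two observed eigenvalues lie above their random counterparts, which mirrors how the triangles above the red line are counted in the screeplot.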
f1 <- fa(datanum, 4, cor="mixed")
##
## mixed.cor is deprecated, please use mixedCor.
f1
## Factor Analysis using method = minres
## Call: fa(r = datanum, nfactors = 4, cor = "mixed")
## Standardized loadings (pattern matrix) based upon correlation matrix
## MR1 MR2 MR3 MR4 h2 u2 com
## BSBM17A 0.87 -0.03 0.07 0.05 0.86 0.138 1.0
## BSBM17B 0.65 -0.05 0.21 0.04 0.63 0.367 1.2
## BSBM17C 0.73 -0.02 0.08 0.03 0.62 0.380 1.0
## BSBM17D 0.79 0.19 -0.10 -0.05 0.65 0.354 1.2
## BSBM17E 0.92 -0.03 0.06 0.01 0.90 0.098 1.0
## BSBM17F 0.85 0.05 -0.04 -0.07 0.67 0.329 1.0
## BSBM17G 0.84 0.01 0.09 -0.02 0.80 0.205 1.0
## BSBM17H 0.84 0.05 -0.10 0.12 0.76 0.241 1.1
## BSBM17I 0.79 -0.04 0.22 0.03 0.88 0.116 1.2
## BSBM18A 0.10 0.26 0.08 0.42 0.53 0.475 1.9
## BSBM18B -0.04 -0.04 0.07 0.94 0.83 0.168 1.0
## BSBM18C 0.37 0.07 -0.17 0.62 0.75 0.250 1.8
## BSBM18D 0.32 0.31 -0.14 0.37 0.66 0.341 3.2
## BSBM18E -0.02 0.24 0.04 0.66 0.74 0.259 1.3
## BSBM18F -0.02 0.30 0.05 0.61 0.76 0.239 1.5
## BSBM18G 0.05 0.65 0.03 0.13 0.61 0.387 1.1
## BSBM18H 0.08 0.81 -0.03 -0.04 0.66 0.342 1.0
## BSBM18I -0.03 0.86 0.06 0.02 0.76 0.241 1.0
## BSBM18J -0.03 0.60 0.07 0.23 0.63 0.373 1.3
## BSBM19A 0.27 0.02 0.63 0.06 0.72 0.279 1.4
## BSBM19B -0.08 0.05 0.86 0.01 0.68 0.324 1.0
## BSBM19C 0.15 0.02 0.80 0.01 0.83 0.170 1.1
## BSBM19D 0.35 -0.01 0.48 0.10 0.62 0.383 1.9
## BSBM19E 0.10 -0.01 0.56 0.00 0.40 0.604 1.1
##
## MR1 MR2 MR3 MR4
## SS loadings 7.24 3.31 3.00 3.39
## Proportion Var 0.30 0.14 0.13 0.14
## Cumulative Var 0.30 0.44 0.56 0.71
## Proportion Explained 0.43 0.20 0.18 0.20
## Cumulative Proportion 0.43 0.62 0.80 1.00
##
## With factor correlations of
## MR1 MR2 MR3 MR4
## MR1 1.00 0.42 0.62 0.48
## MR2 0.42 1.00 0.06 0.80
## MR3 0.62 0.06 1.00 0.20
## MR4 0.48 0.80 0.20 1.00
##
## Mean item complexity = 1.3
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 23.57 with Chi Square of 138249.6
## The degrees of freedom for the model are 186 and the objective function was 1.26
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic number of observations is 5875 with the empirical chi square 1291.38 with prob < 1.2e-164
## The total number of observations was 5875 with Likelihood Chi Square = 7366.6 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.923
## RMSEA index = 0.081 and the 90 % confidence intervals are 0.079 0.083
## BIC = 5752.41
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## MR1 MR2 MR3 MR4
## Correlation of (regression) scores with factors 0.99 0.95 0.95 0.96
## Multiple R square of scores with factors 0.97 0.91 0.91 0.93
## Minimum correlation of possible factor scores 0.94 0.82 0.82 0.86
fa.diagram(f1)
Let’s first take a look at the model fit.
In general, it seems like the fit of the model is quite good: the RMSR is 0.02 and the Tucker Lewis Index is 0.923, although the RMSEA of 0.081 lies right at the commonly used 0.08 cut-off for acceptable fit. If we take a look at the factor loadings, we observe that the majority of variables have a distinct main loading and low cross loadings, which is good, and most u2 values are small. The clearest exception is BSBM18D, which loads almost equally on MR4, MR1 and MR2 (0.37, 0.32 and 0.31). If we take a look at the correlations between factors, we observe a strong correlation (0.80) between the second and the fourth factors and note that the variables that load on them all come from the #18 group. This is quite an interesting finding, which we will examine in detail while analysing the particular variables loaded on these factors. Despite the strong correlation, the variables of the second factor still have their main loadings on it, and the difference between their main loadings and cross loadings is still bigger than 0.2.
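The 0.2 gap between main loadings and cross loadings can be checked directly from the pattern matrix. The sketch below copies the loadings of the #18 items from the fa() output above; the object names and the flagging rule (gap between the largest and second-largest absolute loading below 0.2) are assumptions made for illustration.

```r
# Pattern loadings for the #18 items, copied from the fa() output
load18 <- matrix(c(
   0.10,  0.26,  0.08,  0.42,
  -0.04, -0.04,  0.07,  0.94,
   0.37,  0.07, -0.17,  0.62,
   0.32,  0.31, -0.14,  0.37,
  -0.02,  0.24,  0.04,  0.66,
  -0.02,  0.30,  0.05,  0.61,
   0.05,  0.65,  0.03,  0.13,
   0.08,  0.81, -0.03, -0.04,
  -0.03,  0.86,  0.06,  0.02,
  -0.03,  0.60,  0.07,  0.23),
  ncol = 4, byrow = TRUE,
  dimnames = list(paste0("BSBM18", LETTERS[1:10]),
                  c("MR1", "MR2", "MR3", "MR4")))

abs_load <- abs(load18)

# Factor with the largest absolute loading for each item
primary <- colnames(load18)[apply(abs_load, 1, which.max)]

# Gap between the largest and second-largest absolute loading per item
gap <- unname(apply(abs_load, 1, function(r) {
  s <- sort(r, decreasing = TRUE)
  s[1] - s[2]
}))

gap_table <- data.frame(item = rownames(load18), primary = primary,
                        gap = round(gap, 2), flagged = gap < 0.2,
                        row.names = NULL, stringsAsFactors = FALSE)
print(gap_table)
```

With these numbers the four items assigned to MR2 all clear the 0.2 gap, while BSBM18A and BSBM18D, both assigned to MR4, fall below it.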
Now let’s inspect the particular variables loaded on each factor.
MR1<- as.data.frame(datanum[c("BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I")])
MR2<- as.data.frame(datanum[c("BSBM18I", "BSBM18H", "BSBM18G", "BSBM18J")])
MR3<- as.data.frame(datanum[c("BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E")])
MR4<- as.data.frame(datanum[c("BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F")])
As can be seen, all variables from the #17 group loaded on the first factor. As mentioned in the beginning, the variables from this group seem to measure one’s general attitude towards learning math. Variables such as BSBM17E (“I like mathematics”) and BSBM17A (“I enjoy learning mathematics”), with high loadings on this factor, are a clear representation of that.
The situation with the third factor is somewhat similar, since all variables from the #19 group loaded on it. As discussed previously, the variables from this group seem to be connected to one’s self-assessment of their mathematical skills. Interestingly, BSBM19B (“Mathematics is more difficult for me than for many of my classmates”) has the highest loading on this factor compared to the other variables. This particular variable has more of a relative nature, since it implies an assessment of one’s skills in comparison to the skills of other people (which are also perceived subjectively by the respondent).
Finally, variables from the group #18 were loaded on two different factors: MR2 and MR4. Initially we hypothesized that this group unites variables that are connected to the topic of one’s assessment of class practices and teacher’s skills. Varaibles “BSBM18I”, “BSBM18H”, “BSBM18G” and “BSBM18J” were loaded on the second factor, while the remaining “BSBM18A”, “BSBM18B”, “BSBM18C”, “BSBM18D” and “BSBM18E” variables were loaded on the forth factor. One of the possible explanations for such a division might be as follows: while variables that were loaded on the forth factor have more to do with teacher’s skills as a professional in terms of being able to explain the material in a more easy-to-understand (BSBM18B) or in a more interesting way (BSBM18C, BSBM18D) and answer the questions (BSBM18E), variables that were loaded on the second factor are more focused on teacher’s ability to connect with students, give them feedback and help them. For instance, variables loaded on the second factor include “My teacher lets me show what I’ve learned”, “My teacher does a variety of things to help us learn”, “My teacher tells us how to do better when I make a mistake” and “My teacher listens to what I have to say” - all of these variables indicate teacher’s genuine desire to help students learn while variables from the forth factor are more about the teacher being able to simply give students the information in a one-sided transaction. The BSBM18I ("My teacher tells us how to do better when I make a mistake) has the highest loadning on the second variable indicating that a teacher is ready to give students feedback and engage with them more while the highest loading in case of the forth factor is observed for BSBM18B variable indicating teacher’s ability to present information in an easy to understand way. Since these two factors are still both connected with one’s assesment of teacher’s abilities the correlation is quite high. 
Previously it was mentioned that one of the cheracteristics of a good math teacher picked by both students and teachers in Singapore was teacher’s willingness to give feedback to individuals or the whole class by using submitted works - that seems to be also connected with the MR2 factor.
As we hypothesized there is also a moderate correlation between one’s self assessment for math skills and attitude towards math. Since the table with correlations is present we can clearly see that the oblique rotation was used.
In conclusion, we could name these four factors as follows:
We can now check the reliability of all four scales.
alpha(MR1, check.keys = TRUE)
##
## Reliability analysis
## Call: alpha(x = MR1, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.94 0.94 0.94 0.65 16 0.0011 2.8 0.79 0.64
##
## lower alpha upper 95% confidence boundaries
## 0.94 0.94 0.94
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM17A 0.93 0.93 0.93 0.63 14 0.0013 0.0060 0.64
## BSBM17B 0.94 0.94 0.94 0.66 16 0.0012 0.0063 0.65
## BSBM17C 0.94 0.94 0.94 0.66 15 0.0012 0.0069 0.65
## BSBM17D 0.94 0.94 0.94 0.67 16 0.0012 0.0058 0.66
## BSBM17E 0.93 0.93 0.93 0.63 13 0.0014 0.0049 0.63
## BSBM17F 0.94 0.94 0.94 0.66 15 0.0012 0.0069 0.65
## BSBM17G 0.93 0.93 0.93 0.64 14 0.0013 0.0068 0.64
## BSBM17H 0.94 0.94 0.94 0.65 15 0.0012 0.0076 0.64
## BSBM17I 0.93 0.93 0.93 0.63 14 0.0013 0.0058 0.64
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM17A 5875 0.87 0.88 0.87 0.84 3.1 0.90
## BSBM17B 5875 0.78 0.77 0.74 0.71 2.8 1.07
## BSBM17C 5875 0.79 0.78 0.75 0.73 2.8 0.95
## BSBM17D 5875 0.74 0.75 0.71 0.69 3.1 0.82
## BSBM17E 5875 0.90 0.90 0.90 0.87 3.0 0.95
## BSBM17F 5875 0.79 0.79 0.76 0.73 2.5 0.91
## BSBM17G 5875 0.86 0.86 0.85 0.82 2.8 0.97
## BSBM17H 5875 0.82 0.82 0.79 0.77 2.6 0.95
## BSBM17I 5875 0.89 0.88 0.87 0.85 2.7 1.10
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM17A 0.08 0.13 0.41 0.38 0
## BSBM17B 0.16 0.23 0.30 0.31 0
## BSBM17C 0.10 0.28 0.35 0.26 0
## BSBM17D 0.05 0.16 0.46 0.33 0
## BSBM17E 0.10 0.17 0.39 0.34 0
## BSBM17F 0.13 0.36 0.35 0.16 0
## BSBM17G 0.13 0.25 0.37 0.25 0
## BSBM17H 0.13 0.30 0.37 0.20 0
## BSBM17I 0.19 0.22 0.27 0.33 0
alpha(MR2, check.keys = TRUE)
##
## Reliability analysis
## Call: alpha(x = MR2, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.84 0.84 0.8 0.56 5.1 0.0035 3.1 0.64 0.55
##
## lower alpha upper 95% confidence boundaries
## 0.83 0.84 0.84
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM18I 0.77 0.77 0.69 0.53 3.4 0.0052 0.00036 0.52
## BSBM18H 0.80 0.80 0.73 0.57 3.9 0.0046 0.00190 0.56
## BSBM18G 0.80 0.80 0.73 0.58 4.1 0.0045 0.00308 0.60
## BSBM18J 0.80 0.80 0.73 0.57 4.0 0.0045 0.00082 0.56
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM18I 5875 0.84 0.85 0.78 0.71 3.2 0.76
## BSBM18H 5875 0.82 0.81 0.72 0.66 3.1 0.81
## BSBM18G 5875 0.80 0.80 0.70 0.64 3.0 0.78
## BSBM18J 5875 0.81 0.81 0.71 0.65 3.2 0.78
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM18I 0.03 0.10 0.46 0.40 0
## BSBM18H 0.04 0.16 0.47 0.33 0
## BSBM18G 0.04 0.17 0.50 0.29 0
## BSBM18J 0.04 0.12 0.48 0.36 0
alpha(MR3, check.keys = TRUE)
##
## Reliability analysis
## Call: alpha(x = MR3, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.86 0.86 0.84 0.54 6 0.0029 2.6 0.78 0.52
##
## lower alpha upper 95% confidence boundaries
## 0.85 0.86 0.86
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM19A 0.81 0.81 0.78 0.52 4.3 0.0040 0.0102 0.50
## BSBM19B 0.82 0.82 0.80 0.54 4.7 0.0037 0.0145 0.54
## BSBM19C 0.80 0.80 0.76 0.50 4.0 0.0043 0.0092 0.48
## BSBM19D 0.83 0.83 0.80 0.55 5.0 0.0035 0.0117 0.52
## BSBM19E 0.86 0.86 0.84 0.61 6.2 0.0029 0.0061 0.62
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM19A 5875 0.84 0.84 0.80 0.73 2.8 0.98
## BSBM19B 5875 0.80 0.80 0.74 0.68 2.6 0.92
## BSBM19C 5875 0.87 0.86 0.84 0.77 2.5 1.09
## BSBM19D 5875 0.77 0.78 0.71 0.65 2.7 0.90
## BSBM19E 5875 0.70 0.70 0.57 0.53 2.5 0.98
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM19A 0.14 0.22 0.39 0.25 0
## BSBM19B 0.13 0.28 0.41 0.19 0
## BSBM19C 0.23 0.25 0.28 0.23 0
## BSBM19D 0.10 0.28 0.41 0.21 0
## BSBM19E 0.19 0.33 0.31 0.16 0
alpha(MR4, check.keys = TRUE)
##
## Reliability analysis
## Call: alpha(x = MR4, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.89 0.89 0.89 0.58 8.3 0.0021 3.1 0.63 0.56
##
## lower alpha upper 95% confidence boundaries
## 0.89 0.89 0.9
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM18A 0.89 0.89 0.88 0.62 8.3 0.0023 0.0051 0.61
## BSBM18B 0.86 0.86 0.85 0.56 6.4 0.0027 0.0092 0.55
## BSBM18C 0.87 0.87 0.85 0.57 6.6 0.0027 0.0087 0.56
## BSBM18D 0.88 0.88 0.86 0.59 7.2 0.0024 0.0076 0.58
## BSBM18E 0.87 0.87 0.85 0.57 6.6 0.0026 0.0074 0.56
## BSBM18F 0.87 0.87 0.85 0.57 6.7 0.0026 0.0066 0.56
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM18A 5875 0.70 0.72 0.62 0.59 3.3 0.65
## BSBM18B 5875 0.85 0.85 0.81 0.77 3.2 0.80
## BSBM18C 5875 0.84 0.83 0.80 0.75 3.0 0.82
## BSBM18D 5875 0.80 0.79 0.73 0.69 2.8 0.85
## BSBM18E 5875 0.83 0.83 0.80 0.75 3.2 0.78
## BSBM18F 5875 0.83 0.83 0.79 0.74 3.3 0.78
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM18A 0.01 0.06 0.51 0.41 0
## BSBM18B 0.04 0.13 0.44 0.39 0
## BSBM18C 0.05 0.20 0.48 0.26 0
## BSBM18D 0.08 0.28 0.44 0.20 0
## BSBM18E 0.03 0.13 0.44 0.40 0
## BSBM18F 0.03 0.11 0.40 0.46 0
For all four factors the standardized alpha is higher than the 0.7 threshold (0.94, 0.84, 0.86, 0.89), which indicates good scale reliability. Moreover, the “reliability if an item is dropped” tables show that there is no variable whose removal would lead to a higher alpha, which means that we shouldn’t get rid of any of them.
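To make the reliability figures less of a black box, the following is a minimal base-R sketch of how Cronbach's alpha and the "alpha if an item is dropped" values can be computed by hand from item variances. It is only an illustration under stated assumptions: the function names `cronbach_alpha` and `alpha_if_dropped` and the simulated `toy` data are hypothetical and are not part of the original analysis, which relied on the `alpha()` function (presumably from the psych package).

```r
# Cronbach's alpha computed by hand (base R only).
# `items` is a hypothetical data frame with one column per item.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                      # number of items in the scale
  item_vars <- apply(items, 2, var)     # variance of each single item
  total_var <- var(rowSums(items))      # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Alpha recomputed after leaving out each item in turn
alpha_if_dropped <- function(items) {
  sapply(seq_len(ncol(items)), function(j) {
    cronbach_alpha(items[, -j, drop = FALSE])
  })
}

# Toy data (assumption): three items sharing one common component
set.seed(1)
common <- rnorm(200)
toy <- data.frame(a = common + rnorm(200),
                  b = common + rnorm(200),
                  c = common + rnorm(200))

cronbach_alpha(toy)      # alpha of the full toy scale
alpha_if_dropped(toy)    # one value per dropped item
```

If one of the values returned by `alpha_if_dropped()` were noticeably higher than the full-scale alpha, that item would be a candidate for removal; in the output above this is not the case for any of the four scales.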
In order to use factors as variables we add the scores back to the dataset.
fascores <-as.data.frame(f1$scores)
data_reg<-cbind(data_reg,fascores)
Since we will now be using a different dataset, we should take a look at the variables that it contains.
data_reg$BSMMAT01 <-as.numeric(as.character(data_reg$BSMMAT01))
# In case we want to plot both math scores and some of the particular variables associated with factors
data_fa$math <- data_reg$BSMMAT01
str(data_reg)
## 'data.frame': 5875 obs. of 9 variables:
## $ BSMMAT01: num 595 587 575 637 624 ...
## $ BSBG01 : Factor w/ 2 levels "Girl","Boy": 1 2 1 2 2 2 1 1 2 2 ...
## $ BSBG07A : Factor w/ 8 levels "Some Primary or Lower secondary or did not go to school",..: 3 1 4 3 3 8 3 2 6 5 ...
## $ BSBG07B : Factor w/ 8 levels "Some Primary or Lower secondary or did not go to school",..: 8 3 4 8 5 5 8 2 2 6 ...
## $ BSBG10A : Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 1 1 1 1 ...
## $ MR1 : num -0.142 -1.135 -0.802 0.147 1.325 ...
## $ MR2 : num 0.264 -3.513 -0.224 0.423 1.32 ...
## $ MR3 : num -1.1475 -0.0965 -0.9033 -0.3982 1.0043 ...
## $ MR4 : num 0.804 -3.447 -1.049 0.691 1.331 ...
In the original data all the variables, including BSMMAT01 (Mathematics achievement), were coded as factors, although BSMMAT01 is in fact a numeric variable, which is why it was converted to numeric above. Test items include questions on such content domains as Numbers, Algebra, Geometry, Data and Chance as well as such cognitive domains as Knowing, Applying and Reasoning. We now take a look at its descriptive statistics and distribution. This variable will be used as the dependent variable in the regression in what follows.
We will first take a look at all the variables in the data_reg dataset to get an overview of what is going on, and then choose only specific variables for the regression analysis according to the research question.
describe(data_reg$BSMMAT01)
## vars n mean sd median trimmed mad min max range skew
## X1 1 5875 616.5 82.12 627.86 622.18 77.65 321.8 845.98 524.17 -0.62
## kurtosis se
## X1 0.14 1.07
ggplot(data_reg, aes(x=BSMMAT01))+
geom_histogram(fill="dodgerblue4", binwidth= 1)+
labs(title = "Distribution of BSMMAT01 variable", y="Frequency", x="BSMMAT01")+
theme_bw()
As we observe, the values range from 321.8 to 845.98, with the mean equal to 616.5 and the median equal to 627.86. The median is bigger than the mean, so the distribution is slightly left-skewed: most observations are concentrated at the higher BSMMAT01 values, with a longer tail towards low scores. This is also indicated by the negative skew value (-0.62). Apart from this mild skewness, the distribution looks approximately normal.
The remaining variables are: gender (BSBG01), whether or not a respondent was born in the country (BSBG10A) and the level of educational attainment of both parents (BSBG07A and BSBG07B). Let’s proceed to visualizing the distributions of these variables and their bivariate distributions together with the math achievement variable.
## vars n mean sd median trimmed mad min max range skew
## X1 1 2884 621.55 77.14 632.57 627.32 73.95 329.44 845.98 516.53 -0.67
## kurtosis se
## X1 0.24 1.44
## vars n mean sd median trimmed mad min max range skew
## X1 1 2991 611.64 86.39 622.83 616.91 80.53 321.8 832.77 510.96 -0.54
## kurtosis se
## X1 0 1.58
The sample is balanced in terms of gender of the respondents: there are only slightly more boys in it (50.9% vs 49.1%). As we observe on the boxplot, the median math score is higher for girls than it is for boys (632.57 > 622.83). Girls also have one extremely high outlier.
Since there are only two groups, we can carry out a t-test instead of performing an ANOVA.
One of the assumptions we have to check before carrying out the test itself is the homogeneity of variances. The two hypotheses would be:
# Test for homogeneity of variances
var.test(as.numeric(data_reg$BSMMAT01) ~ data_reg$BSBG01)
##
## F test to compare two variances
##
## data: as.numeric(data_reg$BSMMAT01) by data_reg$BSBG01
## F = 0.79729, num df = 2883, denom df = 2990, p-value = 9.034e-10
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.7416470 0.8571393
## sample estimates:
## ratio of variances
## 0.7972854
As we can see, the p-value is equal to 9.034e-10, which is extremely small (<0.05). We have to reject the null hypothesis and conclude that, unfortunately, the variances do differ. We therefore set var.equal = FALSE in the t.test call so that Welch’s t-test is performed.
Hypotheses for the test itself:
t.test(as.numeric(data_reg$BSMMAT01) ~ data_reg$BSBG01, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: as.numeric(data_reg$BSMMAT01) by data_reg$BSBG01
## t = 4.6433, df = 5838.8, p-value = 3.504e-06
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 5.727969 14.098676
## sample estimates:
## mean in group Girl mean in group Boy
## 621.5495 611.6362
Since the p-value is small (3.504e-06 < 0.05), the null hypothesis is rejected (the probability of obtaining a result at least this extreme if H0 were true is very low). Therefore, we conclude that the difference between the means of these two groups is statistically significant.
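As a cross-check, the variance ratio and the Welch statistic can be approximately reproduced from the group summaries printed earlier (means 621.5495 and 611.6362, standard deviations 77.14 and 86.39, group sizes 2884 and 2991). The sketch below is only an illustration: `welch_from_summary` is a hypothetical helper written for this purpose, and because the standard deviations are rounded the results agree with the output above only to about three significant digits.

```r
# Welch's t-test recomputed from summary statistics (base R only).
# Hypothetical helper; inputs are group means, SDs and sample sizes.
welch_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1                        # squared standard error, group 1
  v2 <- s2^2 / n2                        # squared standard error, group 2
  t_stat <- (m1 - m2) / sqrt(v1 + v2)    # Welch t statistic
  # Welch-Satterthwaite approximation of the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  p <- 2 * pt(-abs(t_stat), df)          # two-sided p-value
  list(t = t_stat, df = df, p = p, var_ratio = s1^2 / s2^2)
}

# Girls vs boys, using the rounded values from the describe() output
res <- welch_from_summary(621.5495, 77.14, 2884,
                          611.6362, 86.39, 2991)
res$var_ratio   # ratio of variances, compare with the F statistic
res$t           # compare with t in the Welch output
res$df          # compare with df in the Welch output
```

The same helper could be applied to the comparison by country of birth, using the corresponding means, standard deviations and group sizes.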
## vars n mean sd median trimmed mad min max range skew
## X1 1 5103 614.26 83.15 625.07 619.88 79.27 321.8 845.98 524.17 -0.59
## kurtosis se
## X1 0.07 1.16
## vars n mean sd median trimmed mad min max range skew
## X1 1 772 631.3 73.33 640.52 635.99 69.07 353.54 820.99 467.45 -0.69
## kurtosis se
## X1 0.65 2.64
We observe that only 13.1% of respondents were not born in Singapore. Their mean math score is, in fact, higher than that of the other group (631.3 > 614.26), and so is their median (640.52 > 625.07), which is quite an interesting finding. Similarly to gender, we can quickly check whether the difference in means is statistically significant.
# Test for homogeneity of variances
var.test(as.numeric(data_reg$BSMMAT01) ~ data_reg$BSBG10A)
##
## F test to compare two variances
##
## data: as.numeric(data_reg$BSMMAT01) by data_reg$BSBG10A
## F = 1.2857, num df = 5102, denom df = 771, p-value = 8.829e-06
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 1.152527 1.428018
## sample estimates:
## ratio of variances
## 1.285654
# T-test
t.test(as.numeric(data_reg$BSMMAT01) ~ data_reg$BSBG10A, var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: as.numeric(data_reg$BSMMAT01) by data_reg$BSBG10A
## t = -5.9082, df = 1093.8, p-value = 4.615e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -22.70139 -11.38210
## sample estimates:
## mean in group Yes mean in group No
## 614.2632 631.3050
We arrive at the same conclusion: because the variances differed (p = 8.829e-06), we again set var.equal = FALSE in the t.test call and, because of the small p-value of the t-test itself, had to reject the null hypothesis and conclude that the difference in means is statistically significant.
The biggest share of respondents simply reported not knowing the level of education of their parents (29.7% for mother’s education, 32% for father’s education). Following that, relatively big shares of respondents reported their parents having either a Bachelor’s degree or an Upper secondary school diploma. While in the case of father’s education the shares of these two categories were very similar (14.9% and 14.7%), for mother’s education more respondents reported the “Upper secondary” category (14.6% and 18.9%). Very similar shares of respondents reported their parents having either Short-cycle tertiary or Post-secondary, non-tertiary education (9.5% and 9.7% for mother’s education, 9.2% and 9.9% for father’s education). Shares of such categories as “Lower secondary” and “Some Primary or Lower secondary/did not go to school” were also similar for both parents. While only 4.4% of respondents reported their mothers having a post-graduate degree, 7.7% of respondents reported their fathers having it.
By looking at the boxplot, we can note that the general pattern is extremely similar for both parents. The only difference that is immediately noticeable is that in the “Post-secondary, non-tertiary” category, students whose mothers have that level of education had a higher median math score than students whose fathers have the same level of education.
We can quickly check if the difference across educational levels is significant in both cases. Since in this case we are dealing with more than two groups, we have to use ANOVA.
## Df Sum Sq Mean Sq F value Pr(>F)
## data_reg$BSBG07A 7 3778092 539727 88.37 <2e-16 ***
## Residuals 5867 35833407 6108
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Df Sum Sq Mean Sq F value Pr(>F)
## data_reg$BSBG07B 7 4047844 578263 95.4 <2e-16 ***
## Residuals 5867 35563655 6062
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In both cases the p-value is below .001 (<2e-16), which means that the null hypothesis should be rejected (the probability of obtaining data like ours if H0 were true for the population is low). Thus, the difference in math scores across different levels of parental education is statistically significant.
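To see where the F values come from, the two ANOVA tables can be recomputed by hand: F is the ratio of the between-group mean square to the residual mean square. The effect size η² (the share of total variance attributable to the grouping) is not part of the original output and is added here only as an illustrative calculation from the same sums of squares.

\[F_{\text{mother}} = \frac{539727}{6108} \approx 88.4, \qquad F_{\text{father}} = \frac{578263}{6062} \approx 95.4\]

\[\eta^2_{\text{mother}} = \frac{3778092}{3778092 + 35833407} \approx 0.095, \qquad \eta^2_{\text{father}} = \frac{4047844}{4047844 + 35563655} \approx 0.102\]

Under this reading, each parent's education level on its own accounts for roughly 10% of the variance in math scores, so the differences are statistically significant but leave most of the variation unexplained.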
MR3
The MR3 factor, as previously identified, is related to one’s self-assessment of math skills. According to the study by Erturan and Jansen (2015), while there was no observed relation between math anxiety and math performance, perceived math competence was significantly related to math achievement for both genders. This variable will be included in the regression as a predictor, with the hypothesis that higher self-assessment of one’s math skills is associated with higher math achievement. As can be seen on the boxplot, math scores of those who reported thinking they are good at math are higher.
ggplot(data_fa, aes(x = BSBM19A, y = math, fill = BSBM19A)) +
geom_boxplot() +
scale_fill_brewer(palette = "Pastel2") +
labs(title = "Differences in math achievement accordingly to the BSBM19A variable", x = "I do well in math", y = "Math achievement") +
theme_bw()+
theme(legend.position = "none")
As highlighted previously, despite the fact that students in Singapore have a generally good attitude towards math, students still reported being nervous because of math (“Agree a lot” - 19%, “Agree a little” - 33.5%). In general, in comparison to the answers to questions of the #17 block, students chose the “Agree a lot” option less often when it came to the #19 group of questions. Even though students in Singapore like math, it doesn’t necessarily mean that they rate their own math skills highly.
MR2
It seems quite obvious that there might be a positive correlation between both MR2 and MR4 factor scores (assessing teacher’s skills) and student’s math achievement. However, it might be unclear whether MR2 (connection with the class, willingness to give feedback, engage in various activities) or MR4 (general teaching skills, skills of presenting the information in an easy-to-understand way) is a better predictor of students’ math achievement. We can quickly test it by constructing two linear regression models with each of these factors as predictors and compare them using AIC (since the models are non-nested).
cor.test(data_reg$MR2, data_reg$BSMMAT01)
##
## Pearson's product-moment correlation
##
## data: data_reg$MR2 and data_reg$BSMMAT01
## t = 7.0663, df = 5873, p-value = 1.776e-12
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.06640092 0.11711354
## sample estimates:
## cor
## 0.09181677
cor.test(data_reg$MR4, data_reg$BSMMAT01)
##
## Pearson's product-moment correlation
##
## data: data_reg$MR4 and data_reg$BSMMAT01
## t = 13.559, df = 5873, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1493167 0.1989087
## sample estimates:
## cor
## 0.1742232
model1 <- lm(BSMMAT01 ~ MR2, data = data_reg)
model2 <- lm(BSMMAT01 ~ MR4, data = data_reg)
summary(model1)
##
## Call:
## lm(formula = BSMMAT01 ~ MR2, data = data_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -292.90 -47.24 11.76 58.98 229.30
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 616.503 1.067 577.825 < 2e-16 ***
## MR2 7.429 1.051 7.066 1.78e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 81.78 on 5873 degrees of freedom
## Multiple R-squared: 0.00843, Adjusted R-squared: 0.008261
## F-statistic: 49.93 on 1 and 5873 DF, p-value: 1.776e-12
summary(model2)
##
## Call:
## lm(formula = BSMMAT01 ~ MR4, data = data_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -290.95 -47.24 11.47 58.07 229.21
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 616.503 1.055 584.32 <2e-16 ***
## MR4 14.022 1.034 13.56 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 80.87 on 5873 degrees of freedom
## Multiple R-squared: 0.03035, Adjusted R-squared: 0.03019
## F-statistic: 183.8 on 1 and 5873 DF, p-value: < 2.2e-16
AIC(model1)
## [1] 68423.78
AIC(model2)
## [1] 68292.43
We observe that AIC is lower for the second model, the one with general teaching skills as a predictor. We might hypothesize that, though MR2 is still positively associated with math achievement, high MR2 scores might have more to do with students personally liking the teacher and less with students’ math scores. As we just found out, the positive correlation between MR4 and math achievement was higher (0.174 > 0.092): a high assessment of the teacher’s general skill in presenting information was associated with higher math scores of students.
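For readers who want to see what the AIC values are made of, the figure for the MR4 model can be reproduced by hand. For a linear model with Gaussian errors, AIC can be written in terms of the residual sum of squares (RSS), where k counts the estimated parameters (here intercept, slope and residual variance, so k = 3). The RSS of the MR4 model, 38409143, is taken from the model comparison table reported for the final model; the rest is arithmetic.

\[\mathrm{AIC} = n \ln\left(\frac{\mathrm{RSS}}{n}\right) + n\,(1 + \ln 2\pi) + 2k\]

\[\mathrm{AIC}_{\text{MR4}} \approx 5875 \times \ln\left(\frac{38409143}{5875}\right) + 5875 \times 2.8379 + 2 \times 3 \approx 51613.9 + 16672.5 + 6 \approx 68292.4\]

Since both models have the same number of parameters and the same n, the difference of about 131 AIC points (68423.78 - 68292.43) reflects only the lower residual sum of squares of the MR4 model.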
plot1 <- ggplot(data_reg, aes(x=MR4))+
geom_histogram(fill="lightblue1", binwidth= 1)+
labs(title = "Distribution of MR4 variable", y="Frequency", x="MR4")+
theme_bw()
plot2 <- ggplot(data_reg, aes(x=MR3))+
geom_histogram(fill="steelblue2", binwidth= 1)+
labs(title = "Distribution of MR3 variable", y="Frequency", x="MR3")+
theme_bw()
ggarrange(plot1, plot2 + rremove("x.text"),
ncol = 1, nrow = 2)
If we visualize the distributions of the MR3 and MR4 variables, we observe that while the MR3 distribution is approximately normal, the MR4 distribution is skewed: more students rated the skills of their teachers highly.
Gender
As mentioned previously, both median and mean math achievement were higher for girls. However, as found by Cvencek, Meltzoff and Kapur (2013), the gender-math stereotype was still present in Singapore: boys associated math with their own gender, while girls associated math with the opposite gender more than they did with their own (math self-concept (me = math)); both girls and boys associated math with boys more than with girls (math–gender stereotype (male = math)). The findings were the same for both explicit self-report and implicit Child IAT measures. Additionally, as the authors also mention, math self-concepts of Singaporean boys were stronger than those of American boys, while American girls had a stronger math–gender stereotype (math = male) than did Singaporean girls. It is quite puzzling that despite the fact that both genders have high math achievement scores (in comparison to other countries) and girls score higher than boys, the gender stereotype is still held by both genders. It seems that actually observed math achievement has little to do with these commonly held beliefs. In order to investigate the link between gender and mathematical achievement, gender will be included in the regression model as part of an interaction term. While we believe that the regression output will show lower math achievement when a respondent is a boy, we hypothesize that the interaction effect of gender is statistically significant and that the effect of MR3 on math achievement is stronger for boys: girls might be, on average, simply more diligent and receive good scores regardless of their self-assessment, while boys might need that confidence boost to compensate for the lack of hard work.
Research question
Instead of working with variables that are often internationally used as predictors of math achievement, such as father’s education, I wanted to focus on those variables that are more culture-specific and apply to students from Singapore.
For instance, as it was previously discussed, one of the possible reasons behind such high math skills of students in Singapore is the good quality of education provided by their teachers. As we have found out, the MR4 factor scores were a better predictor of math achievement in comparison to MR2 factor. Therefore, MR4 factor is added to the final model.
As already highlighted, both girls and boys in Singapore perform extremely well. Girls, in fact, perform better than boys, and gender differences in performance are statistically significant. At the same time, previous research on the topic shows that the gender-math stereotype is still present in Singapore, with female students associating themselves with math less (me - math) and both boys and girls associating male students with math more (male - math). I wanted to add the MR3 factor (self-assessment of math skills) and its interaction with gender. We have established that students in Singapore generally have a good attitude towards math, but students’ perception of their own math skills was not as uniform: while for the majority of questions from the #19 group more than 20% of students chose the “Agree a lot” option (implying that they assess their math skills as good), the only question with less than 20% “Agree a lot” was the question about math making students nervous. Despite the fact that students in Singapore have a generally positive attitude towards math, their responses to questions from the self-assessment block vary, with students who rate their own skills highly having higher math achievement scores. In order to see if one’s self-assessment is a good predictor of math achievement, it will be included in the regression equation.
The main research question is whether gender as an interaction term (when it comes to self-assessment) is statistically significant, and whether adding the MR3 factor (self-assessment) together with its interaction with gender improves the model based solely on the MR4 (teacher’s skills) factor.
In other words, I want to see:
Is it simply the good education in Singapore that predicts students’ math achievement, or is their self-assessment also a statistically significant predictor that can improve the model?
Does gender influence the relationship between math achievement and one’s self-assessment?
model3 <- lm(BSMMAT01 ~ MR3 * BSBG01 + MR4, data = data_reg)
summary(model3)
##
## Call:
## lm(formula = BSMMAT01 ~ MR3 * BSBG01 + MR4, data = data_reg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -284.60 -43.79 9.08 52.02 212.56
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 623.5890 1.3785 452.383 < 2e-16 ***
## MR3 29.5287 1.3891 21.257 < 2e-16 ***
## BSBG01Boy -14.4612 1.9319 -7.485 8.18e-14 ***
## MR4 5.8353 0.9759 5.979 2.37e-09 ***
## MR3:BSBG01Boy 8.3751 1.9441 4.308 1.67e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 73.86 on 5870 degrees of freedom
## Multiple R-squared: 0.1915, Adjusted R-squared: 0.191
## F-statistic: 347.7 on 4 and 5870 DF, p-value: < 2.2e-16
We observe that the model explains 19% of the variance in the dependent variable and is better than the intercept-only model, which predicts the mean score for everyone (p-value: < 2.2e-16). All predictors are statistically significant with the ‘***’ significance code (p < 0.001).
The regression equation would look like this:
\[\text{Predicted math score} = 623.5890 + 29.5287 \times \text{MR3} - 14.4612 \times \text{GenderBoy} + 5.8353 \times \text{MR4} + 8.3751 \times \text{MR3} \times \text{GenderBoy}\]
Since we should only interpret the statistically significant variables the interpretation would be:
We can test if adding MR3 and gender improved the model in comparison to the one with MR4 as the only predictor.
anova(model2, model3)
## Analysis of Variance Table
##
## Model 1: BSMMAT01 ~ MR4
## Model 2: BSMMAT01 ~ MR3 * BSBG01 + MR4
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 5873 38409143
## 2 5870 32024276 3 6384867 390.11 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The p-value is small (<0.05), which means that adding the new predictors did improve the model in comparison to the one built solely on teacher’s skills (as assessed by students).
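The F statistic in this table compares the drop in the residual sum of squares with the residual variance of the larger model. Using only the numbers from the table above (three added terms: MR3, gender and their interaction), the calculation can be retraced as follows:

\[F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2)/(df_1 - df_2)}{\mathrm{RSS}_2/df_2} = \frac{(38409143 - 32024276)/3}{32024276/5870} \approx \frac{2128289}{5455.58} \approx 390.1\]

This comparison is valid here because the MR4-only model is nested in the larger one, unlike the MR2 versus MR4 comparison, where AIC had to be used.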
plot_model(model3, type = "int")
We also want to take a look at the graph where the interaction effect is depicted. As we observe, while the line is ascending (positive relationship) for both boys and girls, it goes up faster for boys. The lines are not parallel - the effect of MR3 on math scores is different for different genders and the effect is stronger for boys.
gg <- ggplot(data_reg, aes(x=BSBG01, y = MR3, color = BSBG01))+
geom_boxplot()+
labs(title = "Gender difference in self-assessment")+
theme_bw()
gg2 <- ggplot(data_reg, aes(x=MR3, y = BSMMAT01, color = BSBG01))+
geom_point()+
labs(title = "Actual math scores and self-assessment")+
theme_bw()+
geom_smooth(method = "lm")
ggarrange(gg, gg2 + rremove("x.text"),
ncol = 2, nrow = 1)
The median MR3 (self-assessment) score was higher for boys despite the fact that girls actually performed better at math. As mentioned, since math-gender stereotypes are still present in Singapore, boys might report higher MR3 scores not because they are objectively better than girls (which is not true) but simply because of the me-gender, me-math and gender-math stereotypical mechanisms at work.
We have found out that the effect of MR3 on math scores was actually stronger for boys. If we take a look at the second graph, we observe that at low to medium MR3 scores girls show better results, but as we move towards higher MR3 scores boys perform only slightly worse than girls. At some point the lines intersect, and at the highest MR3 scores boys actually perform slightly better.
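The point where the two lines cross can be worked out from the regression equation reported above by writing it separately for girls (GenderBoy = 0) and boys (GenderBoy = 1), holding MR4 at the same value for both. This is a sketch of the simple-slopes arithmetic based on the fitted coefficients, not an additional analysis from the original report.

\[\text{Girls: } \widehat{\text{score}} = 623.5890 + 29.5287 \times \text{MR3} + 5.8353 \times \text{MR4}\]

\[\text{Boys: } \widehat{\text{score}} = (623.5890 - 14.4612) + (29.5287 + 8.3751) \times \text{MR3} + 5.8353 \times \text{MR4} = 609.1278 + 37.9038 \times \text{MR3} + 5.8353 \times \text{MR4}\]

\[\text{Predicted scores are equal when } 14.4612 = 8.3751 \times \text{MR3}, \text{ i.e. } \text{MR3} \approx 1.73\]

Assuming the factor scores are roughly standardized, a value of 1.73 lies far in the upper tail, so according to the model boys are predicted to outscore girls only at very high levels of self-assessment.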
Let’s illustrate this with an example by dividing the MR3 scores into low, medium and high. When both girls and boys report low self-assessed math skills, girls obtain better math results: this might be because boys are strongly affected by their low self-assessment and are not able to connect to the mentioned math-gender stereotype. In other words, when a male student reports a low MR3 score and considers himself to be bad at math, he subconsciously cannot use the gender-math stereotype: the stereotype suggests that boys are good at math, but male students who don’t feel that they are feel down and don’t perform as well as girls. The higher their MR3 gets, the more they connect with that stereotype: now they assess themselves as good at math, feel confident and perform as well as girls do. At high levels of MR3 boys perform just as well as girls or even better.
It has to be mentioned that, despite the positive relationship between math achievement and the MR3 variable, it is debatable whether it is students’ self-assessment influencing their math performance or their performance making them answer the #19 questions in a certain way. Let’s suppose the latter is the case and plot the relationship between these variables the other way around.
ggplot(data_reg, aes(x=BSMMAT01, y = MR3, color = BSBG01))+
geom_point()+
labs(title = "Actual math scores and self-assessment")+
theme_bw()+
geom_smooth(method = "lm")
We observe that the lines are almost parallel, with boys reporting higher MR3 scores regardless of how high their real math scores are. It seems that students assess themselves quite objectively: the higher their math score is, the higher is their self-assessment. On average, girls who score the same as boys do not report higher MR3 scores: the gender stereotypes still make boys connect with the me-male and male-math concepts more and report higher values on questions from the MR3 factor.
We might first check for outliers.
outlierTest(model3)
## No Studentized residuals with Bonferroni p < 0.05
## Largest |rstudent|:
## rstudent unadjusted p-value Bonferroni p
## 6042 -3.858925 0.00011511 0.67626
par(mfrow=c(1,1))
qqPlot(model3, main="QQ Plot")
## 6027 6042
## 5791 5805
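Observation 6042 has the largest studentized residual in our data (-3.86). Although its unadjusted p-value is below 0.05, the Bonferroni-adjusted p-value is 0.676, so we cannot reject the null hypothesis that this observation is not an outlier; the test reports no significant outliers. Observations 6027 and 6042 are the two points labelled on the QQ plot.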
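The Bonferroni p-value in the outlierTest output can be reproduced by hand, which shows why the adjusted value, and not the unadjusted one, decides whether an observation counts as an outlier. This is a small sketch that assumes 5875 observations enter the model (the number reported in the Box-Cox output later in this section).

```r
# Value copied from the outlierTest(model3) output
p_unadjusted <- 0.00011511

# Assumption: number of observations in the model, taken from the Box-Cox summary
n_obs <- 5875

# Bonferroni adjustment: multiply by the number of tests (one per observation), cap at 1
p_bonferroni <- min(1, p_unadjusted * n_obs)
print(round(p_bonferroni, 4))
```

The result agrees with the reported Bonferroni p of 0.67626 up to rounding, which is far above 0.05.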
In order to check for multicollinearity we calculate variance inflation factors for all predictor variables.
vif(model3)
## MR3 BSBG01 MR4 MR3:BSBG01
## 2.051529 1.004473 1.067469 1.981401
The values for all four terms (2.051529, 1.004473, 1.067469, 1.981401) are well below 5, which indicates no problematic multicollinearity. The smallest possible value of the variance inflation factor is 1, and the values for all four terms are not much bigger than that; the slightly higher values for MR3 and MR3:BSBG01 are expected, since an interaction term is correlated with its component variable. We would be concerned if the values exceeded 5 or 10.
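For reference, the variance inflation factor of a predictor can be computed by hand as 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below does this on simulated data; the variables `x1`, `x2`, `x3` and their correlation structure are hypothetical and are not the predictors of model3.

```r
# Simulated predictors; x2 is deliberately correlated with x1
set.seed(2)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n, sd = 0.8)
x3 <- rnorm(n)
predictors <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing predictor j on the rest
manual_vif <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  fit <- lm(reformulate(others, response = v), data = predictors)
  1 / (1 - summary(fit)$r.squared)
})
print(round(manual_vif, 3))
```

A predictor that is unrelated to the others gets a value close to 1, while correlated predictors get larger values.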
We then proceed to check for heteroscedasticity of residuals.
ncvTest(model3)
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 117.4804, Df = 1, p = < 2.22e-16
Unfortunately, the p-value is below 0.05, so we reject the null hypothesis of constant variance: the residuals are heteroscedastic. We can try to solve this problem by applying a Box-Cox transformation to the math scores.
mathBCMod <- caret::BoxCoxTrans(data_reg$BSMMAT01)
print(mathBCMod)
## Box-Cox Transformation
##
## 5875 data points used to estimate Lambda
##
## Input data summary:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 321.8 568.8 627.9 616.5 675.2 846.0
##
## Largest/Smallest: 2.63
## Sample Skewness: -0.615
##
## Estimated Lambda: 2
# apply the transformation using the estimated lambda
data_reg$BSMMAT011 <- predict(mathBCMod, data_reg$BSMMAT01)
hist(data_reg$BSMMAT01)
hist(data_reg$BSMMAT011)
# refit the model with the transformed outcome as model4
model4 <- lm(BSMMAT011 ~ MR3 * BSBG01 + MR4, data = data_reg)
ncvTest(model4)
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 27.13621, Df = 1, p = 1.8961e-07
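Unfortunately, that didn’t help: the test statistic dropped from 117.48 to 27.14, but the p-value is still far below 0.05, so the heteroscedasticity remains.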
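To make the transformation used above concrete: with an estimated lambda of 2, the standard Box-Cox formula reduces to (y² - 1) / 2. The sketch below applies it to the summary values of the math scores. `box_cox` is a hypothetical helper written for illustration, not a function from caret, and it assumes the textbook form of the transformation.

```r
# Hypothetical helper implementing the textbook Box-Cox formula
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Min, quartiles and max of BSMMAT01, copied from the Box-Cox summary above
scores <- c(321.8, 568.8, 627.9, 675.2, 846.0)

# With lambda = 2 this is (y^2 - 1) / 2, which stretches the upper end of the scale
transformed <- box_cox(scores, lambda = 2)
print(transformed)
```

Squaring stretches high scores more than low ones, which is why a lambda above 1 is suggested for a negatively skewed variable such as this one.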
par(mfrow=c(2,2))
plot(model4)
In the “Residuals vs Fitted” graph, which is used to check the linearity of the relationship, we are looking for a roughly straight horizontal line. That seems to be the case for our model.
The second graph is used to check the normality of the residuals; in the ideal case the dots follow the straight dashed line. The points from our model follow the line very closely.
The “Scale-Location” graph is used to check the homogeneity of variance of the residuals. The ideal case is a horizontal line with points equally spread around it; in our case the line is not perfectly horizontal and goes down at the end.
No points fall beyond the Cook’s distance lines, which means there are no highly influential observations that should be removed from the model.
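The visual check of Cook’s distance can also be done numerically. The sketch below fits a model to simulated data and flags observations whose Cook’s distance exceeds 0.5; the data, the object names and the 0.5 cut-off (one of the contour levels commonly drawn on the “Residuals vs Leverage” plot) are assumptions for illustration only.

```r
# Simulated data and model; not the model4 from this report
set.seed(3)
n <- 200
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)
fit <- lm(y ~ x)

# Cook's distance for every observation
d <- cooks.distance(fit)

# Flag observations above an assumed cut-off of 0.5
influential <- which(d > 0.5)
print(max(d))
print(length(influential))
```

Applying `cooks.distance()` to model4 in the same way would confirm whether any observation crosses the chosen threshold.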
The questionnaire used by TIMSS proved to be a good methodological tool: the factors behind particular sets of variables were easily identified, and variables of the same group loaded on the same factor. One interesting finding was that the questions from the #18 group split into two groups and loaded on different factors. We suggested that factor MR2 was more connected to the teacher’s desire to engage and interact with students and give them feedback, while factor MR4 was more about the teacher’s skill at presenting information in a clear, easy-to-understand way.
Following the assumption that one of the reasons behind the success of students in Singapore is the high quality of their teachers’ skills, we tested whether the correlation with math scores was higher for the MR2 or the MR4 factor. The relationship between students’ math achievement and both teacher-skill factors was positive, but it was stronger for the MR4 factor. Linear models with these two factors as predictors were compared and, judging by the ANOVA output, the model based on the MR4 factor performed better.
The addition of the MR3 factor and gender as an interaction term significantly improved the model, and the interaction effect was statistically significant as well. The effect of MR3 on math achievement was stronger for boys, and we hypothesize that boys who assess their math skills as low feel that they cannot connect to the gender-math stereotype, get discouraged and perform worse than girls who reported the same self-assessment score. At higher self-assessment values, boys connect with the gender-math stereotype more, become more confident and perform as well as girls or even better.
Cvencek, D., Meltzoff, A. N., Kapur, M. (2014). Cognitive consistency and math–gender stereotypes in Singaporean children. Journal of Experimental Child Psychology, 117, 73-91.
Erturan, S., Jansen, B. (2015). An investigation of boys’ and girls’ emotional experience of math, their math performance, and the relation between these variables. European Journal of Psychology of Education, 30(4), 421-435.
Kaur, B. (2010). Towards Excellence in Mathematics Education–Singapore’s Experience. Procedia - Social and Behavioral Sciences, 8, 28-34.
Sclafani, S. (2015). Singapore chooses teachers carefully. The Phi Delta Kappan, 97(3), 8-13.
The Economist (2018). What other countries can learn from Singapore’s schools. https://www.economist.com/leaders/2018/08/30/what-other-countries-can-learn-from-singapores-schools