Aim of the project: The present study aims to determine what influences mathematics achivement among New Zealander pupils. In the following analysis, we seek to discover the exact factors of the named achievement, controlling also for the influence of gender, country of birth and parental education.
Research question: What factors influence a student’s achievement in mathematics?
library(foreign)
library(psych)
library(polycor)
library(car)
library(ggplot2)
library(ggpubr)
nzl <- data.frame(ndata$BSBM17A, ndata$BSBM17B, ndata$BSBM17C, ndata$BSBM17D, ndata$BSBM17E, ndata$BSBM17F, ndata$BSBM17G, ndata$BSBM17H, ndata$BSBM17I, ndata$BSBM18A, ndata$BSBM18B, ndata$BSBM18C, ndata$BSBM18D, ndata$BSBM18E, ndata$BSBM18F, ndata$BSBM18G, ndata$BSBM18H, ndata$BSBM18I, ndata$BSBM18J, ndata$BSBM19A, ndata$BSBM19B, ndata$BSBM19C, ndata$BSBM19D, ndata$BSBM19E, ndata$BSMMAT01, ndata$BSDGEDUP, ndata$ITSEX, ndata$BSBG10A)
names(nzl) <- c("BSBM17A", "BSBM17B", "BSBM17C", "BSBM17D", "BSBM17E", "BSBM17F", "BSBM17G", "BSBM17H", "BSBM17I", "BSBM18A", "BSBM18B", "BSBM18C", "BSBM18D", "BSBM18E", "BSBM18F", "BSBM18G", "BSBM18H", "BSBM18I", "BSBM18J", "BSBM19A", "BSBM19B", "BSBM19C", "BSBM19D", "BSBM19E", "BSMMAT01", "paredu", "gender", "country")
nzl <- subset(nzl, nzl$BSBM17A != "NA")
nzl <- subset(nzl, nzl$BSBM17B != "NA")
nzl <- subset(nzl, nzl$BSBM17C != "NA")
nzl <- subset(nzl, nzl$BSBM17D != "NA")
nzl <- subset(nzl, nzl$BSBM17E != "NA")
nzl <- subset(nzl, nzl$BSBM17F != "NA")
nzl <- subset(nzl, nzl$BSBM17G != "NA")
nzl <- subset(nzl, nzl$BSBM17H != "NA")
nzl <- subset(nzl, nzl$BSBM17I != "NA")
nzl <- subset(nzl, nzl$BSBM18A != "NA")
nzl <- subset(nzl, nzl$BSBM18B != "NA")
nzl <- subset(nzl, nzl$BSBM18C != "NA")
nzl <- subset(nzl, nzl$BSBM18D != "NA")
nzl <- subset(nzl, nzl$BSBM18E != "NA")
nzl <- subset(nzl, nzl$BSBM18F != "NA")
nzl <- subset(nzl, nzl$BSBM18G != "NA")
nzl <- subset(nzl, nzl$BSBM18H != "NA")
nzl <- subset(nzl, nzl$BSBM18I != "NA")
nzl <- subset(nzl, nzl$BSBM18J != "NA")
nzl <- subset(nzl, nzl$BSBM19A != "NA")
nzl <- subset(nzl, nzl$BSBM19B != "NA")
nzl <- subset(nzl, nzl$BSBM19C != "NA")
nzl <- subset(nzl, nzl$BSBM19D != "NA")
nzl <- subset(nzl, nzl$BSBM19E != "NA")
nzl <- subset(nzl, nzl$BSMMAT01 != "NA")
nzl <- subset(nzl, nzl$gender != "NA")
nzl <- subset(nzl, nzl$country != "NA")
nzl <- subset(nzl, nzl$paredu != "NA")
| Variable | Description and type |
|---|---|
| BSMMAT01 | math achievement - continuous |
| gender | gender of a pupil - categorical |
| country | whether one was born in New Zealand - binary |
| paredu | education level of parents - categorical |
| BSBM17A | enjoys learning mathematics - ordinal |
| BSBM17B | wish have not to study math - ordinal |
| BSBM17C | math is boring - ordinal |
| BSBM17D | learn interesting things - ordinal |
| BSBM17E | like mathematics - ordinal |
| BSBM17F | like numbers - ordinal |
| BSBM17G | like math problems - ordinal |
| BSBM17H | look forward to math classes - ordinal |
| BSBM17I | favourite subject - ordinal |
| BSBM18A | teacher expects to do - ordinal |
| BSBM18B | teacher is easy to undestand - ordinal |
| BSBM18C | interested in what teacher says - ordinal |
| BSBM18D | interesting things to do - ordinal |
| BSBM18E | teacher clear answers - ordinal |
| BSBM18F | teacher explains good - ordinal |
| BSBM18G | teacher shows learned - ordinal |
| BSBM18H | different things to help - ordinal |
| BSBM18I | tells how to do better - ordinal |
| BSBM18J | teacher listens - ordinal |
| BSBM19A | usually do well in math - ordinal |
| BSBM19B | mathematics is more difficult - ordinal |
| BSBM19C | mathematics is not my strength - ordinal |
| BSBM19D | learn quickly in mathematics - ordinal |
| BSBM19E | math makes nervous - ordinal |
describe(nzl, na.rm = TRUE)
vars n mean sd median trimmed mad min max range skew
BSBM17A* 1 7290 2.19 0.94 2.0 2.12 1.48 1 4 3 0.44
BSBM17B* 2 7290 2.83 1.04 3.0 2.91 1.48 1 4 3 -0.37
BSBM17C* 3 7290 2.44 0.96 2.0 2.42 1.48 1 4 3 0.13
BSBM17D* 4 7290 2.10 0.88 2.0 2.03 1.48 1 4 3 0.45
BSBM17E* 5 7290 2.29 0.97 2.0 2.23 1.48 1 4 3 0.32
BSBM17F* 6 7290 2.63 0.90 3.0 2.66 1.48 1 4 3 -0.17
BSBM17G* 7 7290 2.43 0.98 2.0 2.41 1.48 1 4 3 0.10
BSBM17H* 8 7290 2.69 0.96 3.0 2.73 1.48 1 4 3 -0.20
BSBM17I* 9 7290 2.77 1.10 3.0 2.83 1.48 1 4 3 -0.34
BSBM18A* 10 7290 1.66 0.71 2.0 1.55 1.48 1 4 3 0.92
BSBM18B* 11 7290 1.94 0.92 2.0 1.84 1.48 1 4 3 0.67
BSBM18C* 12 7290 2.15 0.90 2.0 2.09 1.48 1 4 3 0.37
BSBM18D* 13 7290 2.29 0.93 2.0 2.24 1.48 1 4 3 0.22
BSBM18E* 14 7290 2.00 0.93 2.0 1.91 1.48 1 4 3 0.58
BSBM18F* 15 7290 1.86 0.92 2.0 1.74 1.48 1 4 3 0.81
BSBM18G* 16 7290 2.15 0.90 2.0 2.08 1.48 1 4 3 0.38
BSBM18H* 17 7290 1.90 0.89 2.0 1.79 1.48 1 4 3 0.73
BSBM18I* 18 7290 1.84 0.88 2.0 1.73 1.48 1 4 3 0.82
BSBM18J* 19 7290 1.88 0.91 2.0 1.75 1.48 1 4 3 0.82
BSBM19A* 20 7290 2.02 0.83 2.0 1.94 1.48 1 4 3 0.60
BSBM19B* 21 7290 2.69 0.96 3.0 2.73 1.48 1 4 3 -0.21
BSBM19C* 22 7290 2.45 1.08 2.0 2.44 1.48 1 4 3 0.05
BSBM19D* 23 7290 2.29 0.89 2.0 2.24 1.48 1 4 3 0.19
BSBM19E* 24 7290 2.80 0.96 3.0 2.88 1.48 1 4 3 -0.32
BSMMAT01* 25 7290 4190.75 2342.11 4271.5 4213.60 2997.82 1 8127 8126 -0.07
paredu* 26 7290 4.16 2.10 6.0 4.32 0.00 1 6 5 -0.45
gender* 27 7290 1.46 0.50 1.0 1.45 0.00 1 2 1 0.17
country* 28 7290 1.16 0.37 1.0 1.08 0.00 1 2 1 1.84
kurtosis se
BSBM17A* -0.68 0.01
BSBM17B* -1.10 0.01
BSBM17C* -0.94 0.01
BSBM17D* -0.50 0.01
BSBM17E* -0.88 0.01
BSBM17F* -0.73 0.01
BSBM17G* -0.98 0.01
BSBM17H* -0.92 0.01
BSBM17I* -1.22 0.01
BSBM18A* 0.60 0.01
BSBM18B* -0.45 0.01
BSBM18C* -0.64 0.01
BSBM18D* -0.81 0.01
BSBM18E* -0.60 0.01
BSBM18F* -0.29 0.01
BSBM18G* -0.65 0.01
BSBM18H* -0.28 0.01
BSBM18I* -0.11 0.01
BSBM18J* -0.20 0.01
BSBM19A* -0.07 0.01
BSBM19B* -0.90 0.01
BSBM19C* -1.26 0.01
BSBM19D* -0.73 0.01
BSBM19E* -0.89 0.01
BSMMAT01* -1.19 27.43
paredu* -1.55 0.02
gender* -1.97 0.01
country* 1.38 0.00
All variables contain results, there are no empty variables. Similarly, no NA’s remained. Overall, there are 7290 observations.
summary(as.numeric(as.character(nzl$BSMMAT01)))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 197.3 444.8 510.0 505.5 568.0 752.6
Gender
summary(nzl$gender)
## Female Male
## 3946 3344
ggplot(nzl, aes(as.numeric(as.character(BSMMAT01)))) + geom_histogram(fill = "lightsteelblue1", color = "skyblue2") + labs(x = "Math achievement") + facet_grid(~gender) + theme_pubclean()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
In general, there are more girls than boys in the sampling, so it’s hard to assess their differences. Anyway, the results are similar for these two groups, and the distribution of achievement is skewed but slightly close to normality.
Parental education
summary(nzl$paredu)
## University or Higher
## 1443
## Post-secondary but not University
## 764
## Upper Secondary
## 790
## Lower Secondary
## 345
## Some Primary, Lower Secondary or No School
## 93
## Don't Know
## 3855
Noting in advance that there are too many cases of ‘Don’t know’; the data is thus rather biased.
## Let us remove don't know answers for the plot
parenteduc <- as.data.frame(nzl)
parenteduc <- subset(parenteduc, parenteduc$paredu != "Don't Know")
ggplot(parenteduc, aes(as.numeric(as.character(BSMMAT01)))) + geom_histogram(fill = "lightsteelblue1", color = "skyblue2") + labs(x = "Math achievement") + facet_grid(~paredu) + theme_pubclean()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
More or less similar trend can be found in distribution for all parental education levels (except Some Primary, Lower Secondary or No School) . But still the number of cases differ significantly across the categories of education of parents; cannot be compared.
Born in the country or outside
ggplot(nzl, aes(as.numeric(as.character(BSMMAT01)))) + geom_histogram(color = "skyblue2", fill = "lightsteelblue1") + labs(x = "Math achievement") + facet_grid(~country) + theme_pubclean()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
The majority of respondents were born in the country. However, the distribution of achievements is almost normal, a little left skewed.
Variables from BSBM17A to BSBM19E have four levels - “Disagree a lot”, Disagree a little“,”Agree a little" and “Agree a lot”.
#For instance,
summary(nzl$BSBM17A)
## Agree a lot Agree a little Disagree a little Disagree a lot
## 1826 3075 1539 850
All variables except BSMMAT01 contain non-numeric data, so we need to change them.
##
## 1 2 3 4
## Agree a lot 0 0 0 1826
## Agree a little 0 0 3075 0
## Disagree a little 0 1539 0 0
## Disagree a lot 850 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 979
## Agree a little 0 0 1756 0
## Disagree a little 0 2078 0 0
## Disagree a lot 2477 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1304
## Agree a little 0 0 2695 0
## Disagree a little 0 2102 0 0
## Disagree a lot 1189 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1947
## Agree a little 0 0 3196 0
## Disagree a little 0 1621 0 0
## Disagree a lot 526 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1687
## Agree a little 0 0 2853 0
## Disagree a little 0 1730 0 0
## Disagree a lot 1020 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 846
## Agree a little 0 0 2238 0
## Disagree a little 0 2959 0 0
## Disagree a lot 1247 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1404
## Agree a little 0 0 2532 0
## Disagree a little 0 2181 0 0
## Disagree a lot 1173 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 930
## Agree a little 0 0 2070 0
## Disagree a little 0 2647 0 0
## Disagree a lot 1643 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1296
## Agree a little 0 0 1546 0
## Disagree a little 0 2017 0 0
## Disagree a lot 2431 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 3406
## Agree a little 0 0 3117 0
## Disagree a little 0 631 0 0
## Disagree a lot 136 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 2777
## Agree a little 0 0 2690 0
## Disagree a little 0 1308 0 0
## Disagree a lot 515 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1883
## Agree a little 0 0 3025 0
## Disagree a little 0 1795 0 0
## Disagree a lot 587 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1583
## Agree a little 0 0 2804 0
## Disagree a little 0 2115 0 0
## Disagree a lot 788 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 2575
## Agree a little 0 0 2693 0
## Disagree a little 0 1441 0 0
## Disagree a lot 581 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 3169
## Agree a little 0 0 2477 0
## Disagree a little 0 1144 0 0
## Disagree a lot 500 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1893
## Agree a little 0 0 3012 0
## Disagree a little 0 1784 0 0
## Disagree a lot 601 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 2851
## Agree a little 0 0 2793 0
## Disagree a little 0 1193 0 0
## Disagree a lot 453 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 3083
## Agree a little 0 0 2718 0
## Disagree a little 0 1072 0 0
## Disagree a lot 417 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 3023
## Agree a little 0 0 2678 0
## Disagree a little 0 1064 0 0
## Disagree a lot 525 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 2009
## Agree a little 0 0 3594 0
## Disagree a little 0 1248 0 0
## Disagree a lot 439 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 925
## Agree a little 0 0 2057 0
## Disagree a little 0 2676 0 0
## Disagree a lot 1632 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1758
## Agree a little 0 0 2018 0
## Disagree a little 0 1956 0 0
## Disagree a lot 1558 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 1476
## Agree a little 0 0 2920 0
## Disagree a little 0 2212 0 0
## Disagree a lot 682 0 0 0
##
## 1 2 3 4
## Agree a lot 0 0 0 791
## Agree a little 0 0 1873 0
## Disagree a little 0 2614 0 0
## Disagree a lot 2012 0 0 0
nzld <- data.frame(nzl$BSBM17A1, nzl$BSBM17B1, nzl$BSBM17C1, nzl$BSBM17D1, nzl$BSBM17E1, nzl$BSBM17F1, nzl$BSBM17G1, nzl$BSBM17H1, nzl$BSBM17I1, nzl$BSBM18A1, nzl$BSBM18B1, nzl$BSBM18C1, nzl$BSBM18D1, nzl$BSBM18E1, nzl$BSBM18F1, nzl$BSBM18G1, nzl$BSBM18H1, nzl$BSBM18I1, nzl$BSBM18J1, nzl$BSBM19A1, nzl$BSBM19B1, nzl$BSBM19C1, nzl$BSBM19D1, nzl$BSBM19E1, nzl$BSMMAT01, nzl$paredu, nzl$gender, nzl$country)
names(nzld) <- c("BSBM17A1", "BSBM17B1", "BSBM17C1", "BSBM17D1", "BSBM17E1", "BSBM17F1", "BSBM17G1", "BSBM17H1", "BSBM17I1", "BSBM18A1", "BSBM18B1", "BSBM18C1", "BSBM18D1", "BSBM18E1", "BSBM18F1", "BSBM18G1", "BSBM18H1", "BSBM18I1", "BSBM18J1", "BSBM19A1", "BSBM19B1", "BSBM19C1", "BSBM19D1", "BSBM19E1", "BSMMAT01", "paredu", "gender", "country")
Special data frame for factor analysis:
nzldfa <- data.frame(nzl$BSBM17A1, nzl$BSBM17B1, nzl$BSBM17C1, nzl$BSBM17D1, nzl$BSBM17E1, nzl$BSBM17F1, nzl$BSBM17G1, nzl$BSBM17H1, nzl$BSBM17I1, nzl$BSBM18A1, nzl$BSBM18B1, nzl$BSBM18C1, nzl$BSBM18D1, nzl$BSBM18E1, nzl$BSBM18F1, nzl$BSBM18G1, nzl$BSBM18H1, nzl$BSBM18I1, nzl$BSBM18J1, nzl$BSBM19A1, nzl$BSBM19B1, nzl$BSBM19C1, nzl$BSBM19D1, nzl$BSBM19E1)
names(nzldfa) <- c("BSBM17A1", "BSBM17B1", "BSBM17C1", "BSBM17D1", "BSBM17E1", "BSBM17F1", "BSBM17G1", "BSBM17H1", "BSBM17I1", "BSBM18A1", "BSBM18B1", "BSBM18C1", "BSBM18D1", "BSBM18E1", "BSBM18F1", "BSBM18G1", "BSBM18H1", "BSBM18I1", "BSBM18J1", "BSBM19A1", "BSBM19B1", "BSBM19C1", "BSBM19D1", "BSBM19E1")
Checking whether all variales are numeric.
sapply(nzldfa, class)
## BSBM17A1 BSBM17B1 BSBM17C1 BSBM17D1 BSBM17E1 BSBM17F1 BSBM17G1 BSBM17H1
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## BSBM17I1 BSBM18A1 BSBM18B1 BSBM18C1 BSBM18D1 BSBM18E1 BSBM18F1 BSBM18G1
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## BSBM18H1 BSBM18I1 BSBM18J1 BSBM19A1 BSBM19B1 BSBM19C1 BSBM19D1 BSBM19E1
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
They are, so we can proceed further.
Getting correlations:
dat.cor <- hetcor(nzldfa)
dat.cor <- dat.cor$correlations
dat.cor
## BSBM17A1 BSBM17B1 BSBM17C1 BSBM17D1 BSBM17E1 BSBM17F1
## BSBM17A1 1.0000000 -0.5768786 -0.5655810 0.62834768 0.8323609 0.6577364
## BSBM17B1 -0.5768786 1.0000000 0.6158734 -0.43266234 -0.5896195 -0.4771128
## BSBM17C1 -0.5655810 0.6158734 1.0000000 -0.43504951 -0.5901481 -0.4974803
## BSBM17D1 0.6283477 -0.4326623 -0.4350495 1.00000000 0.6340212 0.5346317
## BSBM17E1 0.8323609 -0.5896195 -0.5901481 0.63402123 1.0000000 0.7051352
## BSBM17F1 0.6577364 -0.4771128 -0.4974803 0.53463175 0.7051352 1.0000000
## BSBM17G1 0.7029365 -0.5114426 -0.5080860 0.55586174 0.7438328 0.7492198
## BSBM17H1 0.7053954 -0.5004388 -0.5648015 0.57592856 0.7227450 0.6573554
## BSBM17I1 0.7137881 -0.5194981 -0.5652939 0.52938661 0.7476362 0.6590231
## BSBM18A1 0.3363467 -0.2328558 -0.2310628 0.36018515 0.3329031 0.2722640
## BSBM18B1 0.3746519 -0.2722510 -0.2699911 0.38349082 0.3569243 0.2877919
## BSBM18C1 0.5318730 -0.3969848 -0.4103481 0.54303880 0.5183505 0.4348276
## BSBM18D1 0.5019071 -0.3545114 -0.4035136 0.52746717 0.4887656 0.4200531
## BSBM18E1 0.3707794 -0.2754567 -0.2819333 0.40988309 0.3578290 0.2964591
## BSBM18F1 0.3589903 -0.2623412 -0.2669856 0.39848855 0.3498627 0.2732231
## BSBM18G1 0.3546839 -0.2485478 -0.2548801 0.38031003 0.3493751 0.2966898
## BSBM18H1 0.3533003 -0.2447316 -0.2742504 0.41714836 0.3366282 0.2738079
## BSBM18I1 0.3301120 -0.2314396 -0.2416543 0.39139863 0.3190449 0.2508407
## BSBM18J1 0.3373492 -0.2614111 -0.2608939 0.38161483 0.3266244 0.2615856
## BSBM19A1 0.5236116 -0.3786238 -0.3447032 0.36535013 0.5399734 0.4798487
## BSBM19B1 -0.2882632 0.3131022 0.2807667 -0.14560072 -0.3106195 -0.2747911
## BSBM19C1 -0.4269891 0.4209089 0.3797490 -0.24700572 -0.4521639 -0.4142184
## BSBM19D1 0.5329095 -0.3856843 -0.3548225 0.38459148 0.5494962 0.5033834
## BSBM19E1 -0.2066615 0.2427975 0.2208979 -0.08275696 -0.2174522 -0.1875799
## BSBM17G1 BSBM17H1 BSBM17I1 BSBM18A1 BSBM18B1 BSBM18C1
## BSBM17A1 0.7029365 0.7053954 0.7137881 0.33634674 0.3746519 0.53187304
## BSBM17B1 -0.5114426 -0.5004388 -0.5194981 -0.23285583 -0.2722510 -0.39698479
## BSBM17C1 -0.5080860 -0.5648015 -0.5652939 -0.23106281 -0.2699911 -0.41034806
## BSBM17D1 0.5558617 0.5759286 0.5293866 0.36018515 0.3834908 0.54303880
## BSBM17E1 0.7438328 0.7227450 0.7476362 0.33290307 0.3569243 0.51835047
## BSBM17F1 0.7492198 0.6573554 0.6590231 0.27226399 0.2877919 0.43482759
## BSBM17G1 1.0000000 0.6707390 0.6695092 0.29207641 0.2940582 0.45755313
## BSBM17H1 0.6707390 1.0000000 0.7524251 0.33600329 0.3911449 0.54875669
## BSBM17I1 0.6695092 0.7524251 1.0000000 0.28088590 0.3289157 0.45875566
## BSBM18A1 0.2920764 0.3360033 0.2808859 1.00000000 0.5748823 0.49294712
## BSBM18B1 0.2940582 0.3911449 0.3289157 0.57488229 1.0000000 0.62727427
## BSBM18C1 0.4575531 0.5487567 0.4587557 0.49294712 0.6272743 1.00000000
## BSBM18D1 0.4305047 0.5403949 0.4550318 0.46144318 0.5822664 0.72089200
## BSBM18E1 0.3043997 0.3975143 0.3330995 0.50944991 0.7150788 0.60465410
## BSBM18F1 0.2833863 0.3782298 0.3131238 0.49297448 0.7360868 0.58119967
## BSBM18G1 0.3045929 0.3864228 0.3177633 0.44607371 0.5228219 0.52357457
## BSBM18H1 0.2787567 0.3857662 0.3107298 0.45646831 0.5636863 0.52498541
## BSBM18I1 0.2612306 0.3446488 0.2837535 0.48483722 0.5842734 0.52307537
## BSBM18J1 0.2734775 0.3652734 0.2932462 0.47911980 0.6064992 0.54430381
## BSBM19A1 0.5272103 0.4412476 0.4929663 0.28231396 0.3016033 0.32385135
## BSBM19B1 -0.3129336 -0.2252724 -0.3207609 -0.10071192 -0.1403170 -0.12848220
## BSBM19C1 -0.4558835 -0.3484632 -0.4870410 -0.14830497 -0.1875101 -0.20933838
## BSBM19D1 0.5504441 0.4542709 0.5150073 0.28191586 0.3096618 0.32978748
## BSBM19E1 -0.2022902 -0.1726701 -0.2435620 -0.09137876 -0.1266766 -0.07749327
## BSBM18D1 BSBM18E1 BSBM18F1 BSBM18G1 BSBM18H1 BSBM18I1
## BSBM17A1 0.50190710 0.3707794 0.35899028 0.35468388 0.35330027 0.33011196
## BSBM17B1 -0.35451145 -0.2754567 -0.26234118 -0.24854785 -0.24473156 -0.23143962
## BSBM17C1 -0.40351355 -0.2819333 -0.26698557 -0.25488009 -0.27425041 -0.24165425
## BSBM17D1 0.52746717 0.4098831 0.39848855 0.38031003 0.41714836 0.39139863
## BSBM17E1 0.48876561 0.3578290 0.34986270 0.34937507 0.33662825 0.31904492
## BSBM17F1 0.42005308 0.2964591 0.27322313 0.29668984 0.27380793 0.25084071
## BSBM17G1 0.43050466 0.3043997 0.28338629 0.30459285 0.27875668 0.26123058
## BSBM17H1 0.54039490 0.3975143 0.37822977 0.38642283 0.38576615 0.34464885
## BSBM17I1 0.45503181 0.3330995 0.31312376 0.31776332 0.31072983 0.28375347
## BSBM18A1 0.46144318 0.5094499 0.49297448 0.44607371 0.45646831 0.48483722
## BSBM18B1 0.58226643 0.7150788 0.73608679 0.52282187 0.56368633 0.58427341
## BSBM18C1 0.72089200 0.6046541 0.58119967 0.52357457 0.52498541 0.52307537
## BSBM18D1 1.00000000 0.6149318 0.57286780 0.55786548 0.60357276 0.54771173
## BSBM18E1 0.61493178 1.0000000 0.76576403 0.56896006 0.60517071 0.62876994
## BSBM18F1 0.57286780 0.7657640 1.00000000 0.56667565 0.62260931 0.63453985
## BSBM18G1 0.55786548 0.5689601 0.56667565 1.00000000 0.59988532 0.58241733
## BSBM18H1 0.60357276 0.6051707 0.62260931 0.59988532 1.00000000 0.65809505
## BSBM18I1 0.54771173 0.6287699 0.63453985 0.58241733 0.65809505 1.00000000
## BSBM18J1 0.54135589 0.6343709 0.62067304 0.55935865 0.59049838 0.66680277
## BSBM19A1 0.30510304 0.2688822 0.25062297 0.23569958 0.20480309 0.22471300
## BSBM19B1 -0.09853654 -0.1146815 -0.10124379 -0.07872459 -0.04968707 -0.06754569
## BSBM19C1 -0.19474441 -0.1666128 -0.15273425 -0.12267112 -0.09309949 -0.11485944
## BSBM19D1 0.31829632 0.2916006 0.28047162 0.27629340 0.23601229 0.24453325
## BSBM19E1 -0.06507009 -0.1029988 -0.08884738 -0.04894775 -0.02844946 -0.05621497
## BSBM18J1 BSBM19A1 BSBM19B1 BSBM19C1 BSBM19D1 BSBM19E1
## BSBM17A1 0.33734922 0.5236116 -0.28826323 -0.42698905 0.5329095 -0.20666147
## BSBM17B1 -0.26141114 -0.3786238 0.31310220 0.42090892 -0.3856843 0.24279751
## BSBM17C1 -0.26089394 -0.3447032 0.28076670 0.37974896 -0.3548225 0.22089787
## BSBM17D1 0.38161483 0.3653501 -0.14560072 -0.24700572 0.3845915 -0.08275696
## BSBM17E1 0.32662437 0.5399734 -0.31061953 -0.45216385 0.5494962 -0.21745219
## BSBM17F1 0.26158556 0.4798487 -0.27479110 -0.41421841 0.5033834 -0.18757992
## BSBM17G1 0.27347749 0.5272103 -0.31293359 -0.45588352 0.5504441 -0.20229021
## BSBM17H1 0.36527343 0.4412476 -0.22527236 -0.34846323 0.4542709 -0.17267014
## BSBM17I1 0.29324619 0.4929663 -0.32076087 -0.48704098 0.5150073 -0.24356201
## BSBM18A1 0.47911980 0.2823140 -0.10071192 -0.14830497 0.2819159 -0.09137876
## BSBM18B1 0.60649920 0.3016033 -0.14031704 -0.18751012 0.3096618 -0.12667656
## BSBM18C1 0.54430381 0.3238514 -0.12848220 -0.20933838 0.3297875 -0.07749327
## BSBM18D1 0.54135589 0.3051030 -0.09853654 -0.19474441 0.3182963 -0.06507009
## BSBM18E1 0.63437094 0.2688822 -0.11468152 -0.16661284 0.2916006 -0.10299880
## BSBM18F1 0.62067304 0.2506230 -0.10124379 -0.15273425 0.2804716 -0.08884738
## BSBM18G1 0.55935865 0.2356996 -0.07872459 -0.12267112 0.2762934 -0.04894775
## BSBM18H1 0.59049838 0.2048031 -0.04968707 -0.09309949 0.2360123 -0.02844946
## BSBM18I1 0.66680277 0.2247130 -0.06754569 -0.11485944 0.2445332 -0.05621497
## BSBM18J1 1.00000000 0.2433607 -0.08253412 -0.12916724 0.2460654 -0.08517851
## BSBM19A1 0.24336071 1.0000000 -0.45243917 -0.53607512 0.6505018 -0.30194684
## BSBM19B1 -0.08253412 -0.4524392 1.00000000 0.63602971 -0.4484224 0.44065611
## BSBM19C1 -0.12916724 -0.5360751 0.63602971 1.00000000 -0.5180123 0.42977445
## BSBM19D1 0.24606539 0.6505018 -0.44842244 -0.51801227 1.0000000 -0.27456968
## BSBM19E1 -0.08517851 -0.3019468 0.44065611 0.42977445 -0.2745697 1.00000000
The result is rather immense and messy, so let’s make it look it a bit clearer.
mcor <- round(cor(dat.cor),2)
upper <- mcor
upper[upper.tri(mcor)] <- ""
upper <- as.data.frame(upper)
upper
## BSBM17A1 BSBM17B1 BSBM17C1 BSBM17D1 BSBM17E1 BSBM17F1 BSBM17G1
## BSBM17A1 1
## BSBM17B1 -0.97 1
## BSBM17C1 -0.97 0.96 1
## BSBM17D1 0.94 -0.94 -0.94 1
## BSBM17E1 0.99 -0.97 -0.97 0.94 1
## BSBM17F1 0.96 -0.94 -0.94 0.9 0.97 1
## BSBM17G1 0.97 -0.95 -0.94 0.91 0.98 0.98 1
## BSBM17H1 0.97 -0.96 -0.97 0.93 0.98 0.96 0.96
## BSBM17I1 0.98 -0.95 -0.96 0.91 0.98 0.97 0.97
## BSBM18A1 0.66 -0.71 -0.71 0.71 0.65 0.6 0.61
## BSBM18B1 0.66 -0.71 -0.71 0.72 0.64 0.58 0.59
## BSBM18C1 0.84 -0.87 -0.87 0.88 0.82 0.78 0.78
## BSBM18D1 0.81 -0.85 -0.86 0.86 0.8 0.75 0.76
## BSBM18E1 0.65 -0.71 -0.71 0.72 0.63 0.58 0.58
## BSBM18F1 0.62 -0.69 -0.68 0.7 0.61 0.55 0.55
## BSBM18G1 0.66 -0.71 -0.71 0.72 0.64 0.59 0.6
## BSBM18H1 0.62 -0.68 -0.69 0.71 0.6 0.54 0.55
## BSBM18I1 0.59 -0.65 -0.65 0.67 0.57 0.51 0.52
## BSBM18J1 0.62 -0.68 -0.68 0.69 0.6 0.54 0.55
## BSBM19A1 0.9 -0.87 -0.85 0.81 0.9 0.89 0.91
## BSBM19B1 -0.84 0.79 0.77 -0.76 -0.85 -0.84 -0.86
## BSBM19C1 -0.91 0.86 0.84 -0.83 -0.91 -0.91 -0.92
## BSBM19D1 0.9 -0.88 -0.86 0.82 0.91 0.9 0.91
## BSBM19E1 -0.78 0.73 0.71 -0.71 -0.78 -0.77 -0.78
## BSBM17H1 BSBM17I1 BSBM18A1 BSBM18B1 BSBM18C1 BSBM18D1 BSBM18E1
## BSBM17A1
## BSBM17B1
## BSBM17C1
## BSBM17D1
## BSBM17E1
## BSBM17F1
## BSBM17G1
## BSBM17H1 1
## BSBM17I1 0.98 1
## BSBM18A1 0.67 0.63 1
## BSBM18B1 0.68 0.63 0.9 1
## BSBM18C1 0.86 0.81 0.85 0.9 1
## BSBM18D1 0.84 0.79 0.84 0.9 0.97 1
## BSBM18E1 0.67 0.62 0.87 0.97 0.9 0.9 1
## BSBM18F1 0.65 0.6 0.86 0.97 0.88 0.89 0.98
## BSBM18G1 0.68 0.63 0.84 0.89 0.87 0.89 0.91
## BSBM18H1 0.65 0.59 0.83 0.9 0.86 0.89 0.92
## BSBM18I1 0.61 0.56 0.85 0.91 0.85 0.86 0.93
## BSBM18J1 0.64 0.59 0.86 0.93 0.87 0.87 0.94
## BSBM19A1 0.86 0.9 0.62 0.61 0.72 0.69 0.58
## BSBM19B1 -0.81 -0.85 -0.61 -0.62 -0.69 -0.67 -0.59
## BSBM19C1 -0.88 -0.92 -0.63 -0.63 -0.74 -0.72 -0.61
## BSBM19D1 0.87 0.9 0.62 0.62 0.73 0.7 0.59
## BSBM19E1 -0.76 -0.79 -0.62 -0.63 -0.67 -0.65 -0.6
## BSBM18F1 BSBM18G1 BSBM18H1 BSBM18I1 BSBM18J1 BSBM19A1 BSBM19B1
## BSBM17A1
## BSBM17B1
## BSBM17C1
## BSBM17D1
## BSBM17E1
## BSBM17F1
## BSBM17G1
## BSBM17H1
## BSBM17I1
## BSBM18A1
## BSBM18B1
## BSBM18C1
## BSBM18D1
## BSBM18E1
## BSBM18F1 1
## BSBM18G1 0.91 1
## BSBM18H1 0.93 0.92 1
## BSBM18I1 0.94 0.91 0.94 1
## BSBM18J1 0.94 0.91 0.92 0.95 1
## BSBM19A1 0.55 0.57 0.52 0.51 0.54 1
## BSBM19B1 -0.57 -0.58 -0.52 -0.52 -0.55 -0.94 1
## BSBM19C1 -0.58 -0.6 -0.54 -0.54 -0.57 -0.96 0.95
## BSBM19D1 0.57 0.59 0.53 0.53 0.55 0.96 -0.94
## BSBM19E1 -0.58 -0.57 -0.53 -0.54 -0.57 -0.85 0.85
## BSBM19C1 BSBM19D1 BSBM19E1
## BSBM17A1
## BSBM17B1
## BSBM17C1
## BSBM17D1
## BSBM17E1
## BSBM17F1
## BSBM17G1
## BSBM17H1
## BSBM17I1
## BSBM18A1
## BSBM18B1
## BSBM18C1
## BSBM18D1
## BSBM18E1
## BSBM18F1
## BSBM18G1
## BSBM18H1
## BSBM18I1
## BSBM18J1
## BSBM19A1
## BSBM19B1
## BSBM19C1 1
## BSBM19D1 -0.96 1
## BSBM19E1 0.85 -0.84 1
library(knitr)
library(kableExtra)
upper %>%
kable() %>%
kable_styling()
| BSBM17A1 | BSBM17B1 | BSBM17C1 | BSBM17D1 | BSBM17E1 | BSBM17F1 | BSBM17G1 | BSBM17H1 | BSBM17I1 | BSBM18A1 | BSBM18B1 | BSBM18C1 | BSBM18D1 | BSBM18E1 | BSBM18F1 | BSBM18G1 | BSBM18H1 | BSBM18I1 | BSBM18J1 | BSBM19A1 | BSBM19B1 | BSBM19C1 | BSBM19D1 | BSBM19E1 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BSBM17A1 | 1 | |||||||||||||||||||||||
| BSBM17B1 | -0.97 | 1 | ||||||||||||||||||||||
| BSBM17C1 | -0.97 | 0.96 | 1 | |||||||||||||||||||||
| BSBM17D1 | 0.94 | -0.94 | -0.94 | 1 | ||||||||||||||||||||
| BSBM17E1 | 0.99 | -0.97 | -0.97 | 0.94 | 1 | |||||||||||||||||||
| BSBM17F1 | 0.96 | -0.94 | -0.94 | 0.9 | 0.97 | 1 | ||||||||||||||||||
| BSBM17G1 | 0.97 | -0.95 | -0.94 | 0.91 | 0.98 | 0.98 | 1 | |||||||||||||||||
| BSBM17H1 | 0.97 | -0.96 | -0.97 | 0.93 | 0.98 | 0.96 | 0.96 | 1 | ||||||||||||||||
| BSBM17I1 | 0.98 | -0.95 | -0.96 | 0.91 | 0.98 | 0.97 | 0.97 | 0.98 | 1 | |||||||||||||||
| BSBM18A1 | 0.66 | -0.71 | -0.71 | 0.71 | 0.65 | 0.6 | 0.61 | 0.67 | 0.63 | 1 | ||||||||||||||
| BSBM18B1 | 0.66 | -0.71 | -0.71 | 0.72 | 0.64 | 0.58 | 0.59 | 0.68 | 0.63 | 0.9 | 1 | |||||||||||||
| BSBM18C1 | 0.84 | -0.87 | -0.87 | 0.88 | 0.82 | 0.78 | 0.78 | 0.86 | 0.81 | 0.85 | 0.9 | 1 | ||||||||||||
| BSBM18D1 | 0.81 | -0.85 | -0.86 | 0.86 | 0.8 | 0.75 | 0.76 | 0.84 | 0.79 | 0.84 | 0.9 | 0.97 | 1 | |||||||||||
| BSBM18E1 | 0.65 | -0.71 | -0.71 | 0.72 | 0.63 | 0.58 | 0.58 | 0.67 | 0.62 | 0.87 | 0.97 | 0.9 | 0.9 | 1 | ||||||||||
| BSBM18F1 | 0.62 | -0.69 | -0.68 | 0.7 | 0.61 | 0.55 | 0.55 | 0.65 | 0.6 | 0.86 | 0.97 | 0.88 | 0.89 | 0.98 | 1 | |||||||||
| BSBM18G1 | 0.66 | -0.71 | -0.71 | 0.72 | 0.64 | 0.59 | 0.6 | 0.68 | 0.63 | 0.84 | 0.89 | 0.87 | 0.89 | 0.91 | 0.91 | 1 | ||||||||
| BSBM18H1 | 0.62 | -0.68 | -0.69 | 0.71 | 0.6 | 0.54 | 0.55 | 0.65 | 0.59 | 0.83 | 0.9 | 0.86 | 0.89 | 0.92 | 0.93 | 0.92 | 1 | |||||||
| BSBM18I1 | 0.59 | -0.65 | -0.65 | 0.67 | 0.57 | 0.51 | 0.52 | 0.61 | 0.56 | 0.85 | 0.91 | 0.85 | 0.86 | 0.93 | 0.94 | 0.91 | 0.94 | 1 | ||||||
| BSBM18J1 | 0.62 | -0.68 | -0.68 | 0.69 | 0.6 | 0.54 | 0.55 | 0.64 | 0.59 | 0.86 | 0.93 | 0.87 | 0.87 | 0.94 | 0.94 | 0.91 | 0.92 | 0.95 | 1 | |||||
| BSBM19A1 | 0.9 | -0.87 | -0.85 | 0.81 | 0.9 | 0.89 | 0.91 | 0.86 | 0.9 | 0.62 | 0.61 | 0.72 | 0.69 | 0.58 | 0.55 | 0.57 | 0.52 | 0.51 | 0.54 | 1 | ||||
| BSBM19B1 | -0.84 | 0.79 | 0.77 | -0.76 | -0.85 | -0.84 | -0.86 | -0.81 | -0.85 | -0.61 | -0.62 | -0.69 | -0.67 | -0.59 | -0.57 | -0.58 | -0.52 | -0.52 | -0.55 | -0.94 | 1 | |||
| BSBM19C1 | -0.91 | 0.86 | 0.84 | -0.83 | -0.91 | -0.91 | -0.92 | -0.88 | -0.92 | -0.63 | -0.63 | -0.74 | -0.72 | -0.61 | -0.58 | -0.6 | -0.54 | -0.54 | -0.57 | -0.96 | 0.95 | 1 | ||
| BSBM19D1 | 0.9 | -0.88 | -0.86 | 0.82 | 0.91 | 0.9 | 0.91 | 0.87 | 0.9 | 0.62 | 0.62 | 0.73 | 0.7 | 0.59 | 0.57 | 0.59 | 0.53 | 0.53 | 0.55 | 0.96 | -0.94 | -0.96 | 1 | |
| BSBM19E1 | -0.78 | 0.73 | 0.71 | -0.71 | -0.78 | -0.77 | -0.78 | -0.76 | -0.79 | -0.62 | -0.63 | -0.67 | -0.65 | -0.6 | -0.58 | -0.57 | -0.53 | -0.54 | -0.57 | -0.85 | 0.85 | 0.85 | -0.84 | 1 |
Altogether, it seen that all variables are actually highly correlated, both positively and negatively (>= 0.5). Since we got high correlations, can procees to the next step.
Defining the number of factors:
fa.parallel(nzldfa, fa="both", n.iter=100)
## Parallel analysis suggests that the number of factors = 4 and the number of components = 3
R suggests to remain 4 factors in the analyis.
Let’s try to rotate the diagrams to get the best result.
No rotation:
fa(nzldfa, nfactors = 4, rotate = "none", fm = "ml")
## Factor Analysis using method = ml
## Call: fa(r = nzldfa, nfactors = 4, rotate = "none", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 ML3 ML4 h2 u2 com
## BSBM17A1 0.82 -0.26 0.16 0.01 0.77 0.23 1.3
## BSBM17B1 -0.62 0.24 0.00 0.32 0.54 0.46 1.8
## BSBM17C1 -0.63 0.22 -0.08 0.37 0.59 0.41 1.9
## BSBM17D1 0.70 -0.02 0.21 0.04 0.53 0.47 1.2
## BSBM17E1 0.84 -0.31 0.16 0.02 0.82 0.18 1.3
## BSBM17F1 0.72 -0.31 0.13 0.11 0.65 0.35 1.5
## BSBM17G1 0.76 -0.33 0.10 0.13 0.71 0.29 1.5
## BSBM17H1 0.79 -0.17 0.21 -0.03 0.69 0.31 1.2
## BSBM17I1 0.77 -0.30 0.09 -0.04 0.69 0.31 1.3
## BSBM18A1 0.53 0.34 -0.07 0.08 0.40 0.60 1.8
## BSBM18B1 0.64 0.50 -0.15 0.01 0.68 0.32 2.0
## BSBM18C1 0.73 0.29 0.09 -0.05 0.62 0.38 1.4
## BSBM18D1 0.71 0.32 0.11 -0.05 0.62 0.38 1.5
## BSBM18E1 0.65 0.54 -0.12 -0.01 0.72 0.28 2.0
## BSBM18F1 0.63 0.55 -0.12 -0.01 0.71 0.29 2.0
## BSBM18G1 0.57 0.41 0.00 0.04 0.50 0.50 1.8
## BSBM18H1 0.58 0.48 0.02 -0.01 0.58 0.42 1.9
## BSBM18I1 0.57 0.51 -0.05 0.02 0.60 0.40 2.0
## BSBM18J1 0.58 0.49 -0.07 -0.01 0.58 0.42 2.0
## BSBM19A1 0.60 -0.27 -0.32 0.24 0.59 0.41 2.4
## BSBM19B1 -0.36 0.34 0.58 0.10 0.59 0.41 2.4
## BSBM19C1 -0.50 0.40 0.50 0.09 0.67 0.33 3.0
## BSBM19D1 0.62 -0.26 -0.30 0.25 0.60 0.40 2.2
## BSBM19E1 -0.26 0.22 0.41 0.15 0.30 0.70 2.6
##
## ML1 ML2 ML3 ML4
## SS loadings 10.00 3.09 1.21 0.45
## Proportion Var 0.42 0.13 0.05 0.02
## Cumulative Var 0.42 0.55 0.60 0.61
## Proportion Explained 0.68 0.21 0.08 0.03
## Cumulative Proportion 0.68 0.89 0.97 1.00
##
## Mean item complexity = 1.8
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 16.69 with Chi Square of 121470.5
## The degrees of freedom for the model are 186 and the objective function was 0.74
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic number of observations is 7290 with the empirical chi square 1526.99 with prob < 4e-209
## The total number of observations was 7290 with Likelihood Chi Square = 5395.06 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.936
## RMSEA index = 0.062 and the 90 % confidence intervals are 0.061 0.063
## BIC = 3740.72
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML1 ML2 ML3 ML4
## Correlation of (regression) scores with factors 0.98 0.95 0.87 0.72
## Multiple R square of scores with factors 0.97 0.90 0.76 0.52
## Minimum correlation of possible factor scores 0.94 0.80 0.52 0.03
fa.diagram(fa(nzldfa, nfactors = 4, rotate = "none", fm = "ml"))
If we look at indexes, we may see that proportion variance is above 0.1 just for 2 variables, while proportion explained differs a lot among variables - not okay. One factor took the majority of the variables. Then, RMSR is 0.02 - not bad, quite close to 0; RMSEA is 0.062 - not bad but nor good, less than 0.05 is excellent; Tucker Lewis Index is 0.936 which is perfect.
Varimax:
fa(nzldfa, nfactors = 4, rotate = "varimax", fm = "ml")
## Factor Analysis using method = ml
## Call: fa(r = nzldfa, nfactors = 4, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 ML4 h2 u2 com
## BSBM17A1 0.27 0.80 0.21 -0.01 0.77 0.23 1.4
## BSBM17B1 -0.18 -0.56 -0.31 0.31 0.54 0.46 2.5
## BSBM17C1 -0.19 -0.59 -0.25 0.38 0.59 0.41 2.4
## BSBM17D1 0.38 0.62 0.03 0.00 0.53 0.47 1.7
## BSBM17E1 0.24 0.84 0.23 0.00 0.82 0.18 1.3
## BSBM17F1 0.18 0.75 0.21 0.10 0.65 0.35 1.3
## BSBM17G1 0.19 0.77 0.25 0.12 0.71 0.29 1.4
## BSBM17H1 0.32 0.76 0.12 -0.06 0.69 0.31 1.4
## BSBM17I1 0.22 0.75 0.27 -0.05 0.69 0.31 1.5
## BSBM18A1 0.59 0.19 0.09 0.07 0.40 0.60 1.3
## BSBM18B1 0.80 0.15 0.15 0.01 0.68 0.32 1.1
## BSBM18C1 0.66 0.43 0.04 -0.08 0.62 0.38 1.8
## BSBM18D1 0.67 0.41 0.01 -0.08 0.62 0.38 1.7
## BSBM18E1 0.83 0.15 0.11 -0.02 0.72 0.28 1.1
## BSBM18F1 0.83 0.13 0.10 -0.01 0.71 0.29 1.1
## BSBM18G1 0.67 0.22 0.02 0.02 0.50 0.50 1.2
## BSBM18H1 0.73 0.20 -0.02 -0.03 0.58 0.42 1.2
## BSBM18I1 0.76 0.14 0.03 0.00 0.60 0.40 1.1
## BSBM18J1 0.74 0.15 0.06 -0.02 0.58 0.42 1.1
## BSBM19A1 0.19 0.44 0.53 0.28 0.59 0.41 2.8
## BSBM19B1 -0.02 -0.16 -0.75 0.02 0.59 0.41 1.1
## BSBM19C1 -0.05 -0.33 -0.74 0.02 0.67 0.33 1.4
## BSBM19D1 0.21 0.46 0.51 0.29 0.60 0.40 3.0
## BSBM19E1 -0.03 -0.10 -0.53 0.09 0.30 0.70 1.1
##
## ML2 ML1 ML3 ML4
## SS loadings 6.00 5.83 2.46 0.46
## Proportion Var 0.25 0.24 0.10 0.02
## Cumulative Var 0.25 0.49 0.60 0.61
## Proportion Explained 0.41 0.40 0.17 0.03
## Cumulative Proportion 0.41 0.80 0.97 1.00
##
## Mean item complexity = 1.5
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 16.69 with Chi Square of 121470.5
## The degrees of freedom for the model are 186 and the objective function was 0.74
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic number of observations is 7290 with the empirical chi square 1526.99 with prob < 4e-209
## The total number of observations was 7290 with Likelihood Chi Square = 5395.06 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.936
## RMSEA index = 0.062 and the 90 % confidence intervals are 0.061 0.063
## BIC = 3740.72
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML2 ML1 ML3 ML4
## Correlation of (regression) scores with factors 0.96 0.95 0.89 0.72
## Multiple R square of scores with factors 0.92 0.91 0.79 0.52
## Minimum correlation of possible factor scores 0.84 0.81 0.59 0.04
fa.diagram(fa(nzldfa, nfactors = 4, rotate = "varimax", fm = "ml"))
RMSR, RMSEA and Tucker Lewis Index remained the same. Proportion variance is now above .1 for three variables, proportion explained still differs quite a lot.
Altogether, rather good result; variables seem to be assigned to factors logically (all 17, 18, 19 to different factors correspondingly).
Oblimin:
fa(nzldfa, nfactors = 4, rotate = "oblimin", fm = "ml")
## Loading required namespace: GPArotation
## Factor Analysis using method = ml
## Call: fa(r = nzldfa, nfactors = 4, rotate = "oblimin", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 ML4 h2 u2 com
## BSBM17A1 0.04 0.84 -0.02 -0.04 0.77 0.23 1.0
## BSBM17B1 -0.05 -0.43 0.26 0.34 0.54 0.46 2.6
## BSBM17C1 -0.04 -0.47 0.20 0.41 0.59 0.41 2.4
## BSBM17D1 0.21 0.65 0.14 -0.02 0.53 0.47 1.3
## BSBM17E1 0.00 0.89 -0.03 -0.03 0.82 0.18 1.0
## BSBM17F1 -0.04 0.83 -0.01 0.07 0.65 0.35 1.0
## BSBM17G1 -0.05 0.85 -0.04 0.09 0.71 0.29 1.0
## BSBM17H1 0.11 0.78 0.06 -0.09 0.69 0.31 1.1
## BSBM17I1 0.00 0.76 -0.11 -0.08 0.69 0.31 1.1
## BSBM18A1 0.59 0.08 -0.01 0.08 0.40 0.60 1.1
## BSBM18B1 0.84 -0.06 -0.09 0.03 0.68 0.32 1.0
## BSBM18C1 0.58 0.32 0.07 -0.08 0.62 0.38 1.6
## BSBM18D1 0.60 0.30 0.10 -0.08 0.62 0.38 1.6
## BSBM18E1 0.87 -0.06 -0.05 0.00 0.72 0.28 1.0
## BSBM18F1 0.88 -0.08 -0.05 0.00 0.71 0.29 1.0
## BSBM18G1 0.67 0.10 0.06 0.03 0.50 0.50 1.1
## BSBM18H1 0.74 0.05 0.09 -0.02 0.58 0.42 1.0
## BSBM18I1 0.79 -0.03 0.03 0.02 0.60 0.40 1.0
## BSBM18J1 0.77 -0.03 -0.01 -0.01 0.58 0.42 1.0
## BSBM19A1 0.09 0.40 -0.41 0.28 0.59 0.41 2.9
## BSBM19B1 -0.02 0.07 0.80 0.01 0.59 0.41 1.0
## BSBM19C1 0.01 -0.14 0.74 0.02 0.67 0.33 1.1
## BSBM19D1 0.10 0.43 -0.38 0.29 0.60 0.40 2.9
## BSBM19E1 -0.04 0.09 0.58 0.09 0.30 0.70 1.1
##
## ML2 ML1 ML3 ML4
## SS loadings 5.88 6.07 2.27 0.53
## Proportion Var 0.25 0.25 0.09 0.02
## Cumulative Var 0.25 0.50 0.59 0.61
## Proportion Explained 0.40 0.41 0.15 0.04
## Cumulative Proportion 0.40 0.81 0.96 1.00
##
## With factor correlations of
## ML2 ML1 ML3 ML4
## ML2 1.00 0.50 -0.16 -0.10
## ML1 0.50 1.00 -0.48 -0.15
## ML3 -0.16 -0.48 1.00 -0.06
## ML4 -0.10 -0.15 -0.06 1.00
##
## Mean item complexity = 1.4
## Test of the hypothesis that 4 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 16.69 with Chi Square of 121470.5
## The degrees of freedom for the model are 186 and the objective function was 0.74
##
## The root mean square of the residuals (RMSR) is 0.02
## The df corrected root mean square of the residuals is 0.02
##
## The harmonic number of observations is 7290 with the empirical chi square 1526.99 with prob < 4e-209
## The total number of observations was 7290 with Likelihood Chi Square = 5395.06 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.936
## RMSEA index = 0.062 and the 90 % confidence intervals are 0.061 0.063
## BIC = 3740.72
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML2 ML1 ML3 ML4
## Correlation of (regression) scores with factors 0.97 0.98 0.92 0.75
## Multiple R square of scores with factors 0.94 0.95 0.84 0.56
## Minimum correlation of possible factor scores 0.88 0.90 0.68 0.11
fa.diagram(fa(nzldfa, nfactors = 4, rotate = "oblimin", fm = "ml"))
RMSR, RMSEA and Tucker Lewis Index are again the same. However, the results of proportions became worse - proportion var is above .1 just for two factors, proportion explained almost the seame but in the previous case the values seem to differ less.
So, evidently just three factors should be remained. However, the indexes for oblimin and varimax rotation are quite similar, thus we need to look at them one more time, but with three factors left.
fa(nzldfa, nfactors = 3, rotate = "varimax", fm = "ml")
## Factor Analysis using method = ml
## Call: fa(r = nzldfa, nfactors = 3, rotate = "varimax", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML2 ML1 ML3 h2 u2 com
## BSBM17A1 0.26 0.81 0.20 0.77 0.23 1.3
## BSBM17B1 -0.19 -0.56 -0.28 0.43 0.57 1.7
## BSBM17C1 -0.20 -0.59 -0.21 0.44 0.56 1.5
## BSBM17D1 0.38 0.62 0.03 0.53 0.47 1.7
## BSBM17E1 0.24 0.85 0.23 0.82 0.18 1.3
## BSBM17F1 0.17 0.74 0.22 0.63 0.37 1.3
## BSBM17G1 0.18 0.77 0.26 0.69 0.31 1.3
## BSBM17H1 0.31 0.76 0.12 0.69 0.31 1.4
## BSBM17I1 0.21 0.76 0.27 0.69 0.31 1.4
## BSBM18A1 0.59 0.20 0.10 0.40 0.60 1.3
## BSBM18B1 0.79 0.16 0.15 0.68 0.32 1.2
## BSBM18C1 0.65 0.44 0.04 0.62 0.38 1.8
## BSBM18D1 0.67 0.41 0.01 0.62 0.38 1.7
## BSBM18E1 0.82 0.16 0.11 0.72 0.28 1.1
## BSBM18F1 0.83 0.14 0.10 0.72 0.28 1.1
## BSBM18G1 0.67 0.22 0.03 0.50 0.50 1.2
## BSBM18H1 0.73 0.20 -0.01 0.58 0.42 1.2
## BSBM18I1 0.76 0.15 0.04 0.60 0.40 1.1
## BSBM18J1 0.74 0.16 0.06 0.58 0.42 1.1
## BSBM19A1 0.18 0.44 0.53 0.51 0.49 2.2
## BSBM19B1 -0.01 -0.16 -0.75 0.59 0.41 1.1
## BSBM19C1 -0.04 -0.33 -0.75 0.67 0.33 1.4
## BSBM19D1 0.20 0.46 0.51 0.51 0.49 2.3
## BSBM19E1 -0.03 -0.11 -0.52 0.29 0.71 1.1
##
## ML2 ML1 ML3
## SS loadings 5.94 5.89 2.43
## Proportion Var 0.25 0.25 0.10
## Cumulative Var 0.25 0.49 0.59
## Proportion Explained 0.42 0.41 0.17
## Cumulative Proportion 0.42 0.83 1.00
##
## Mean item complexity = 1.4
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 16.69 with Chi Square of 121470.5
## The degrees of freedom for the model are 207 and the objective function was 0.98
##
## The root mean square of the residuals (RMSR) is 0.03
## The df corrected root mean square of the residuals is 0.03
##
## The harmonic number of observations is 7290 with the empirical chi square 2692.81 with prob < 0
## The total number of observations was 7290 with Likelihood Chi Square = 7130.44 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.924
## RMSEA index = 0.068 and the 90 % confidence intervals are 0.066 0.069
## BIC = 5289.33
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML2 ML1 ML3
## Correlation of (regression) scores with factors 0.96 0.95 0.89
## Multiple R square of scores with factors 0.92 0.91 0.79
## Minimum correlation of possible factor scores 0.84 0.81 0.58
fa(nzldfa, nfactors = 3, rotate = "oblimin", fm = "ml")
## Factor Analysis using method = ml
## Call: fa(r = nzldfa, nfactors = 3, rotate = "oblimin", fm = "ml")
## Standardized loadings (pattern matrix) based upon correlation matrix
## ML1 ML2 ML3 h2 u2 com
## BSBM17A1 0.86 0.02 -0.01 0.77 0.23 1.0
## BSBM17B1 -0.55 -0.03 0.16 0.43 0.57 1.2
## BSBM17C1 -0.61 -0.02 0.08 0.44 0.56 1.0
## BSBM17D1 0.66 0.20 0.13 0.53 0.47 1.3
## BSBM17E1 0.91 -0.02 -0.03 0.82 0.18 1.0
## BSBM17F1 0.80 -0.06 -0.04 0.63 0.37 1.0
## BSBM17G1 0.81 -0.06 -0.08 0.69 0.31 1.0
## BSBM17H1 0.82 0.08 0.07 0.69 0.31 1.0
## BSBM17I1 0.79 -0.02 -0.10 0.69 0.31 1.0
## BSBM18A1 0.06 0.59 -0.04 0.40 0.60 1.0
## BSBM18B1 -0.06 0.84 -0.10 0.68 0.32 1.0
## BSBM18C1 0.35 0.57 0.09 0.62 0.38 1.7
## BSBM18D1 0.33 0.59 0.11 0.62 0.38 1.7
## BSBM18E1 -0.05 0.87 -0.06 0.72 0.28 1.0
## BSBM18F1 -0.08 0.88 -0.05 0.72 0.28 1.0
## BSBM18G1 0.09 0.66 0.05 0.50 0.50 1.0
## BSBM18H1 0.07 0.73 0.09 0.58 0.42 1.0
## BSBM18I1 -0.03 0.79 0.02 0.60 0.40 1.0
## BSBM18J1 -0.02 0.77 0.00 0.58 0.42 1.0
## BSBM19A1 0.32 0.09 -0.47 0.51 0.49 1.9
## BSBM19B1 0.05 -0.03 0.79 0.59 0.41 1.0
## BSBM19C1 -0.16 0.00 0.73 0.67 0.33 1.1
## BSBM19D1 0.34 0.11 -0.44 0.51 0.49 2.0
## BSBM19E1 0.04 -0.04 0.55 0.29 0.71 1.0
##
## ML1 ML2 ML3
## SS loadings 6.26 5.78 2.22
## Proportion Var 0.26 0.24 0.09
## Cumulative Var 0.26 0.50 0.59
## Proportion Explained 0.44 0.40 0.16
## Cumulative Proportion 0.44 0.84 1.00
##
## With factor correlations of
## ML1 ML2 ML3
## ML1 1.00 0.51 -0.45
## ML2 0.51 1.00 -0.13
## ML3 -0.45 -0.13 1.00
##
## Mean item complexity = 1.2
## Test of the hypothesis that 3 factors are sufficient.
##
## The degrees of freedom for the null model are 276 and the objective function was 16.69 with Chi Square of 121470.5
## The degrees of freedom for the model are 207 and the objective function was 0.98
##
## The root mean square of the residuals (RMSR) is 0.03
## The df corrected root mean square of the residuals is 0.03
##
## The harmonic number of observations is 7290 with the empirical chi square 2692.81 with prob < 0
## The total number of observations was 7290 with Likelihood Chi Square = 7130.44 with prob < 0
##
## Tucker Lewis Index of factoring reliability = 0.924
## RMSEA index = 0.068 and the 90 % confidence intervals are 0.066 0.069
## BIC = 5289.33
## Fit based upon off diagonal values = 1
## Measures of factor score adequacy
## ML1 ML2 ML3
## Correlation of (regression) scores with factors 0.98 0.97 0.91
## Multiple R square of scores with factors 0.95 0.94 0.83
## Minimum correlation of possible factor scores 0.90 0.88 0.67
So, the indexes of Tucker Lewis, RMSR and RMSEA are the same, and we look at proprotions. Finally, within varimax rotation all factors’ proportion variances are above .10, and apparently, as noted above, the explained proportions differ less in this case.
Altogether, varimax rotation seem to be the best solution.
And here’s the diagram:
fa.diagram(fa(nzldfa, nfactors = 3, rotate = "varimax", fm = "ml"))
In accordance with the related variables, the factors’ names were assigned as follows:
| Factor | Assigned name |
|---|---|
| M1 | Attitude to and interest in math |
| M2 | Interaction with a teacher |
| M3 | Personal abilities in math |
ML1 <- as.data.frame(nzldfa[c("BSBM17A1", "BSBM17B1", "BSBM17C1", "BSBM17D1", "BSBM17E1", "BSBM17F1", "BSBM17G1", "BSBM17H1", "BSBM17I1")])
psych::alpha(ML1, check.keys = TRUE)
## Warning in psych::alpha(ML1, check.keys = TRUE): Some items were negatively correlated with total scale and were automatically reversed.
## This is indicated by a negative sign for the variable name.
##
## Reliability analysis
## Call: psych::alpha(x = ML1, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.93 0.93 0.94 0.61 14 0.0011 2.6 0.79 0.6
##
## lower alpha upper 95% confidence boundaries
## 0.93 0.93 0.94
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM17A1 0.92 0.92 0.92 0.60 12 0.0014 0.0096 0.58
## BSBM17B1- 0.93 0.93 0.93 0.64 14 0.0012 0.0091 0.66
## BSBM17C1- 0.93 0.93 0.93 0.63 14 0.0012 0.0101 0.66
## BSBM17D1 0.93 0.93 0.93 0.63 14 0.0012 0.0093 0.66
## BSBM17E1 0.92 0.92 0.92 0.59 12 0.0014 0.0084 0.57
## BSBM17F1 0.93 0.93 0.93 0.61 13 0.0013 0.0102 0.59
## BSBM17G1 0.92 0.93 0.92 0.61 12 0.0013 0.0101 0.59
## BSBM17H1 0.92 0.92 0.92 0.61 12 0.0013 0.0105 0.59
## BSBM17I1 0.92 0.92 0.92 0.61 12 0.0013 0.0102 0.59
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM17A1 7290 0.87 0.88 0.87 0.84 2.8 0.94
## BSBM17B1- 7290 0.72 0.72 0.67 0.64 2.8 1.04
## BSBM17C1- 7290 0.73 0.73 0.68 0.66 2.4 0.96
## BSBM17D1 7290 0.72 0.73 0.68 0.65 2.9 0.88
## BSBM17E1 7290 0.90 0.90 0.90 0.87 2.7 0.97
## BSBM17F1 7290 0.81 0.81 0.79 0.76 2.4 0.90
## BSBM17G1 7290 0.84 0.84 0.82 0.79 2.6 0.98
## BSBM17H1 7290 0.84 0.84 0.82 0.80 2.3 0.96
## BSBM17I1 7290 0.85 0.84 0.83 0.80 2.2 1.10
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM17A1 0.12 0.21 0.42 0.25 0
## BSBM17B1 0.34 0.29 0.24 0.13 0
## BSBM17C1 0.16 0.29 0.37 0.18 0
## BSBM17D1 0.07 0.22 0.44 0.27 0
## BSBM17E1 0.14 0.24 0.39 0.23 0
## BSBM17F1 0.17 0.41 0.31 0.12 0
## BSBM17G1 0.16 0.30 0.35 0.19 0
## BSBM17H1 0.23 0.36 0.28 0.13 0
## BSBM17I1 0.33 0.28 0.21 0.18 0
Alpha is 0.93 which is a very good result. The responses inside the scale are consistent, so the scale may be used further
ML2 <- as.data.frame(nzldfa[c("BSBM18A1", "BSBM18B1", "BSBM18C1", "BSBM18D1", "BSBM18E1", "BSBM18F1", "BSBM18G1", "BSBM18H1", "BSBM18I1", "BSBM18J1")])
psych::alpha(ML2)
##
## Reliability analysis
## Call: psych::alpha(x = ML2)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.93 0.93 0.93 0.58 14 0.0012 3 0.7 0.58
##
## lower alpha upper 95% confidence boundaries
## 0.93 0.93 0.94
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM18A1 0.93 0.93 0.93 0.60 14 0.0012 0.0037 0.60
## BSBM18B1 0.92 0.92 0.92 0.57 12 0.0013 0.0052 0.57
## BSBM18C1 0.93 0.93 0.92 0.58 13 0.0013 0.0055 0.58
## BSBM18D1 0.93 0.93 0.92 0.58 12 0.0013 0.0056 0.58
## BSBM18E1 0.92 0.92 0.92 0.57 12 0.0013 0.0047 0.57
## BSBM18F1 0.92 0.92 0.92 0.57 12 0.0013 0.0045 0.57
## BSBM18G1 0.93 0.93 0.93 0.59 13 0.0012 0.0059 0.59
## BSBM18H1 0.93 0.93 0.93 0.58 12 0.0013 0.0059 0.57
## BSBM18I1 0.93 0.92 0.92 0.58 12 0.0013 0.0057 0.57
## BSBM18J1 0.93 0.93 0.93 0.58 12 0.0013 0.0059 0.58
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM18A1 7290 0.67 0.68 0.63 0.61 3.3 0.71
## BSBM18B1 7290 0.83 0.83 0.81 0.78 3.1 0.92
## BSBM18C1 7290 0.78 0.78 0.75 0.72 2.9 0.90
## BSBM18D1 7290 0.79 0.79 0.76 0.73 2.7 0.93
## BSBM18E1 7290 0.85 0.84 0.83 0.80 3.0 0.93
## BSBM18F1 7290 0.84 0.84 0.82 0.79 3.1 0.92
## BSBM18G1 7290 0.75 0.75 0.71 0.69 2.9 0.90
## BSBM18H1 7290 0.79 0.79 0.76 0.73 3.1 0.89
## BSBM18I1 7290 0.80 0.80 0.78 0.75 3.2 0.88
## BSBM18J1 7290 0.79 0.79 0.76 0.74 3.1 0.91
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM18A1 0.02 0.09 0.43 0.47 0
## BSBM18B1 0.07 0.18 0.37 0.38 0
## BSBM18C1 0.08 0.25 0.41 0.26 0
## BSBM18D1 0.11 0.29 0.38 0.22 0
## BSBM18E1 0.08 0.20 0.37 0.35 0
## BSBM18F1 0.07 0.16 0.34 0.43 0
## BSBM18G1 0.08 0.24 0.41 0.26 0
## BSBM18H1 0.06 0.16 0.38 0.39 0
## BSBM18I1 0.06 0.15 0.37 0.42 0
## BSBM18J1 0.07 0.15 0.37 0.41 0
The resulting number is again 0.93, which is also good. This scale may also be remained.
ML3 <- as.data.frame(nzldfa[c("BSBM19A1", "BSBM19B1", "BSBM19C1", "BSBM19D1", "BSBM19E1")])
psych::alpha(ML3, check.keys = TRUE)
## Warning in psych::alpha(ML3, check.keys = TRUE): Some items were negatively correlated with total scale and were automatically reversed.
## This is indicated by a negative sign for the variable name.
##
## Reliability analysis
## Call: psych::alpha(x = ML3, check.keys = TRUE)
##
## raw_alpha std.alpha G6(smc) average_r S/N ase mean sd median_r
## 0.81 0.82 0.8 0.47 4.4 0.0034 2.3 0.72 0.45
##
## lower alpha upper 95% confidence boundaries
## 0.81 0.81 0.82
##
## Reliability if an item is dropped:
## raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## BSBM19A1- 0.77 0.77 0.73 0.46 3.4 0.0043 0.0140 0.44
## BSBM19B1 0.76 0.77 0.74 0.45 3.3 0.0045 0.0211 0.47
## BSBM19C1 0.74 0.75 0.72 0.43 3.0 0.0050 0.0180 0.44
## BSBM19D1- 0.78 0.78 0.74 0.47 3.5 0.0041 0.0126 0.45
## BSBM19E1 0.82 0.82 0.80 0.54 4.7 0.0034 0.0076 0.53
##
## Item statistics
## n raw.r std.r r.cor r.drop mean sd
## BSBM19A1- 7290 0.76 0.78 0.71 0.63 2.0 0.83
## BSBM19B1 7290 0.79 0.79 0.72 0.65 2.3 0.96
## BSBM19C1 7290 0.84 0.82 0.78 0.70 2.5 1.08
## BSBM19D1- 7290 0.75 0.76 0.70 0.60 2.3 0.89
## BSBM19E1 7290 0.65 0.65 0.49 0.45 2.2 0.96
##
## Non missing response frequency for each item
## 1 2 3 4 miss
## BSBM19A1 0.06 0.17 0.49 0.28 0
## BSBM19B1 0.22 0.37 0.28 0.13 0
## BSBM19C1 0.21 0.27 0.28 0.24 0
## BSBM19D1 0.09 0.30 0.40 0.20 0
## BSBM19E1 0.28 0.36 0.26 0.11 0
This alpha is just 0.81, but still it is satisfactory, so to say (alpha should be > .7). All scales are internally consistent and may be used in further analysis.
Inserting factors into our dataset:
fa1 <- fa(nzldfa, nfactors = 3, rotate = "varimax", fm = "ml", scores=T)
load <- fa1$loadings[,1:2]
plot(load)
fascores <- as.data.frame(fa1$scores)
nzldfac <- cbind(nzld,fascores)
Checking types of the variables.
sapply(nzldfac, class)
## BSBM17A1 BSBM17B1 BSBM17C1 BSBM17D1 BSBM17E1 BSBM17F1 BSBM17G1 BSBM17H1
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## BSBM17I1 BSBM18A1 BSBM18B1 BSBM18C1 BSBM18D1 BSBM18E1 BSBM18F1 BSBM18G1
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## BSBM18H1 BSBM18I1 BSBM18J1 BSBM19A1 BSBM19B1 BSBM19C1 BSBM19D1 BSBM19E1
## "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric" "numeric"
## BSMMAT01 paredu gender country ML2 ML1 ML3
## "factor" "factor" "factor" "factor" "numeric" "numeric" "numeric"
Math achievement is factor, so let’s change it in advance.
nzldfac$BSMMAT01 <- as.numeric(as.character(nzldfac$BSMMAT01))
Distributions of used variables:
ml1 <- ggplot(nzldfac, aes(ML1)) + geom_histogram(color = "skyblue2", fill = "lightsteelblue1") + theme_pubclean()
ml2 <- ggplot(nzldfac, aes(ML2)) + geom_histogram(color = "skyblue2", fill = "lightsteelblue1") + theme_pubclean()
ml3 <- ggplot(nzldfac, aes(ML3)) + geom_histogram(color = "skyblue2", fill = "lightsteelblue1") + theme_pubclean()
ggarrange(ml1, ml2, ml3, nrow = 1, ncol = 3)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
First of all, let’s start from adding the factors.
model1 <- lm(nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2)
summary(model1)
Call:
lm(formula = nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2)
Residuals:
Min 1Q Median 3Q Max
-265.099 -46.138 7.271 51.981 221.323
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 505.5126 0.8669 583.119 < 2e-16 ***
nzldfac$ML3 45.9883 0.9792 46.966 < 2e-16 ***
nzldfac$ML1 12.8915 0.9148 14.092 < 2e-16 ***
nzldfac$ML2 5.4430 0.9039 6.022 1.81e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 74.02 on 7286 degrees of freedom
Multiple R-squared: 0.2615, Adjusted R-squared: 0.2612
F-statistic: 859.9 on 3 and 7286 DF, p-value: < 2.2e-16
The overall relationship is significant (p < 0.0001), as well as the relations with the predictor variables.
Estimates mean that: (1) the better abilities are associated with the increase in math achievement by almost 46 points, (2) the better attitude to math results increases achievement by 13 points, and (3) the better is the nature of interaction with a teacher the better is the success in math (by 5 points roughly).
R-squared is rather ok, it says that approximately 26% of math achievement variance can be predicted by personal abilities, interaction with a teacher, and own attitude to math.
Many studies suggest that girls outperform boys in studying results. So, let’s check whether there is some influence of gender:
model2 <- lm(nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 + as.factor(nzldfac$gender))
summary(model2)
Call:
lm(formula = nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$gender))
Residuals:
Min 1Q Median 3Q Max
-263.815 -46.395 7.465 51.323 220.247
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 508.4376 1.1869 428.372 < 2e-16 ***
nzldfac$ML3 46.4830 0.9880 47.049 < 2e-16 ***
nzldfac$ML1 13.1780 0.9175 14.363 < 2e-16 ***
nzldfac$ML2 5.6839 0.9056 6.277 3.66e-10 ***
as.factor(nzldfac$gender)Male -6.3766 1.7690 -3.605 0.000315 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 73.96 on 7285 degrees of freedom
Multiple R-squared: 0.2628, Adjusted R-squared: 0.2624
F-statistic: 649.2 on 4 and 7285 DF, p-value: < 2.2e-16
The following result supports the named statement. According to significant p-value (p = .0003), and the estimate being a boy potentially reduces the level of math achievement by 6 points. R-squared remained almost the same, still the same 26% of variance may be explained with this model.
Now, the role of country:
model3 <- lm(nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 + as.factor(nzldfac$country))
summary(model3)
Call:
lm(formula = nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$country))
Residuals:
Min 1Q Median 3Q Max
-263.361 -46.084 7.206 51.803 222.902
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 503.8938 0.9462 532.559 < 2e-16 ***
nzldfac$ML3 45.8449 0.9786 46.846 < 2e-16 ***
nzldfac$ML1 12.6517 0.9155 13.819 < 2e-16 ***
nzldfac$ML2 5.3229 0.9032 5.893 3.96e-09 ***
as.factor(nzldfac$country)No 10.0179 2.3602 4.244 2.22e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 73.93 on 7285 degrees of freedom
Multiple R-squared: 0.2633, Adjusted R-squared: 0.2629
F-statistic: 650.9 on 4 and 7285 DF, p-value: < 2.2e-16
Surprisingly, being born outside of the country increases the achievement in mathematics by 10 points (p < 0.0001) Explained variance. however, is still around 26%.
And parental education:
model4 <- lm(nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML3 + nzldfac$ML3 + as.factor(nzldfac$paredu))
summary(model4)
Call:
lm(formula = nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML3 + nzldfac$ML3 +
as.factor(nzldfac$paredu))
Residuals:
Min 1Q Median 3Q Max
-289.931 -44.885 5.647 49.554 231.944
Coefficients:
Estimate
(Intercept) 538.8343
nzldfac$ML3 44.0363
as.factor(nzldfac$paredu)Post-secondary but not University -17.2546
as.factor(nzldfac$paredu)Upper Secondary -48.6591
as.factor(nzldfac$paredu)Lower Secondary -57.2875
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School -76.7510
as.factor(nzldfac$paredu)Don't Know -42.6433
Std. Error
(Intercept) 1.9308
nzldfac$ML3 0.9701
as.factor(nzldfac$paredu)Post-secondary but not University 3.2612
as.factor(nzldfac$paredu)Upper Secondary 3.2388
as.factor(nzldfac$paredu)Lower Secondary 4.3793
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School 7.7976
as.factor(nzldfac$paredu)Don't Know 2.2672
t value
(Intercept) 279.074
nzldfac$ML3 45.391
as.factor(nzldfac$paredu)Post-secondary but not University -5.291
as.factor(nzldfac$paredu)Upper Secondary -15.024
as.factor(nzldfac$paredu)Lower Secondary -13.081
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School -9.843
as.factor(nzldfac$paredu)Don't Know -18.809
Pr(>|t|)
(Intercept) < 2e-16
nzldfac$ML3 < 2e-16
as.factor(nzldfac$paredu)Post-secondary but not University 1.25e-07
as.factor(nzldfac$paredu)Upper Secondary < 2e-16
as.factor(nzldfac$paredu)Lower Secondary < 2e-16
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School < 2e-16
as.factor(nzldfac$paredu)Don't Know < 2e-16
(Intercept) ***
nzldfac$ML3 ***
as.factor(nzldfac$paredu)Post-secondary but not University ***
as.factor(nzldfac$paredu)Upper Secondary ***
as.factor(nzldfac$paredu)Lower Secondary ***
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School ***
as.factor(nzldfac$paredu)Don't Know ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 72.81 on 7283 degrees of freedom
Multiple R-squared: 0.2857, Adjusted R-squared: 0.2851
F-statistic: 485.5 on 6 and 7283 DF, p-value: < 2.2e-16
As is seen, p-value tells that there is a significant relationship between math achivement and unknown status of parental education. Besides, all the relationships are negative. As was mentioned above, parental education contains too many “Don’t know” answers, thus we cannot remove it because it is significant, but the results should not be trusted completely.
In addition, several research also suggest that teachers are more ‘gentle’ towards girls (i.e. attitude to girls is somewhat better than to boys). Likewise, personal abilities in math may also depend on gender. So, let’s check the following assumptions and add the rest predictors into one model:
nzldfac$BSMMAT01 <- as.numeric(as.character(nzldfac$BSMMAT01))
model5 <- lm(nzldfac$BSMMAT01 ~ nzldfac$ML1 + nzldfac$ML2 * as.factor(nzldfac$gender) + nzldfac$ML3 * as.factor(nzldfac$gender) + as.factor(nzldfac$paredu) + as.factor(nzldfac$country))
summary(model5)
Call:
lm(formula = nzldfac$BSMMAT01 ~ nzldfac$ML1 + nzldfac$ML2 * as.factor(nzldfac$gender) +
nzldfac$ML3 * as.factor(nzldfac$gender) + as.factor(nzldfac$paredu) +
as.factor(nzldfac$country))
Residuals:
Min 1Q Median 3Q Max
-273.500 -44.105 6.302 49.332 229.611
Coefficients:
Estimate
(Intercept) 537.9639
nzldfac$ML1 11.8999
nzldfac$ML2 1.3688
as.factor(nzldfac$gender)Male -5.8192
nzldfac$ML3 39.3535
as.factor(nzldfac$paredu)Post-secondary but not University -16.0919
as.factor(nzldfac$paredu)Upper Secondary -45.3729
as.factor(nzldfac$paredu)Lower Secondary -53.5314
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School -71.0809
as.factor(nzldfac$paredu)Don't Know -40.7669
as.factor(nzldfac$country)No 5.4686
nzldfac$ML2:as.factor(nzldfac$gender)Male 9.5339
as.factor(nzldfac$gender)Male:nzldfac$ML3 9.1231
Std. Error
(Intercept) 2.1104
nzldfac$ML1 0.8915
nzldfac$ML2 1.1480
as.factor(nzldfac$gender)Male 1.7123
nzldfac$ML3 1.3046
as.factor(nzldfac$paredu)Post-secondary but not University 3.2111
as.factor(nzldfac$paredu)Upper Secondary 3.2026
as.factor(nzldfac$paredu)Lower Secondary 4.3179
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School 7.6662
as.factor(nzldfac$paredu)Don't Know 2.2381
as.factor(nzldfac$country)No 2.2991
nzldfac$ML2:as.factor(nzldfac$gender)Male 1.7752
as.factor(nzldfac$gender)Male:nzldfac$ML3 1.9149
t value
(Intercept) 254.909
nzldfac$ML1 13.348
nzldfac$ML2 1.192
as.factor(nzldfac$gender)Male -3.398
nzldfac$ML3 30.164
as.factor(nzldfac$paredu)Post-secondary but not University -5.011
as.factor(nzldfac$paredu)Upper Secondary -14.168
as.factor(nzldfac$paredu)Lower Secondary -12.397
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School -9.272
as.factor(nzldfac$paredu)Don't Know -18.215
as.factor(nzldfac$country)No 2.379
nzldfac$ML2:as.factor(nzldfac$gender)Male 5.371
as.factor(nzldfac$gender)Male:nzldfac$ML3 4.764
Pr(>|t|)
(Intercept) < 2e-16
nzldfac$ML1 < 2e-16
nzldfac$ML2 0.233164
as.factor(nzldfac$gender)Male 0.000681
nzldfac$ML3 < 2e-16
as.factor(nzldfac$paredu)Post-secondary but not University 5.53e-07
as.factor(nzldfac$paredu)Upper Secondary < 2e-16
as.factor(nzldfac$paredu)Lower Secondary < 2e-16
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School < 2e-16
as.factor(nzldfac$paredu)Don't Know < 2e-16
as.factor(nzldfac$country)No 0.017407
nzldfac$ML2:as.factor(nzldfac$gender)Male 8.09e-08
as.factor(nzldfac$gender)Male:nzldfac$ML3 1.93e-06
(Intercept) ***
nzldfac$ML1 ***
nzldfac$ML2
as.factor(nzldfac$gender)Male ***
nzldfac$ML3 ***
as.factor(nzldfac$paredu)Post-secondary but not University ***
as.factor(nzldfac$paredu)Upper Secondary ***
as.factor(nzldfac$paredu)Lower Secondary ***
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School ***
as.factor(nzldfac$paredu)Don't Know ***
as.factor(nzldfac$country)No *
nzldfac$ML2:as.factor(nzldfac$gender)Male ***
as.factor(nzldfac$gender)Male:nzldfac$ML3 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 71.47 on 7277 degrees of freedom
Multiple R-squared: 0.3123, Adjusted R-squared: 0.3112
F-statistic: 275.4 on 12 and 7277 DF, p-value: < 2.2e-16
As can be seen from the resulting table, there’s an interaction in both cases. It means that (1) the level of achievement determined by the interaction with a teacher depends on gender, and (2) achivement level defined by personal abilities depends also on gender. Contrary to expectations, being a boy in the case of interaction with a teacher is associated with increase in math achievement by almost 9 points. However, ML2 seems to be insignificant predictor per se in this model (p = .233). Similarly, males seems to have slightly better abilities in math, so being a boy in the case of personal abilities potentially increases math achievement by 9 points.
Multicollinearity of predictors
vif(model5)
## GVIF Df GVIF^(1/(2*Df))
## nzldfac$ML1 1.027904 1 1.013856
## nzldfac$ML2 1.733306 1 1.316551
## as.factor(nzldfac$gender) 1.038986 1 1.019306
## nzldfac$ML3 1.918162 1 1.384977
## as.factor(nzldfac$paredu) 1.048717 5 1.004768
## as.factor(nzldfac$country) 1.022062 1 1.010971
## nzldfac$ML2:as.factor(nzldfac$gender) 1.727542 1 1.314360
## as.factor(nzldfac$gender):nzldfac$ML3 1.869469 1 1.367285
There is no multicollinearity among the predictors.
Outliers and leverages.
outlierTest(model5)
## No Studentized residuals with Bonferroni p < 0.05
## Largest |rstudent|:
## rstudent unadjusted p-value Bonferroni p
## 5277 -3.8331 0.00012761 0.93024
qqPlot(model5, main = "QQ Plot")
## [1] 3278 5277
The distribution of residuals is not really close to normality, two significant outliers.
Leverages
leveragePlots(model5)
par(mfrow=c(2,2))
plot(model5)
No outliers fall above Cook’s distance, which means that there are no leverages.
Studentized residuals:
library(MASS)
sresid <- studres(model5)
hist(sresid, freq = FALSE,
main = "Distribution of Studentized Residuals")
xfit <- seq(min(sresid), max(sresid), length = 40)
yfit <- dnorm(xfit)
lines(xfit, yfit)
Left-skewed, but almost normal distribution.
Finally, ANOVA testing.
anova(model1, model2)
Analysis of Variance Table
Model 1: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2
Model 2: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$gender)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 7286 39917806
2 7285 39846735 1 71072 12.994 0.0003146 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The second model (= gender as factor) is better.
anova(model2, model3)
Analysis of Variance Table
Model 1: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$gender)
Model 2: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$country)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 7285 39846735
2 7285 39819335 0 27399
It seems that there’s no significant difference.
anova(model2, model4)
Analysis of Variance Table
Model 1: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$gender)
Model 2: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML3 + nzldfac$ML3 +
as.factor(nzldfac$paredu)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 7285 39846735
2 7283 38608351 2 1238384 116.8 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The fourth one is obviously better.
anova(model3, model4)
Analysis of Variance Table
Model 1: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML1 + nzldfac$ML2 +
as.factor(nzldfac$country)
Model 2: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML3 + nzldfac$ML3 +
as.factor(nzldfac$paredu)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 7285 39819335
2 7283 38608351 2 1210985 114.22 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Again, fourth is better.
And let’s compare with a final:
anova(model4, model5)
Analysis of Variance Table
Model 1: nzldfac$BSMMAT01 ~ nzldfac$ML3 + nzldfac$ML3 + nzldfac$ML3 +
as.factor(nzldfac$paredu)
Model 2: nzldfac$BSMMAT01 ~ nzldfac$ML1 + nzldfac$ML2 * as.factor(nzldfac$gender) +
nzldfac$ML3 * as.factor(nzldfac$gender) + as.factor(nzldfac$paredu) +
as.factor(nzldfac$country)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 7283 38608351
2 7277 37171184 6 1437167 46.892 < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The final model (i.e. with interactions) is the best.
Let’s look at it again:
summary(model5)
Call:
lm(formula = nzldfac$BSMMAT01 ~ nzldfac$ML1 + nzldfac$ML2 * as.factor(nzldfac$gender) +
nzldfac$ML3 * as.factor(nzldfac$gender) + as.factor(nzldfac$paredu) +
as.factor(nzldfac$country))
Residuals:
Min 1Q Median 3Q Max
-273.500 -44.105 6.302 49.332 229.611
Coefficients:
Estimate
(Intercept) 537.9639
nzldfac$ML1 11.8999
nzldfac$ML2 1.3688
as.factor(nzldfac$gender)Male -5.8192
nzldfac$ML3 39.3535
as.factor(nzldfac$paredu)Post-secondary but not University -16.0919
as.factor(nzldfac$paredu)Upper Secondary -45.3729
as.factor(nzldfac$paredu)Lower Secondary -53.5314
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School -71.0809
as.factor(nzldfac$paredu)Don't Know -40.7669
as.factor(nzldfac$country)No 5.4686
nzldfac$ML2:as.factor(nzldfac$gender)Male 9.5339
as.factor(nzldfac$gender)Male:nzldfac$ML3 9.1231
Std. Error
(Intercept) 2.1104
nzldfac$ML1 0.8915
nzldfac$ML2 1.1480
as.factor(nzldfac$gender)Male 1.7123
nzldfac$ML3 1.3046
as.factor(nzldfac$paredu)Post-secondary but not University 3.2111
as.factor(nzldfac$paredu)Upper Secondary 3.2026
as.factor(nzldfac$paredu)Lower Secondary 4.3179
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School 7.6662
as.factor(nzldfac$paredu)Don't Know 2.2381
as.factor(nzldfac$country)No 2.2991
nzldfac$ML2:as.factor(nzldfac$gender)Male 1.7752
as.factor(nzldfac$gender)Male:nzldfac$ML3 1.9149
t value
(Intercept) 254.909
nzldfac$ML1 13.348
nzldfac$ML2 1.192
as.factor(nzldfac$gender)Male -3.398
nzldfac$ML3 30.164
as.factor(nzldfac$paredu)Post-secondary but not University -5.011
as.factor(nzldfac$paredu)Upper Secondary -14.168
as.factor(nzldfac$paredu)Lower Secondary -12.397
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School -9.272
as.factor(nzldfac$paredu)Don't Know -18.215
as.factor(nzldfac$country)No 2.379
nzldfac$ML2:as.factor(nzldfac$gender)Male 5.371
as.factor(nzldfac$gender)Male:nzldfac$ML3 4.764
Pr(>|t|)
(Intercept) < 2e-16
nzldfac$ML1 < 2e-16
nzldfac$ML2 0.233164
as.factor(nzldfac$gender)Male 0.000681
nzldfac$ML3 < 2e-16
as.factor(nzldfac$paredu)Post-secondary but not University 5.53e-07
as.factor(nzldfac$paredu)Upper Secondary < 2e-16
as.factor(nzldfac$paredu)Lower Secondary < 2e-16
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School < 2e-16
as.factor(nzldfac$paredu)Don't Know < 2e-16
as.factor(nzldfac$country)No 0.017407
nzldfac$ML2:as.factor(nzldfac$gender)Male 8.09e-08
as.factor(nzldfac$gender)Male:nzldfac$ML3 1.93e-06
(Intercept) ***
nzldfac$ML1 ***
nzldfac$ML2
as.factor(nzldfac$gender)Male ***
nzldfac$ML3 ***
as.factor(nzldfac$paredu)Post-secondary but not University ***
as.factor(nzldfac$paredu)Upper Secondary ***
as.factor(nzldfac$paredu)Lower Secondary ***
as.factor(nzldfac$paredu)Some Primary, Lower Secondary or No School ***
as.factor(nzldfac$paredu)Don't Know ***
as.factor(nzldfac$country)No *
nzldfac$ML2:as.factor(nzldfac$gender)Male ***
as.factor(nzldfac$gender)Male:nzldfac$ML3 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 71.47 on 7277 degrees of freedom
Multiple R-squared: 0.3123, Adjusted R-squared: 0.3112
F-statistic: 275.4 on 12 and 7277 DF, p-value: < 2.2e-16
The model, though still have one insignificant predictor (which cannot be removed due to interaction), explains 31% of variance in math achievemetn. It is a rather good result.
In this project we discovered what factors affect math achievement among New Zealander pupils.
Exploratory factor analysis and regression analysis werre conducted to examine these questions.
In factor analysis the varimax rotation turned out to be the best solution. In regression analysis all of the received hidden factors were examined, and the role of gender, parental education, and country of origin were taken onto account. After having been done diagnostics, we discovered that the most suitable model is the final one.
Altogether, all of the factor predictors were discovered to impact math achievement, these are namely - (1) attitude to math, (2) interaction with a teacher, and (3) personal abilities in math. The rest predictor variables also had significant influnce, for instance having being born outside of the country was found to be associated with the slight increase in math achievement. Besides, two interactions were examined (interaction with a teacher and gender; personal abilities in math and gender) and actually were significant.