Load data
```r
# Read the absence data (path is specific to the author's machine)
dta_math <- read.csv("C:/Users/ASUS/Desktop/data/days_absent.csv", header = TRUE)
head(dta_math)
##      ID Gender  Program Math Days
## 1 S1001   male Academic   63    4
## 2 S1002   male Academic   27    4
## 3 S1003 female Academic   20    2
## 4 S1004 female Academic   16    3
## 5 S1005 female Academic    2    3
## 6 S1006 female Academic   71   13
str(dta_math)
## 'data.frame':    314 obs. of  5 variables:
##  $ ID     : chr  "S1001" "S1002" "S1003" "S1004" ...
##  $ Gender : chr  "male" "male" "female" "female" ...
##  $ Program: chr  "Academic" "Academic" "Academic" "Academic" ...
##  $ Math   : int  63 27 20 16 2 71 63 3 51 49 ...
##  $ Days   : int  4 4 2 3 3 13 11 7 10 9 ...

# Convert the character columns to factors
dta_math$Gender  <- factor(dta_math$Gender)
dta_math$Program <- factor(dta_math$Program)
dta_math$ID      <- factor(dta_math$ID)
```
Model m0 is a Poisson GLM (log link) with Days as the dependent variable and Math, Gender, and Program as predictors, including all two-way and three-way interaction terms.
The output shows significant effects of Math and Program on days absent (at least one program differs from the others).
A two-way interaction between Math and Program was also observed (the Math * Program [Vocational] term, p = 0.019).
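For reference, the model can be written out as follows. This is a sketch that assumes R's default treatment coding (the row labels in the results table suggest female and Academic are the reference levels); the symbols $\mathbf{x}_i$ and $\boldsymbol{\beta}$ are notation introduced here, not taken from the original analysis.

$$
\log \mathbb{E}[\text{Days}_i] = \mathbf{x}_i^\top \boldsymbol{\beta}, \qquad \text{IRR}_j = e^{\beta_j}
$$

Here $\mathbf{x}_i$ collects the intercept, Math, the Gender and Program dummy variables, and their products. An incidence rate ratio of 1 means no multiplicative change in the expected number of days absent. Because the model contains interactions, each lower-order IRR describes the effect at the reference levels of the other factors (and at Math = 0 for the Gender and Program terms).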
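```r
# Fit the full factorial Poisson model, then tabulate incidence rate ratios
m0 <- glm(Days ~ Math * Gender * Program, data = dta_math, family = poisson(link = "log"))
sjPlot::tab_model(m0, show.se = TRUE, show.r2 = FALSE, show.obs = FALSE)
```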
Dependent variable: Days

| Predictors | Incidence Rate Ratios | std. Error | CI | p |
|---|---|---|---|---|
| (Intercept) | 10.43 | 0.08 | 8.94 – 12.14 | <0.001 |
| Math | 0.99 | 0.00 | 0.99 – 1.00 | <0.001 |
| Gender [male] | 0.85 | 0.11 | 0.69 – 1.06 | 0.146 |
| Program [General] | 1.47 | 0.13 | 1.13 – 1.91 | 0.004 |
| Program [Vocational] | 0.23 | 0.25 | 0.14 – 0.36 | <0.001 |
| Math * Gender [male] | 1.00 | 0.00 | 0.99 – 1.00 | 0.274 |
| Math * Program [General] | 1.00 | 0.00 | 1.00 – 1.01 | 0.479 |
| Math * Program [Vocational] | 1.01 | 0.00 | 1.00 – 1.02 | 0.019 |
| Gender [male] * Program [General] | 0.97 | 0.20 | 0.65 – 1.43 | 0.863 |
| Gender [male] * Program [Vocational] | 1.59 | 0.37 | 0.76 – 3.27 | 0.214 |
| (Math * Gender [male]) * Program [General] | 1.00 | 0.00 | 0.99 – 1.01 | 0.692 |
| (Math * Gender [male]) * Program [Vocational] | 1.00 | 0.01 | 0.98 – 1.01 | 0.634 |
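
The incidence rate ratios reported for a log-link Poisson model are the exponentiated coefficients, and nested models can be compared with a likelihood-ratio test. The sketch below illustrates both steps. It runs on simulated data, because the original CSV is not available here: the data frame `sim`, its generating coefficients, and the object names `fit_full`, `fit_2way`, `irr`, `ci`, and `lrt` are all hypothetical, so the printed numbers will not match the table. The intervals are Wald intervals, which may differ from the method `tab_model` uses.

```r
# Hypothetical simulated data standing in for days_absent.csv (NOT the real data).
set.seed(1)
n <- 300
sim <- data.frame(
  Math    = round(runif(n, 0, 100)),
  Gender  = factor(sample(c("female", "male"), n, replace = TRUE)),
  Program = factor(sample(c("Academic", "General", "Vocational"), n, replace = TRUE))
)
# Assumed generating model: log-mean of Days is linear in Math,
# with a lower rate in the Vocational program.
sim$Days <- rpois(n, exp(2.3 - 0.01 * sim$Math - 1.2 * (sim$Program == "Vocational")))

# Full factorial Poisson model, same formula as in the text.
fit_full <- glm(Days ~ Math * Gender * Program, data = sim,
                family = poisson(link = "log"))

# Incidence rate ratios are exp(coefficient); Wald limits are
# exponentiated the same way to stay on the ratio scale.
irr <- exp(coef(fit_full))
ci  <- exp(confint.default(fit_full))
print(round(cbind(IRR = irr, ci), 2))

# Likelihood-ratio test: does dropping the three-way interaction worsen the fit?
fit_2way <- update(fit_full, . ~ . - Math:Gender:Program)
lrt <- anova(fit_2way, fit_full, test = "Chisq")
print(lrt)
```

With the real data, replacing `sim` by `dta_math` should give IRRs that correspond to the table. A non-significant likelihood-ratio test would be in line with the non-significant three-way terms reported there.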
To understand the nature of the two-way interaction, the relationship between Math score and days absent is plotted, faceted by program and colored by gender.
Consistent with the model output, the only interaction observed is the two-way interaction between Math score and Program.
Specifically, in the Academic and General programs, more days absent is associated with lower math scores.
This pattern is not observed in the Vocational program, where absences are low across the full range of math scores.
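```r
library(ggplot2)

# Days absent against Math score, one panel per program, colored by gender
ggplot(dta_math, aes(Math, Days, color = Gender)) +
  # Poisson regression line fitted separately per gender within each panel
  # (linewidth needs ggplot2 >= 3.4.0; older versions use size instead)
  stat_smooth(method = "glm",
              formula = y ~ x,
              method.args = list(family = poisson),
              linewidth = 0.5) +
  geom_point(pch = 20, alpha = .5) +
  facet_grid(. ~ Program) +
  labs(y = "Days absent",
       x = "Math score") +
  theme_minimal() +
  # Legend placed inside the plot area, near the top-right corner
  theme(legend.position = c(.95, .9))
```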