1 Data Management

Load data

# Read the absence data; the first row holds the column names
dta_math <- read.csv("C:/Users/ASUS/Desktop/data/days_absent.csv", header = TRUE)

head(dta_math)
##      ID Gender  Program Math Days
## 1 S1001   male Academic   63    4
## 2 S1002   male Academic   27    4
## 3 S1003 female Academic   20    2
## 4 S1004 female Academic   16    3
## 5 S1005 female Academic    2    3
## 6 S1006 female Academic   71   13
str(dta_math)
## 'data.frame':    314 obs. of  5 variables:
##  $ ID     : chr  "S1001" "S1002" "S1003" "S1004" ...
##  $ Gender : chr  "male" "male" "female" "female" ...
##  $ Program: chr  "Academic" "Academic" "Academic" "Academic" ...
##  $ Math   : int  63 27 20 16 2 71 63 3 51 49 ...
##  $ Days   : int  4 4 2 3 3 13 11 7 10 9 ...
dta_math$Gender<-factor(dta_math$Gender)
dta_math$Program<-factor(dta_math$Program)
dta_math$ID<-factor(dta_math$ID)
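
The factor conversion above also fixes the reference categories used later by glm(): by default the levels are sorted alphabetically and the first level becomes the baseline. The sketch below shows this on a small stand-in data frame. The object `toy` and its values are hypothetical and are not taken from days_absent.csv.

```r
# Toy stand-in for dta_math (hypothetical values, not the real data)
toy <- data.frame(
  ID      = c("S1", "S2", "S3", "S4"),
  Gender  = c("male", "female", "female", "male"),
  Program = c("Academic", "General", "Vocational", "Academic"),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor in one step
chr_cols <- vapply(toy, is.character, logical(1))
toy[chr_cols] <- lapply(toy[chr_cols], factor)

# The first level of each factor is the reference category in glm()
levels(toy$Gender)
levels(toy$Program)
```

With these defaults the baseline is a female student in the Academic program, which is why the model output reports contrasts labelled Gender [male], Program [General], and Program [Vocational].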

2 Analysis

2.1 m0

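Model m0 is a Poisson GLM (log link) with Days as the dependent variable and Math, Gender, and Program as predictors, including all two-way and three-way interactions.
The output shows significant effects of Math and Program on days absent: both the General and the Vocational program differ from the reference program (Academic). Because the model contains interactions, these effects apply at the reference levels of the other predictors.
A two-way interaction between Math and Program was also observed, specifically for the Vocational contrast (IRR = 1.01, p = 0.019). No other interaction term was significant.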

m0 <- glm(Days ~ Math * Gender * Program, data = dta_math, family = poisson(link = log))
sjPlot::tab_model(m0, show.se = TRUE, show.r2 = FALSE, show.obs = FALSE)

Dependent variable: Days

Predictors                                     Incidence Rate Ratios  std. Error  CI            p
(Intercept)                                    10.43                  0.08        8.94 – 12.14  <0.001
Math                                           0.99                   0.00        0.99 – 1.00   <0.001
Gender [male]                                  0.85                   0.11        0.69 – 1.06   0.146
Program [General]                              1.47                   0.13        1.13 – 1.91   0.004
Program [Vocational]                           0.23                   0.25        0.14 – 0.36   <0.001
Math * Gender [male]                           1.00                   0.00        0.99 – 1.00   0.274
Math * Program [General]                       1.00                   0.00        1.00 – 1.01   0.479
Math * Program [Vocational]                    1.01                   0.00        1.00 – 1.02   0.019
Gender [male] * Program [General]              0.97                   0.20        0.65 – 1.43   0.863
Gender [male] * Program [Vocational]           1.59                   0.37        0.76 – 3.27   0.214
(Math * Gender [male]) * Program [General]     1.00                   0.00        0.99 – 1.01   0.692
(Math * Gender [male]) * Program [Vocational]  1.00                   0.01        0.98 – 1.01   0.634
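
The incidence rate ratios in the table are the exponentiated coefficients of the log-link model, so an IRR of 0.99 for Math means that each additional point multiplies the expected number of days absent by 0.99 (at the reference levels of the other predictors). The following is a minimal sketch of how such ratios, and a likelihood-ratio test of an interaction, could be obtained in base R. It uses simulated data, not days_absent.csv. The data frame `sim`, the models `fit_main` and `fit_int`, and the coefficients of the simulation are all hypothetical, and Gender is left out for brevity.

```r
# Simulated stand-in data (hypothetical; NOT the days_absent data)
set.seed(1)
n <- 300
sim <- data.frame(
  Math    = round(runif(n, 1, 99)),
  Program = factor(sample(c("Academic", "General", "Vocational"),
                          n, replace = TRUE))
)
# Assumed data-generating process: fewer absences with higher Math,
# and fewer absences overall in the Vocational program
mu <- exp(2.3 - 0.01 * sim$Math - 1.2 * (sim$Program == "Vocational"))
sim$Days <- rpois(n, mu)

# Poisson models without and with the Math x Program interaction
fit_main <- glm(Days ~ Math + Program, data = sim, family = poisson(link = "log"))
fit_int  <- glm(Days ~ Math * Program, data = sim, family = poisson(link = "log"))

# Incidence rate ratios: exponentiate the log-scale coefficients
irr <- exp(coef(fit_int))
ci  <- exp(confint.default(fit_int))  # Wald intervals on the ratio scale
round(cbind(IRR = irr, ci), 3)

# Likelihood-ratio test: does the interaction improve the fit?
anova(fit_main, fit_int, test = "Chisq")
```

The Wald intervals computed here may differ slightly from the intervals that tab_model() prints.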

3 Visualization

To understand the nature of the two-way interaction, the relationship between Math score and days absent is plotted, faceted by program (facet_grid) and colored by gender.
Consistent with the model output, the only interaction observed is the two-way interaction between Math score and Program. Specifically, in the Academic and General programs, more days absent are associated with lower math scores, whereas no such pattern appears in the Vocational program, where students had few days absent regardless of math score.

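library(ggplot2)
# Days absent against Math score, one panel per program, colored by gender
ggplot(dta_math, 
       aes(Math, Days, 
           color=Gender)) +
 # Poisson regression line fitted separately for each gender within each panel
 stat_smooth(method="glm",
             formula=y ~ x,
             method.args=list(family=poisson),
             size=rel(.5)) +
 # Raw observations as small semi-transparent points
 geom_point(pch=20, alpha=.5) +
 facet_grid(. ~ Program) +
 labs(y="Days absent", 
      x="Math score") +
 theme_minimal()+
 # Place the legend inside the plotting area, near the top-right corner
 theme(legend.position = c(.95, .9))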
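
The pattern in the panels can also be read off as predicted counts. The sketch below is a rough illustration under stated assumptions: it fits a Poisson model to simulated data (not days_absent.csv) in which Math matters in the Academic and General programs but not in the Vocational program, and then predicts days absent at a low and a high math score. The names `sim`, `fit`, and `grid`, and the score values 20 and 80, are hypothetical.

```r
# Simulated stand-in data (hypothetical; NOT the days_absent data)
set.seed(2)
n <- 300
sim <- data.frame(
  Math    = round(runif(n, 1, 99)),
  Program = factor(sample(c("Academic", "General", "Vocational"),
                          n, replace = TRUE))
)
# Assumed process: a negative Math slope except in the Vocational program,
# which also has a lower baseline number of days absent
slope <- ifelse(sim$Program == "Vocational", 0, -0.01)
base  <- ifelse(sim$Program == "Vocational", 0.9, 2.3)
sim$Days <- rpois(n, exp(base + slope * sim$Math))

fit <- glm(Days ~ Math * Program, data = sim, family = poisson)

# Expected days absent at a low and a high math score in each program
grid <- expand.grid(Math = c(20, 80), Program = levels(sim$Program))
grid$pred_days <- predict(fit, newdata = grid, type = "response")
grid
```

Using type = "response" returns predictions on the count scale rather than the log scale, so the rows of `grid` can be compared directly with the vertical axis of the plot.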
