1 Description

The data are obtained from two surveys of children’s mental health. The data set consists of 1,428 children who had both parent and teacher responses, and an additional 1,073 children with only a parent response. In the psychiatric literature, these sources of respondents are often referred to as “informants”.

A standardized measure of childhood psychopathology was used both by parents (Child Behavior Checklist, CBC) and teachers (Teacher Report Forms, TRF) to assess children in the study. The response variable is derived from the externalizing scale, which assesses delinquent and aggressive behavior. The scale has been dichotomized at the cutpoint for borderline/clinical psychopathology.

Column 1: Child ID
Column 2: Child’s physical health problems: Fair = Fair to bad , Good
Column 3: Single parent status: Single, Otherwise
Column 4: Externalizing behavior (Parent report)
Column 5: Externalizing behavior (Teachert report)

2 Loading package

require(pacman)
## Loading required package: pacman
pacman::p_load(tidyr, tidyverse, nlme, car, afex, emmeans, lattice, faraway, boot, ggplot2, kableExtra, MASS, pscl, nnet, mgcv, VGAM, geepack)

3 Data Management

3.1 input data

dta1 <- read.table("C:/Users/HANK/Desktop/HOMEWORK/ccs-data.txt", h=F, na.strings='.')

3.2 rename

colnames(dta1) <- c("ID", 
                    "Phyhealth", 
                    "ParentSingle", 
                    "Parent",
                    "Teacher")
dta1 <- dta1 %>% 
  mutate(Phyhealth = factor(ifelse(Phyhealth == 1, "Bad", "Good")),
         ParentSingle = factor(ifelse(ParentSingle == 1, "Y", "N")),
         Parent = as.numeric(Parent),
         Teacher = as.numeric(Teacher))

head(dta1)
##   ID Phyhealth ParentSingle Parent Teacher
## 1  1       Bad            Y      0      NA
## 2  2      Good            N      1      NA
## 3  3       Bad            N      0      NA
## 4  4       Bad            Y      1      NA
## 5  5       Bad            N      0      NA
## 6  6       Bad            N      0      NA
head(dta1)
##   ID Phyhealth ParentSingle Parent Teacher
## 1  1       Bad            Y      0      NA
## 2  2      Good            N      1      NA
## 3  3       Bad            N      0      NA
## 4  4       Bad            Y      1      NA
## 5  5       Bad            N      0      NA
## 6  6       Bad            N      0      NA
str(dta1)
## 'data.frame':    2501 obs. of  5 variables:
##  $ ID          : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Phyhealth   : Factor w/ 2 levels "Bad","Good": 1 2 1 1 1 1 1 2 2 2 ...
##  $ ParentSingle: Factor w/ 2 levels "N","Y": 2 1 1 2 1 1 1 1 2 1 ...
##  $ Parent      : num  0 1 0 1 0 0 0 0 0 0 ...
##  $ Teacher     : num  NA NA NA NA NA NA NA 0 NA 0 ...

3.3 Wide to Long

dta1L <- gather(dta1, key=Informant, value=Report,
                Parent:Teacher, factor_key=TRUE)

head(dta1L)
##   ID Phyhealth ParentSingle Informant Report
## 1  1       Bad            Y    Parent      0
## 2  2      Good            N    Parent      1
## 3  3       Bad            N    Parent      0
## 4  4       Bad            Y    Parent      1
## 5  5       Bad            N    Parent      0
## 6  6       Bad            N    Parent      0
str(dta1L)
## 'data.frame':    5002 obs. of  5 variables:
##  $ ID          : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Phyhealth   : Factor w/ 2 levels "Bad","Good": 1 2 1 1 1 1 1 2 2 2 ...
##  $ ParentSingle: Factor w/ 2 levels "N","Y": 2 1 1 2 1 1 1 1 2 1 ...
##  $ Informant   : Factor w/ 2 levels "Parent","Teacher": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Report      : num  0 1 0 1 0 0 0 0 0 0 ...

4 Descriptive info

with(dta1, ftable(Parent, Teacher))
##        Teacher    0    1
## Parent                  
## 0              1030  165
## 1               129  104

老師更有可能報告學生違規和攻擊行為。

5 correlation

cor(dta1[,4:5], use="pair")
##            Parent   Teacher
## Parent  1.0000000 0.2913296
## Teacher 0.2913296 1.0000000

teacher & parent的相關為0.2913 屬於低度正相關

6 proportion of missing responses

sum(is.na(dta1[,4]) | is.na(dta1[,5]))/dim(dta1)[1]
## [1] 0.4290284

回應資料遺失的百分比為42.90%(比例不算低)

7 Parameter estimates

sjPlot::tab_model(m1 <- geeglm(Report ~ Informant + Phyhealth + ParentSingle + Informant:Phyhealth, data=dta1L, id = ID, family = binomial, corstr = "exchangeable"), show.obs=F, show.ngroups = F)
  Report
Predictors Odds Ratios CI p
(Intercept) 0.21 0.18 – 0.25 <0.001
Informant Teacher 1.05 0.83 – 1.33 0.699
Phyhealth [Good] 0.55 0.44 – 0.68 <0.001
ParentSingle [Y] 1.88 1.56 – 2.28 <0.001
Informant Teacher *
Phyhealth [Good]
1.53 1.08 – 2.16 0.016

資料說明:孩子的身體健康程度好的,不太可能被通報有不良行為和攻擊性行為。
單親家庭的孩子,無論是老師或是家長,都更可能通報孩子有不良和攻擊性行為。
且資料表格最後一行更呈現:即使孩子的身體健康程度良好,老師依然更有可能通報這些孩子有不良和攻擊性行為。
孩子身體健康程度對行為問題通報的影響,對父母的影響比對老師的影響更大。

summary(m1)$corr
##       Estimate Std.err
## alpha        0       0

8 Comparison

8.1 teacher

sjPlot::tab_model(mt <- geeglm(Report ~ Phyhealth + ParentSingle, subset = Informant=="Teacher", data=dta1L, id = ID, family = binomial, corstr = "exchangeable"), show.obs=F, show.ngroups = F)
  Report
Predictors Odds Ratios CI p
(Intercept) 0.22 0.18 – 0.27 <0.001
Phyhealth [Good] 0.84 0.64 – 1.10 0.201
ParentSingle [Y] 1.92 1.42 – 2.62 <0.001

8.2 parents

sjPlot::tab_model(mp <- geeglm(Report ~ Phyhealth + ParentSingle, subset = Informant=="Parent", data=dta1L, id = ID, family = binomial, corstr = "exchangeable"), show.obs=F, show.ngroups = F)
  Report
Predictors Odds Ratios CI p
(Intercept) 0.21 0.18 – 0.25 <0.001
Phyhealth [Good] 0.55 0.44 – 0.68 <0.001
ParentSingle [Y] 1.86 1.46 – 2.37 <0.001
sjPlot::tab_model(m1, mt, mp)
  Report Report Report
Predictors Odds Ratios CI p Odds Ratios CI p Odds Ratios CI p
(Intercept) 0.21 0.18 – 0.25 <0.001 0.22 0.18 – 0.27 <0.001 0.21 0.18 – 0.25 <0.001
Informant Teacher 1.05 0.83 – 1.33 0.699
Phyhealth [Good] 0.55 0.44 – 0.68 <0.001 0.84 0.64 – 1.10 0.201 0.55 0.44 – 0.68 <0.001
ParentSingle [Y] 1.88 1.56 – 2.28 <0.001 1.92 1.42 – 2.62 <0.001 1.86 1.46 – 2.37 <0.001
Informant Teacher *
Phyhealth [Good]
1.53 1.08 – 2.16 0.016
N 2501 ID 1428 ID 2501 ID
Observations 3929 1428 2501

比較m1與mp:身體健康程度對家長而言是不影響(數值相同)
比較m1與mt:身體健康程度對教師而言不能說會影響其通報(因未達顯著)

9 Fitted probabilities

dta1L <- data.frame(dta1L, id=row.names(dta1L))

yhat_m1 <- data.frame(id=row.names(fitted(m1)), phat=fitted(m1))

dta1_m1 <- inner_join(dta1L, yhat_m1, by="id")
ggplot(dta1_m1, aes(Phyhealth, Report, color = Informant)) +
 geom_point(aes(Phyhealth, phat),
            position = position_dodge(width=.2),
            size = rel(3), pch = 1) +
 geom_hline(yintercept = c(mean(dta1$Parent, na.rm=T),
                         mean(dta1$Teacher, na.rm=T)), 
            linetype = "dotted")+
 stat_summary(fun.data = "mean_cl_boot", 
              position = position_dodge(width=.2)) +
 facet_wrap( ~ ParentSingle)+
 labs(x = "Physical health", 
      y = "Probability of report") +
 guides(color = guide_legend(reverse=T))+
 theme_minimal()+
 theme(legend.position = c(.1, .9))

10 End