The data are obtained from two surveys of children’s mental health. The data set consists of 1,428 children who had both parent and teacher responses, and an additional 1,073 children with only a parent response. In the psychiatric literature, these sources of respondents are often referred to as “informants”.
A standardized measure of childhood psychopathology was used both by parents (Child Behavior Checklist, CBC) and teachers (Teacher Report Forms, TRF) to assess children in the study.
The response variable is derived from the externalizing scale, which assesses delinquent and aggressive behavior. The scale has been dichotomized at the cutpoint for borderline/clinical psychopathology.
Source: Zahner, G.E.P., Jacobs, J.H., Freeman, D.H. & Trainor, K.F. (1993). Rural-urban child psychopathology in a northeastern U.S. State: 1986-1989. Journal of the American Academy of Child and Adolescent Psychiatry, 32, 378-387. Reported in Fitzmaurice, G.M., Laird, N.M., & Ware, J.H. (2004). Applied Longitudinal Analysis. p. 434-439.
Column 1: Child ID Column 2: Child’s physical health problems: Fair = Fair to bad , Good Column 3: Single parent status: Single, Otherwise Column 4: Externalizing behavior (Parent report) Column 5: Externalizing behavior (Teachert report)
#Loading package
require(pacman)
## Loading required package: pacman
pacman::p_load(MASS, gee, tidyverse, geepack)
# input data
dta <- read.table("ccs-data.txt", h=F, na.strings='.')
head(dta)
## V1 V2 V3 V4 V5
## 1 1 1 1 0 NA
## 2 2 0 0 1 NA
## 3 3 1 0 0 NA
## 4 4 1 1 1 NA
## 5 5 1 0 0 NA
## 6 6 1 0 0 NA
#將變數重新命名
names(dta) <- c("ID", "Phyhealth", "ParentSingle", "Parent", "Teacher")
#
dta <- dta %>%
mutate(ID = factor(ID),
Phyhealth = factor(ifelse(Phyhealth == 1, "Bad", "Good")),
ParentSingle = factor(ifelse(ParentSingle == 1, "Y", "N")))
#
head(dta)
## ID Phyhealth ParentSingle Parent Teacher
## 1 1 Bad Y 0 NA
## 2 2 Good N 1 NA
## 3 3 Bad N 0 NA
## 4 4 Bad Y 1 NA
## 5 5 Bad N 0 NA
## 6 6 Bad N 0 NA
#
str(dta)
## 'data.frame': 2501 obs. of 5 variables:
## $ ID : Factor w/ 2501 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Phyhealth : Factor w/ 2 levels "Bad","Good": 1 2 1 1 1 1 1 2 2 2 ...
## $ ParentSingle: Factor w/ 2 levels "N","Y": 2 1 1 2 1 1 1 1 2 1 ...
## $ Parent : int 0 1 0 1 0 0 0 0 0 0 ...
## $ Teacher : int NA NA NA NA NA NA NA 0 NA 0 ...
#寬資料轉換成長資料
dtaL <- gather(dta, key=Informant, value=Report,
Parent:Teacher, factor_key=TRUE)
head(dtaL,10)
## ID Phyhealth ParentSingle Informant Report
## 1 1 Bad Y Parent 0
## 2 2 Good N Parent 1
## 3 3 Bad N Parent 0
## 4 4 Bad Y Parent 1
## 5 5 Bad N Parent 0
## 6 6 Bad N Parent 0
## 7 7 Bad N Parent 0
## 8 8 Good N Parent 0
## 9 9 Good Y Parent 0
## 10 10 Good N Parent 0
str(dtaL)
## 'data.frame': 5002 obs. of 5 variables:
## $ ID : Factor w/ 2501 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ Phyhealth : Factor w/ 2 levels "Bad","Good": 1 2 1 1 1 1 1 2 2 2 ...
## $ ParentSingle: Factor w/ 2 levels "N","Y": 2 1 1 2 1 1 1 1 2 1 ...
## $ Informant : Factor w/ 2 levels "Parent","Teacher": 1 1 1 1 1 1 1 1 1 1 ...
## $ Report : int 0 1 0 1 0 0 0 0 0 0 ...
with(dta, ftable(Teacher,Parent))
## Parent 0 1
## Teacher
## 0 1030 129
## 1 165 104
老師報告學生問題行為的可能性大於父母。
# correlation
cor(dta[,4:5], use="pair")
## Parent Teacher
## Parent 1.0000000 0.2913296
## Teacher 0.2913296 1.0000000
老師報告學生問題行為與父母報告孩子問題行為的相關係數為0.291
# proportion of missing responses
sum(is.na(dta[,4]) | is.na(dta[,5]))/dim(dta)[1]
## [1] 0.4290284
missing data比率約有42.9%
sjPlot::tab_model(m1 <- geeglm(Report ~ Informant + Phyhealth + ParentSingle + Informant:Phyhealth, data=dtaL, id = ID, family = binomial, corstr = "exchangeable"), show.obs=F, show.ngroups = F)
Report | |||
---|---|---|---|
Predictors | Odds Ratios | CI | p |
(Intercept) | 0.21 | 0.18 – 0.25 | <0.001 |
Informant [Teacher] | 1.05 | 0.83 – 1.33 | 0.699 |
Phyhealth [Good] | 0.55 | 0.44 – 0.68 | <0.001 |
ParentSingle [Y] | 1.88 | 1.56 – 2.28 | <0.001 |
Informant [Teacher] * Phyhealth [Good] |
1.53 | 1.08 – 2.16 | 0.016 |
1.身體健康好的學生,不太可能被通報有問題行為。 2.單親家庭的孩子,無論是老師或是家長,都更可能通報孩子有問題行為。
summary(m1)$corr
## Estimate Std.err
## alpha 0 0
##Teacher
sjPlot::tab_model(mt <- geeglm(Report ~ Phyhealth + ParentSingle, subset = Informant=="Teacher", data=dtaL, id = ID, family = binomial, corstr = "exchangeable"), show.obs=F, show.ngroups = F)
Report | |||
---|---|---|---|
Predictors | Odds Ratios | CI | p |
(Intercept) | 0.22 | 0.18 – 0.27 | <0.001 |
Phyhealth [Good] | 0.84 | 0.64 – 1.10 | 0.201 |
ParentSingle [Y] | 1.92 | 1.42 – 2.62 | <0.001 |
##Parent
sjPlot::tab_model(mp <- geeglm(Report ~ Phyhealth + ParentSingle, subset = Informant=="Parent", data=dtaL, id = ID, family = binomial, corstr = "exchangeable"), show.obs=F, show.ngroups = F)
Report | |||
---|---|---|---|
Predictors | Odds Ratios | CI | p |
(Intercept) | 0.21 | 0.18 – 0.25 | <0.001 |
Phyhealth [Good] | 0.55 | 0.44 – 0.68 | <0.001 |
ParentSingle [Y] | 1.86 | 1.46 – 2.37 | <0.001 |
sjPlot::tab_model(m1, mt, mp)
Report | Report | Report | |||||||
---|---|---|---|---|---|---|---|---|---|
Predictors | Odds Ratios | CI | p | Odds Ratios | CI | p | Odds Ratios | CI | p |
(Intercept) | 0.21 | 0.18 – 0.25 | <0.001 | 0.22 | 0.18 – 0.27 | <0.001 | 0.21 | 0.18 – 0.25 | <0.001 |
Informant [Teacher] | 1.05 | 0.83 – 1.33 | 0.699 | ||||||
Phyhealth [Good] | 0.55 | 0.44 – 0.68 | <0.001 | 0.84 | 0.64 – 1.10 | 0.201 | 0.55 | 0.44 – 0.68 | <0.001 |
ParentSingle [Y] | 1.88 | 1.56 – 2.28 | <0.001 | 1.92 | 1.42 – 2.62 | <0.001 | 1.86 | 1.46 – 2.37 | <0.001 |
Informant [Teacher] * Phyhealth [Good] |
1.53 | 1.08 – 2.16 | 0.016 | ||||||
N | 2501 ID | 1428 ID | 2501 ID | ||||||
Observations | 3929 | 1428 | 2501 |
比較m1與mp:身體健康程度對家長而言是不影響(數值相同) 比較m1與mt:身體健康程度對教師而言不能說會影響其通報(因未達顯著)
dtaL <- data.frame(dtaL, id=row.names(dtaL))
yhat_m1 <- data.frame(id=row.names(fitted(m1)), phat=fitted(m1))
dta_m1 <- inner_join(dtaL, yhat_m1, by="id")
ggplot(dta_m1, aes(Phyhealth, Report, color = Informant)) +
geom_point(aes(Phyhealth, phat),
position = position_dodge(width=.2),
size = rel(3), pch = 1) +
geom_hline(yintercept = c(mean(dta$Parent, na.rm=T),
mean(dta$Teacher, na.rm=T)),
linetype = "dotted")+
stat_summary(fun.data = "mean_cl_boot",
position = position_dodge(width=.2)) +
facet_wrap( ~ ParentSingle)+
labs(x = "Physical health",
y = "Probability of report") +
guides(color = guide_legend(reverse=T))+
theme_minimal()+
theme(legend.position = c(.1, .9))
#The End