Survey Data Source: National Household Education Surveys (NHES) Program 2019: Parent and Family Involvement in Education (PFI)
Outcome variable is whether a child enjoys school.
Codebook variable is Item 50: SEENJOY, with levels 1 (Strongly agree) to 4 (Strongly disagree).
I will recode the variable as enjoy_school, with levels 1 and 2 as “1” (YES) and levels 3 and 4 as “0” (NO).
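The recode described above can be sketched in base R on a toy vector. The values in `seenjoy_toy` are hypothetical, and treating every code outside 1 to 4 as missing is an assumption about how non-substantive responses should be handled.

```r
# Hypothetical SEENJOY responses; -1 stands in for any non-substantive code
seenjoy_toy <- c(1, 2, 3, 4, -1, 2)

# 1 or 2 (agree) -> 1, 3 or 4 (disagree) -> 0, anything else -> NA
enjoy_toy <- ifelse(seenjoy_toy %in% 1:2, 1,
             ifelse(seenjoy_toy %in% 3:4, 0, NA))

print(enjoy_toy)
```

Checking a toy case like this before recoding the full file makes it easy to confirm that out-of-range codes end up as NA rather than being folded into the "NO" group.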
How does parent volunteerism at school affect whether a child enjoys school?
How does a diagnosed developmental delay affect whether a child enjoys school?
Predictor 1: adult_volunteer; Item 60B: FSVOL “… has any adult in this child’s household … served as a volunteer in this child’s classroom or elsewhere in the school?”
Predictor 2: dev_delay; Item 76K: HDDELAYX “Has a health or education professional told you that this child has … a developmental delay?”
library(haven)
library(car)
## Loading required package: carData
library(stargazer)
##
## Please cite as:
## Hlavac, Marek (2018). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.2. https://CRAN.R-project.org/package=stargazer
library(survey)
## Loading required package: grid
## Loading required package: Matrix
## Loading required package: survival
##
## Attaching package: 'survey'
## The following object is masked from 'package:graphics':
##
## dotchart
library(questionr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:car':
##
## recode
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(forcats)
library(tableone)
library(srvyr)
##
## Attaching package: 'srvyr'
## The following object is masked from 'package:stats':
##
## filter
# Read Stata file
pfi19 = read_dta(file = "C:\\UTSA\\OneDrive - University of Texas at San Antonio\\1_M_7283_StatsII\\Homework\\pfi_pu_pert_dat_dta.dta")
# Recode variables with car::Recode (dplyr masks the lowercase recode)
# Outcome: 1 = agree/strongly agree that the child enjoys school, 0 = disagree/strongly disagree
pfi19$enjoy_school <- Recode(pfi19$SEENJOY, recodes="1:2=1; 3:4=0; else=NA")
pfi19$FSVOL <- as.numeric(pfi19$FSVOL)
pfi19$adult_volunteer <- Recode(pfi19$FSVOL, recodes="1='Yes'; 2='No'; else=NA", as.factor=TRUE)
pfi19$HDDELAYX <- as.numeric(pfi19$HDDELAYX)
pfi19$dev_delay <- Recode(pfi19$HDDELAYX, recodes="1='Yes'; 2='No'; else=NA", as.factor=TRUE)
# Filter cases: keep only rows with non-missing outcome and predictors
pfi19 <- pfi19 %>%
  filter(!is.na(enjoy_school),
         !is.na(adult_volunteer),
         !is.na(dev_delay))
options(survey.lonely.psu = "adjust")
pfi19design <- svydesign(ids = ~PPSU, strata= ~PSTRATUM, weights = ~FPWT, data = pfi19, nest = TRUE)
pfi19design
## Stratified Independent Sampling design (with replacement)
## svydesign(ids = ~PPSU, strata = ~PSTRATUM, weights = ~FPWT, data = pfi19,
## nest = TRUE)
nodesign <- svydesign(ids = ~1, weights = ~1, data = pfi19)
nodesign
## Independent Sampling design (with replacement)
## svydesign(ids = ~1, weights = ~1, data = pfi19)
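For the point estimates, the difference between the two design objects comes down to whether every case counts equally. A minimal base-R sketch with made-up values (the `enjoy` and `wt` vectors are hypothetical, not PFI data) shows how weights can move a proportion. It does not reproduce the strata or PSU information, which affects the standard errors rather than the estimate itself.

```r
# Hypothetical 0/1 outcome and sampling weights for five cases
enjoy <- c(1, 1, 0, 1, 0)
wt    <- c(100, 300, 50, 400, 150)

# Every case counts once, as in nodesign (weights = ~1)
unweighted <- mean(enjoy)

# Each case counts in proportion to its weight, as with weights = ~FPWT
weighted <- weighted.mean(enjoy, wt)

print(c(unweighted = unweighted, weighted = weighted))
```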
vol_deswts <- svyby(formula = ~enjoy_school,
                    by = ~adult_volunteer,
                    design = pfi19design,
                    FUN = svymean)
svychisq(~enjoy_school+adult_volunteer, design = pfi19design)
##
## Pearson's X^2: Rao & Scott adjustment
##
## data: svychisq(~enjoy_school + adult_volunteer, design = pfi19design)
## F = 77.128, ndf = 1, ddf = 15684, p-value < 2.2e-16
knitr::kable(vol_deswts,
             caption = "Survey Estimates of Student Enjoying School by Household Adult School Volunteer",
             align = 'c',
             format = "html")
| | adult_volunteer | enjoy_school | se |
|---|---|---|---|
| No | No | 0.8751728 | 0.0051769 |
| Yes | Yes | 0.9314662 | 0.0038289 |
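As a rough way to read the table above, the estimates and standard errors can be turned into approximate 95% intervals. This is a sketch that assumes a normal (Wald) approximation. The numbers are copied from the table, and the object names are illustrative only.

```r
# Estimates and standard errors copied from the survey-design table
est <- c(No = 0.8751728, Yes = 0.9314662)
se  <- c(No = 0.0051769, Yes = 0.0038289)

# Normal-approximation 95% interval: estimate +/- z * se
z  <- qnorm(0.975)
ci <- cbind(lower = est - z * se, upper = est + z * se)

print(round(ci, 3))
```

Under this approximation the two intervals do not overlap, which is in line with the very small p-value from the Rao-Scott test.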
vol_nodesign <- svyby(formula = ~enjoy_school,
                      by = ~adult_volunteer,
                      design = nodesign,
                      FUN = svymean)
svychisq(~enjoy_school+adult_volunteer, design = nodesign)
##
## Pearson's X^2: Rao & Scott adjustment
##
## data: svychisq(~enjoy_school + adult_volunteer, design = nodesign)
## F = 136.48, ndf = 1, ddf = 15686, p-value < 2.2e-16
knitr::kable(vol_nodesign,
             caption = "Estimates of Student Enjoying School by Household Adult School Volunteer - No survey design",
             align = 'c',
             format = "html")
| | adult_volunteer | enjoy_school | se |
|---|---|---|---|
| No | No | 0.8596530 | 0.0036515 |
| Yes | Yes | 0.9198554 | 0.0033327 |
dev_deswts <- svyby(formula = ~enjoy_school,
                    by = ~dev_delay,
                    design = pfi19design,
                    FUN = svymean)
svychisq(~enjoy_school+dev_delay, design = pfi19design)
##
## Pearson's X^2: Rao & Scott adjustment
##
## data: svychisq(~enjoy_school + dev_delay, design = pfi19design)
## F = 42.74, ndf = 1, ddf = 15684, p-value = 6.445e-11
knitr::kable(dev_deswts,
             caption = "Survey Estimates of Student Enjoying School by Developmental Delay diagnosis",
             align = 'c',
             format = "html")
| | dev_delay | enjoy_school | se |
|---|---|---|---|
| No | No | 0.9024252 | 0.0034473 |
| Yes | Yes | 0.7823739 | 0.0245828 |
dev_nodesign <- svyby(formula = ~enjoy_school,
                      by = ~dev_delay,
                      design = nodesign,
                      FUN = svymean)
svychisq(~enjoy_school+dev_delay, design = nodesign)
##
## Pearson's X^2: Rao & Scott adjustment
##
## data: svychisq(~enjoy_school + dev_delay, design = nodesign)
## F = 39.516, ndf = 1, ddf = 15686, p-value = 3.341e-10
knitr::kable(dev_nodesign,
             caption = "Estimates of Student Enjoying School by Developmental Delay diagnosis - No survey design",
             align = 'c',
             format = "html")
| | dev_delay | enjoy_school | se |
|---|---|---|---|
| No | No | 0.8882637 | 0.0025633 |
| Yes | Yes | 0.8034483 | 0.0165013 |
library(gtsummary)
# for household adult volunteerism
# Note: no ids argument is supplied here, unlike pfi19design above (ids = ~PPSU)
pfi19 %>%
  as_survey_design(strata = PSTRATUM,
                   weights = FPWT) %>%
  select(enjoy_school, adult_volunteer) %>%
  tbl_svysummary(by = adult_volunteer,
                 label = list(enjoy_school = "Child Enjoys School")) %>%
  add_p()
## i Column(s) enjoy_school are class "haven_labelled". This is an intermediate datastructure not meant for analysis. Convert columns with `haven::as_factor()`, `labelled::to_factor()`, `labelled::unlabelled()`, and `unclass()`. "haven_labelled" value labels are ignored when columns are not converted. Failure to convert may have unintended consequences or result in error.
## * https://haven.tidyverse.org/articles/semantics.html
## * https://larmarange.github.io/labelled/articles/intro_labelled.html#unlabelled
| Characteristic | No, N = 29,773,653¹ | Yes, N = 20,925,617¹ | p-value² |
|---|---|---|---|
| Child Enjoys School | | | <0.001 |
| 0 | 3,716,561 (12%) | 1,434,112 (6.9%) | |
| 1 | 26,057,092 (88%) | 19,491,505 (93%) | |

¹ n (%)
² chi-squared test with Rao & Scott's second-order correction
# for developmental delay
pfi19 %>%
  as_survey_design(strata = PSTRATUM,
                   weights = FPWT) %>%
  select(enjoy_school, dev_delay) %>%
  tbl_svysummary(by = dev_delay,
                 label = list(enjoy_school = "Child Enjoys School")) %>%
  add_p()
## i Column(s) enjoy_school are class "haven_labelled". This is an intermediate datastructure not meant for analysis. Convert columns with `haven::as_factor()`, `labelled::to_factor()`, `labelled::unlabelled()`, and `unclass()`. "haven_labelled" value labels are ignored when columns are not converted. Failure to convert may have unintended consequences or result in error.
## * https://haven.tidyverse.org/articles/semantics.html
## * https://larmarange.github.io/labelled/articles/intro_labelled.html#unlabelled
| Characteristic | No, N = 49,002,465¹ | Yes, N = 1,696,805¹ | p-value² |
|---|---|---|---|
| Child Enjoys School | | | <0.001 |
| 0 | 4,781,404 (9.8%) | 369,269 (22%) | |
| 1 | 44,221,061 (90%) | 1,327,536 (78%) | |

¹ n (%)
² chi-squared test with Rao & Scott's second-order correction
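The weighted counts in the two gtsummary tables can be checked against the svyby means reported earlier. The sketch below copies the counts from the tables (the object names are illustrative) and recomputes the share of children who do not enjoy school in each group.

```r
# Weighted counts of enjoy_school == 0, copied from the gtsummary tables
not_enjoy <- c(vol_no = 3716561, vol_yes = 1434112,
               delay_no = 4781404, delay_yes = 369269)

# Weighted group totals (the N in each column header)
group_n <- c(vol_no = 29773653, vol_yes = 20925617,
             delay_no = 49002465, delay_yes = 1696805)

# Percent not enjoying school, and the complementary proportion enjoying it
pct_not     <- 100 * not_enjoy / group_n
prop_enjoys <- 1 - not_enjoy / group_n

print(round(pct_not, 1))
print(round(prop_enjoys, 4))
```

The complementary proportions agree with the survey-design svyby estimates to about four decimal places, which is what one would expect if the point estimates depend only on the weights.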
Yes, the results differ depending on whether the survey design is used.
For both levels of the adult_volunteer independent variable, the proportion of students who enjoy school is higher with the survey design than without it (87.5% vs. 86.0% for No; 93.1% vs. 92.0% for Yes).
For students with a diagnosed developmental delay, the proportion who enjoy school is LOWER with the survey design than without it (78.2% vs. 80.3%), while for students without a diagnosis it is higher (90.2% vs. 88.8%).
As expected, the standard errors are larger in every group when the survey design is used; the gap is widest for students with a developmental delay (0.0246 vs. 0.0165).
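To put a number on the standard error comparison, the squared ratio of the two standard errors can be computed for each group. This is only a rough analogue of a design effect, offered as an assumption rather than a textbook calculation, because the no-design analysis is also unweighted and so its point estimates differ as well. The values are copied from the four svyby tables, and the object names are illustrative.

```r
# Standard errors with the survey design (from the svyby tables)
se_design <- c(vol_no = 0.0051769, vol_yes = 0.0038289,
               delay_no = 0.0034473, delay_yes = 0.0245828)

# Standard errors without the survey design
se_nodesign <- c(vol_no = 0.0036515, vol_yes = 0.0033327,
                 delay_no = 0.0025633, delay_yes = 0.0165013)

# Squared ratio of standard errors = ratio of estimated variances
var_ratio <- (se_design / se_nodesign)^2

print(round(var_ratio, 2))
```

A ratio above 1 means the design-based variance is larger, so ignoring the design would overstate the precision of that estimate.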