Family Stress and Children’s Speech Difficulties

A linear regression analysis on their relationship

Shuyu Huang - s3743291

Last updated: 14 October, 2018

Introduction

Problem Statement

Data

Data Source

Sampling Method

Data Cont.

Variables

Preprocessing

Descriptive Statistics and Visualisation

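library(readr)    # read_csv(), cols(), col_number()
library(magrittr) # %>% pipe used on the following slides
SEHQ <- read_csv("/Users/raina/Documents/SEHQ.csv",
                 col_types = cols(Speech = col_number(), Stress = col_number()))
# Scatter plot of the untransformed scores
plot(Speech ~ Stress, data = SEHQ)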

Descriptive Statistics and Visualisation Cont.

par(mfrow=c(1,2))
SEHQ$Speech %>% hist(main="Speech")
SEHQ$Stress %>% hist(main="Stress")

Descriptive Statistics and Visualisation Cont.

par(mfrow=c(1,2))
SEHQ$Speech %>% sqrt() %>% hist(main="sqrt(Speech)")
SEHQ$Stress %>% sqrt() %>% hist(main="sqrt(Stress)")

Descriptive Statistics and Visualisation Cont.

plot(sqrt(Speech)~sqrt(Stress),data=SEHQ,ylim=c(0.1,0.7),xlim=c(0.1,0.6))

Hypothesis Testing

model <- lm(sqrt(Speech) ~ sqrt(Stress), data = SEHQ)
model %>% summary()
## 
## Call:
## lm(formula = sqrt(Speech) ~ sqrt(Stress), data = SEHQ)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.116567 -0.031320 -0.003927  0.028647  0.142236 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.21524    0.01814  11.863   <2e-16 ***
## sqrt(Stress)  0.52764    0.05377   9.814   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.04529 on 545 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.1502, Adjusted R-squared:  0.1486 
## F-statistic: 96.31 on 1 and 545 DF,  p-value: < 2.2e-16
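
The quantities in the summary table are linked by simple arithmetic, which can be checked by hand. The lines below are a minimal sketch that retypes the rounded values printed above, so the results agree with the output only up to rounding. The variable names are illustrative and not part of the original analysis.

```r
# Values copied (rounded) from the printed summary() output
estimate <- 0.52764   # slope for sqrt(Stress)
std_err  <- 0.05377   # standard error of the slope
df_resid <- 545       # residual degrees of freedom
f_stat   <- 96.31     # reported F-statistic

# t value = estimate / standard error
t_value <- estimate / std_err

# With a single predictor, the F-statistic equals the squared t value
f_from_t <- t_value^2

# R-squared recovered from F on 1 and df_resid degrees of freedom
r_squared <- f_stat / (f_stat + df_resid)

# Two-sided p-value for the slope
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

round(c(t = t_value, F = f_from_t, R2 = r_squared), 4)
```

Because the model has one predictor, the t test on the slope and the overall F test are the same test, which is why both report the same p-value.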

Hypothesis Testing Cont.

model %>% confint()
##                  2.5 %    97.5 %
## (Intercept)  0.1796031 0.2508817
## sqrt(Stress) 0.4220293 0.6332534
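
The slope interval reported by `confint()` can be approximated by hand as estimate ± critical t × standard error. This is a sketch using the rounded estimate and standard error from the summary output, so it matches the printed limits only to about four decimal places. The variable names are illustrative.

```r
# Values copied (rounded) from the printed summary() output
estimate <- 0.52764   # slope for sqrt(Stress)
std_err  <- 0.05377   # standard error of the slope
df_resid <- 545       # residual degrees of freedom

# Critical value of the t distribution for a two-sided 95% interval
t_crit <- qt(0.975, df = df_resid)

# Lower and upper confidence limits for the slope
ci_slope <- estimate + c(-1, 1) * t_crit * std_err
ci_slope
```

The interval excludes zero, which is consistent with the small p-value for the slope.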

Hypothesis Testing Cont.

Hypothesis Testing Cont.

# Residuals vs Fitted: checks the linearity assumption
model %>% plot(which = 1)

Hypothesis Testing Cont.

# Normal Q-Q: checks normality of the residuals
model %>% plot(which = 2)

Hypothesis Testing Cont.

# Scale-Location: checks homogeneity of residual variance
model %>% plot(which = 3)

Hypothesis Testing Cont.

# Residuals vs Leverage: flags influential cases (Cook's distance contours)
model %>% plot(which = 5)

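Hypothesis Testing Cont.

# Pearson correlation between the transformed variables (complete cases only)
r <- cor(sqrt(SEHQ$Speech), sqrt(SEHQ$Stress), use = "complete.obs")
r
## [1] 0.3875291
# 95% CI for r; CIr() is from the psychometric package, n = 547 complete cases
psychometric::CIr(r = r, n = 547, level = .95)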
## [1] 0.3138915 0.4565325
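
A confidence interval for a correlation is commonly built with the Fisher z transformation. The sketch below applies that approach in base R to the printed values of r and n and gives limits that agree with the interval above. It is offered as an illustration of the technique, not as a description of the package internals, and the variable names are illustrative.

```r
# Values copied from the output above
r <- 0.3875291   # sample correlation
n <- 547         # number of complete observations

# Fisher z transformation of r and its approximate standard error
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)

# 95% limits on the z scale, transformed back to the correlation scale
z_crit <- qnorm(0.975)
ci_r   <- tanh(z + c(-1, 1) * z_crit * se_z)
ci_r
```

The interval is built on the z scale because the sampling distribution of z is close to normal, whereas that of r is skewed when the correlation is not near zero.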

Hypothesis Testing - Interpretation
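
Because the model was fitted on square-root scales, a fitted value has to be squared to return it to the original Speech scale. The following is a minimal sketch using the rounded coefficients from the summary output. The Stress score of 0.09 is a hypothetical input chosen only for illustration, and the variable names are not part of the original analysis.

```r
# Coefficients copied (rounded) from the printed summary() output
intercept <- 0.21524
slope     <- 0.52764

# Hypothetical Stress score, chosen for illustration only
stress_new <- 0.09

# Fitted value on the square-root scale
sqrt_speech_hat <- intercept + slope * sqrt(stress_new)

# Back-transform to the original Speech scale by squaring
speech_hat <- sqrt_speech_hat^2
speech_hat
```

On the transformed scale, the slope means that a one-unit increase in sqrt(Stress) is associated with an estimated 0.53 increase in sqrt(Speech).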

Discussion

References