Mother-child interactions were studied and the mother’s comments were classified as either statements or questions. The child’s vocabulary was evaluated and scored. A comparison between the mother’s comment type and the child’s vocabulary score was undertaken.
foi <- "/Users/mqbxgjk2/Desktop/Lily_Project/data.txt"
dat <- read.table(foi, sep = "\t", header = TRUE)
head(dat)
## ID MOT_questions MOT_statements CHI_VoCD
## 1 45 53 31 11.95
## 2 47 38 31 39.37
## 3 49 81 44 55.53
## 4 55 39 57 29.64
## 5 25 36 47 58.90
## 6 57 43 90 44.24
The columns in the data are the pair identification number (‘ID’), comments by the mother posed as questions (‘MOT_questions’), and comments posed as statements (‘MOT_statements’). The child’s vocabulary score is contained in the column ‘CHI_VoCD’.
Next we examine the distribution of the data. Below are some box-and-whisker plots (boxplots). These are interpreted as follows: the box represents the interquartile range (IQR), the median is indicated by a heavy solid line within the box, and each whisker extends to the furthest data point that lies within 1.5 times the IQR of the nearest quartile, but no further. Points outside the whiskers are identified as outliers.
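titles <- NULL
# Stack the three boxplots vertically, one panel per variable
par(las = 1, bty = 'n', mfrow = c(3, 1))
for (ii in 2:4) {
  # Turn the column name into a readable axis label
  tmp <- gsub("_", " ", colnames(dat)[ii])
  titles <- c(titles, tmp)
  boxplot(dat[, ii], horizontal = TRUE, xlab = tmp, width = 1.5)
  print(summary(dat[, ii]))
}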
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 18.0 31.0 43.0 51.7 70.0 99.0
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 6.00 25.75 44.50 44.70 52.50 125.00
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 11.95 33.90 44.63 44.90 57.05 78.19
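The whisker rule described for the boxplots can be checked by hand. The sketch below applies it to the quartiles printed by summary() for MOT_statements (the second summary). Note that boxplot() works from hinges, which can differ slightly from the summary() quartiles, so the limits here are approximate. The vector `toy` is made up purely to illustrate boxplot.stats() and is not the study data.

```r
# Quartiles and maximum of MOT_statements, copied from the summary() output
q1 <- 25.75
q3 <- 52.50
max_value <- 125

iqr <- q3 - q1                    # interquartile range
upper_limit <- q3 + 1.5 * iqr     # furthest an upper whisker may reach
lower_limit <- q1 - 1.5 * iqr     # furthest a lower whisker may reach

# TRUE means the maximum lies beyond the upper whisker limit
max_value > upper_limit

# boxplot.stats() applies the same rule (using hinges);
# 'toy' is a hypothetical vector for illustration only
toy <- c(6, 20, 25, 26, 40, 44, 45, 50, 53, 60, 125)
toy_stats <- boxplot.stats(toy, coef = 1.5)
toy_stats$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
toy_stats$out     # values that would be drawn as individual points
```

Under this approximation the largest MOT_statements value (125) falls beyond the upper limit, so it would be expected to appear as an individual point in the plot.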
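First we examine how the mother’s questions relate to the child’s vocabulary score, and then her statements. We see that neither slope is significantly different from zero (p = 0.636 for questions, p = 0.162 for statements).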
par(las =1, bty ='n', mfrow=c(1,2))
plot(dat$MOT_questions, dat$CHI_VoCD, xlab = titles[1], ylab = titles[3], xlim=c(0, 130))
mod1 <- lm(dat$CHI_VoCD~dat$MOT_questions)
abline(mod1)
print(titles[1])
## [1] "MOT questions"
summary(mod1)
##
## Call:
## lm(formula = dat$CHI_VoCD ~ dat$MOT_questions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -33.051 -11.369 -1.565 9.762 33.962
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.90201 9.21555 4.438 0.000317 ***
## dat$MOT_questions 0.07734 0.16066 0.481 0.636036
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 17.85 on 18 degrees of freedom
## Multiple R-squared: 0.01271, Adjusted R-squared: -0.04214
## F-statistic: 0.2317 on 1 and 18 DF, p-value: 0.636
plot(dat$MOT_statements, dat$CHI_VoCD,xlab = titles[2], ylab = titles[3], xlim=c(0, 130))
mod2 <- lm(dat$CHI_VoCD~dat$MOT_statements)
abline(mod2)
print(titles[2])
## [1] "MOT statements"
summary(mod2)
##
## Call:
## lm(formula = dat$CHI_VoCD ~ dat$MOT_statements)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.195 -9.930 -2.560 6.466 38.257
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35.9114 7.2463 4.956 0.000102 ***
## dat$MOT_statements 0.2011 0.1380 1.457 0.162373
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.99 on 18 degrees of freedom
## Multiple R-squared: 0.1055, Adjusted R-squared: 0.05578
## F-statistic: 2.123 on 1 and 18 DF, p-value: 0.1624
Then we compare the mother’s statements with her questions.
par(las =1, bty ='n', mfrow=c(1,1))
plot(dat$MOT_statements, dat$MOT_questions,xlab = titles[2], ylab = titles[1],xlim=c(0, 130),ylim=c(0, 130))
mod3 <- lm(dat$MOT_questions~dat$MOT_statements)
summary(mod3)
##
## Call:
## lm(formula = dat$MOT_questions ~ dat$MOT_statements)
##
## Residuals:
## Min 1Q Median 3Q Max
## -34.164 -16.664 -8.089 17.175 47.193
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 35.7408 10.2578 3.484 0.00265 **
## dat$MOT_statements 0.3570 0.1954 1.827 0.08430 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 24.06 on 18 degrees of freedom
## Multiple R-squared: 0.1565, Adjusted R-squared: 0.1096
## F-statistic: 3.339 on 1 and 18 DF, p-value: 0.0843
abline(mod3)
Then we carry out a linear regression of the vocabulary score on both predictors.
mod <- lm(dat[,4] ~dat$MOT_statements+dat$MOT_questions)
summary(mod)
##
## Call:
## lm(formula = dat[, 4] ~ dat$MOT_statements + dat$MOT_questions)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.116 -10.216 -2.103 6.830 38.258
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 36.36835 9.64697 3.770 0.00153 **
## dat$MOT_statements 0.20566 0.15462 1.330 0.20105
## dat$MOT_questions -0.01278 0.17130 -0.075 0.94138
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 17.48 on 17 degrees of freedom
## Multiple R-squared: 0.1058, Adjusted R-squared: 0.000569
## F-statistic: 1.005 on 2 and 17 DF, p-value: 0.3866
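One way to ask whether a second predictor adds anything to a model is to compare the nested models with anova(). The sketch below is not part of the original analysis: it uses simulated data (the data frame `sim` and its column names are hypothetical) and the `data =` argument of lm(), which keeps the formulas shorter than the `dat$...` form used above.

```r
set.seed(1)  # simulated, hypothetical data; not the study data
n <- 20
sim <- data.frame(
  statements = runif(n, 5, 125),
  questions  = runif(n, 18, 100)
)
# The simulated response depends on statements only, plus noise
sim$vocd <- 36 + 0.2 * sim$statements + rnorm(n, sd = 17)

# Nested models: one predictor versus both predictors
fit_small <- lm(vocd ~ statements, data = sim)
fit_full  <- lm(vocd ~ statements + questions, data = sim)

# F test for the term added in the larger model
cmp <- anova(fit_small, fit_full)
print(cmp)

# For a single added term, the F statistic equals the squared t value
# of that term in the larger model
t_questions <- coef(summary(fit_full))["questions", "t value"]
all.equal(cmp$F[2], t_questions^2)
```

With one added term the comparison carries the same information as the t test for that coefficient in summary(), so for the model above it would reproduce the p-value already reported for MOT_questions.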
Correlations
The correlations are tested with cor.test in R, which tells you whether the observed correlation is statistically significant.
cor.test(dat$MOT_questions, dat$CHI_VoCD)
##
## Pearson's product-moment correlation
##
## data: dat$MOT_questions and dat$CHI_VoCD
## t = 0.4814, df = 18, p-value = 0.636
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3470951 0.5288772
## sample estimates:
## cor
## 0.1127425
print("WEAK")
## [1] "WEAK"
cor.test(dat$MOT_statements, dat$CHI_VoCD)
##
## Pearson's product-moment correlation
##
## data: dat$MOT_statements and dat$CHI_VoCD
## t = 1.4569, df = 18, p-value = 0.1624
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1375077 0.6708779
## sample estimates:
## cor
## 0.3247757
print("MODERATE to WEAK")
## [1] "MODERATE to WEAK"
cor.test(dat$MOT_questions, dat$MOT_statements)
##
## Pearson's product-moment correlation
##
## data: dat$MOT_questions and dat$MOT_statements
## t = 1.8272, df = 18, p-value = 0.0843
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.05694214 0.71322546
## sample estimates:
## cor
## 0.3955456
print("MODERATE")
## [1] "MODERATE"
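The t value and p-value that cor.test reports for a pair of variables match those of the slope in the corresponding simple regression, and the squared correlation matches the Multiple R-squared. As a rough check, the sketch below recomputes these quantities for MOT questions and CHI VoCD from the reported correlation alone, taking n = 20 because the tests report 18 degrees of freedom.

```r
# Values copied from the cor.test output for MOT_questions and CHI_VoCD
r <- 0.1127425
n <- 20                      # df = n - 2 = 18

# t statistic for testing a Pearson correlation against zero
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Squared correlation, comparable to Multiple R-squared in summary(mod1)
r_squared <- r^2

round(c(t = t_stat, p = p_value, r2 = r_squared), 4)
```

The rounded results can be compared with t = 0.4814 and p-value = 0.636 from cor.test, and with Multiple R-squared 0.01271 from the first regression.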