The IQ scores and behavioral problem scores of children at age 5 were examined according to whether or not their mothers had suffered an episode of post-natal depression. The main questions of interest were:
(1) Did the two groups of children have different IQ and/or behavioral problems?
(2) Was there any evidence of a relationship between IQ and behavioral problems?

Source: Dr. C. Kumar, Institute of Psychiatry, London.

Load in the data set and check its structure

  • Load in the data set and name it dta.
  • Find the class of dta.
[1] "data.frame"

dta is a data frame.

  • Check the structure of dta.
'data.frame':   94 obs. of  3 variables:
 $ Dep: Factor w/ 2 levels "D","N": 2 2 2 2 1 2 2 2 2 2 ...
 $ IQ : int  103 124 124 104 96 92 124 99 92 116 ...
 $ BP : int  4 12 9 3 3 3 6 4 3 9 ...

dta is a data frame with 94 observations and 3 variables: Dep, IQ, and BP. Dep is a factor with two levels, D and N. IQ and BP are numeric variables stored as integers.

  • Check the dimension of dta.
[1] 94  3

dta has 94 rows and 3 columns.

  • Find the names of the columns in dta.
[1] "Dep" "IQ"  "BP" 
  • Check if BP, a variable in dta, is a vector or not.
[1] TRUE

BP, a variable in dta, is a vector.

  • Take the \(1^{st}\) row of dta. The result is a one-row slice of dta.
  • Take the \(1^{st}\) to \(3^{rd}\) elements of IQ, a variable in dta. The result is a vector.
[1] 103 124 124
  • Sort dta in ascending order of the variable BP and take the last 6 rows. In other words, this returns the 6 rows with the highest BP scores.
  • Sort dta in descending order of the variable BP and take the last 4 rows. In other words, this returns the 4 rows with the lowest BP scores.
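
The inspection and slicing steps above can be sketched in base R as follows. The name and format of the original data file are not given here, so this sketch rebuilds only the first ten rows shown in the str() output. With the full data set, dta would instead be read from the actual file. The names top6 and low4 are illustrative and not part of the original analysis.

```r
# First ten observations, copied from the str() output above.
# Assumption: the full data set (94 rows) would be loaded from a file instead.
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N", "N", "N", "N", "N")),
  IQ  = c(103L, 124L, 124L, 104L, 96L, 92L, 124L, 99L, 92L, 116L),
  BP  = c(4L, 12L, 9L, 3L, 3L, 3L, 6L, 4L, 3L, 9L)
)

class(dta)          # class of the object
str(dta)            # variable types and first values
dim(dta)            # number of rows and columns
names(dta)          # column names
is.vector(dta$BP)   # TRUE: a single column is a plain vector

dta[1, ]            # first row (still a data frame)
dta$IQ[1:3]         # first three IQ values: 103 124 124

# Sort ascending by BP and keep the last rows: the highest BP scores
top6 <- tail(dta[order(dta$BP), ], 6)

# Sort descending by BP and keep the last rows: the lowest BP scores
low4 <- tail(dta[order(-dta$BP), ], 4)
```

Because order() returns row positions, indexing dta with it reorders whole rows, so Dep and IQ stay attached to their BP values.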



Data visualization

  • Draw the histogram of dta$IQ with the x-axis labeled "IQ" and without a title.

  • Draw the box plots of dta$BP in groups of dta$Dep with names of x-axis and y-axis.

Compared with children in the non-depression group, children in the depression group appeared to have higher levels of behavioral problems.
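
A minimal base R sketch of the histogram and grouped box plots is given below. It uses only the first ten rows shown in the str() output as a stand-in for the full data, and the axis labels of the box plot are assumed wording, since the original labels are not shown.

```r
# Stand-in: first ten observations from the str() output (not the full data)
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N", "N", "N", "N", "N")),
  IQ  = c(103L, 124L, 124L, 104L, 96L, 92L, 124L, 99L, 92L, 116L),
  BP  = c(4L, 12L, 9L, 3L, 3L, 3L, 6L, 4L, 3L, 9L)
)

# Histogram of IQ: x-axis labeled "IQ", empty main title
h <- hist(dta$IQ, xlab = "IQ", main = "")

# Box plots of BP for each level of Dep (axis labels are assumptions)
bp <- boxplot(BP ~ Dep, data = dta,
              xlab = "Maternal depression group (Dep)",
              ylab = "Behavioral problem score (BP)")
```

Both calls draw the plot and also return the underlying summary (bin counts and box statistics), which is why the results are assigned here.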

  • Draw the scatter plot of dta$IQ and dta$BP with different colors representing different groups of Dep. Set the point style and add a grid on the plot.

  • Draw the scatter plot of dta$BP and dta$IQ with point labels of Dep. Add the regression lines of y=BP, x=IQ with different types representing different groups of Dep.

The negative relationship (i.e., \(r<0\)) between IQ and behavioral problems appeared to be stronger in the depression group than in the non-depression group.
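
The two scatter plots can be sketched in base R as follows. Fitting a line per group needs at least two observations in each group, so this sketch runs on simulated stand-in values with the same column names as dta. These are not the study data, and the group sizes, point style, and axis labels are assumptions chosen only so that the code runs.

```r
# Simulated stand-in with the same columns as dta (NOT the study data)
set.seed(1)
dta <- data.frame(
  Dep = factor(rep(c("D", "N"), times = c(15, 79))),
  IQ  = round(rnorm(94, mean = 110, sd = 15)),
  BP  = rpois(94, lambda = 5)
)

# Scatter plot colored by group, with a filled point style and a grid
plot(BP ~ IQ, data = dta, col = as.integer(dta$Dep), pch = 16,
     xlab = "IQ", ylab = "Behavioral problems")
grid()

# Second plot: empty frame, then the group label drawn at each point
plot(BP ~ IQ, data = dta, type = "n",
     xlab = "IQ", ylab = "Behavioral problems")
text(dta$IQ, dta$BP, labels = as.character(dta$Dep), cex = 0.7)

# One least-squares line (y = BP, x = IQ) per group, told apart by line type
fits <- lapply(split(dta, dta$Dep), function(d) lm(BP ~ IQ, data = d))
for (i in seq_along(fits)) abline(fits[[i]], lty = i)
legend("topright", legend = names(fits), lty = seq_along(fits), title = "Dep")
```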



Hypothesis testing

For question (1): Independent two sample t-test

  1. Did the two groups of children have different IQ?

\(H_0: \mu_{IQ(D)} = \mu_{IQ(N)}\)
\(H_1: \mu_{IQ(D)} \neq \mu_{IQ(N)}\)


    Welch Two Sample t-test

data:  IQ by Dep
t = -1.6374, df = 15.53, p-value = 0.1216
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -26.926586   3.490299
sample estimates:
mean in group D mean in group N 
       101.0667        112.7848 

Since \(p > \alpha =.05\), we retain \(H_0\). The two groups of children did not differ significantly in IQ.

  2. Did the two groups of children have different behavioral problems?

\(H_0: \mu_{BP(D)} = \mu_{BP(N)}\)
\(H_1: \mu_{BP(D)} \neq \mu_{BP(N)}\)


    Welch Two Sample t-test

data:  BP by Dep
t = 1.4924, df = 17.14, p-value = 0.1538
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.6637017  3.8788916
sample estimates:
mean in group D mean in group N 
       7.000000        5.392405 

Since \(p > \alpha =.05\), we retain \(H_0\). The two groups of children did not differ significantly in behavioral problems.
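
The "IQ by Dep" and "BP by Dep" lines in the output above correspond to the formula interface of t.test(), which runs the Welch (unequal variance) version by default. A minimal sketch follows. It runs on simulated stand-in values, not the study data, so its numbers will not match the output above. The object names iq_test and bp_test are illustrative.

```r
# Simulated stand-in with the same columns as dta (NOT the study data)
set.seed(1)
dta <- data.frame(
  Dep = factor(rep(c("D", "N"), times = c(15, 79))),
  IQ  = round(rnorm(94, mean = 110, sd = 15)),
  BP  = rpois(94, lambda = 5)
)

# Two-sided Welch tests; var.equal = FALSE is the default
iq_test <- t.test(IQ ~ Dep, data = dta)
bp_test <- t.test(BP ~ Dep, data = dta)

iq_test                    # prints t, df, p-value, CI, and group means
bp_test$p.value < 0.05     # TRUE would mean rejecting H0 at alpha = .05
```

The Welch test does not assume equal variances in the two groups, which is why the degrees of freedom in the output are not whole numbers.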

For question (2): Correlation test and linear regression analysis

Was there any evidence of a relationship between IQ and behavioral problems?

Correlation test

\(H_0: \rho_{IQ, BP} = 0\)
\(H_1: \rho_{IQ, BP} \neq 0\)


    Pearson's product-moment correlation

data:  dta$IQ and dta$BP
t = -3.8088, df = 92, p-value = 0.0002518
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.5319037 -0.1798969
sample estimates:
       cor 
-0.3690615 

Since \(p < \alpha =.05\), we reject \(H_0\). IQ is significantly and negatively correlated with behavioral problems (\(r = -.37\)).
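
The test statistic in the output above follows from the sample correlation through \(t = r\sqrt{n-2}/\sqrt{1-r^2}\), with \(n-2 = 92\) degrees of freedom. As a check, the sketch below recomputes t and the two-sided p-value from the reported r alone. The variable names are illustrative. With the data at hand, the whole test is a single call to cor.test(dta$IQ, dta$BP).

```r
r  <- -0.3690615   # sample correlation reported above
df <- 92           # degrees of freedom reported above (n - 2)

# t statistic implied by r
t_from_r <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_from_t <- 2 * pt(-abs(t_from_r), df)

round(t_from_r, 4)    # should agree with the reported t up to rounding
signif(p_from_t, 4)   # should agree with the reported p-value up to rounding
```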

Regression analysis

\(H_0: \beta_{IQ} = 0\)
\(H_1: \beta_{IQ} \neq 0\)


Call:
lm(formula = BP ~ IQ, data = dta)

Residuals:
    Min      1Q  Median      3Q     Max 
-5.9828 -2.3564 -0.4111  2.1210  7.2399 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 13.18280    2.00180   6.585 2.76e-09 ***
IQ          -0.06792    0.01783  -3.809 0.000252 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.983 on 92 degrees of freedom
Multiple R-squared:  0.1362,    Adjusted R-squared:  0.1268 
F-statistic: 14.51 on 1 and 92 DF,  p-value: 0.0002518

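Since \(p < \alpha =.05\), we reject \(H_0\). IQ is significantly associated with behavioral problems: each additional IQ point is associated with a decrease of about 0.07 in the behavioral problem score.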
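
The fitted line can be used for quick predictions, and in a regression with a single predictor its summary statistics tie back to the correlation test: \(R^2 = r^2\) and \(F = t^2\). The sketch below works only from the coefficients and statistics reported above. The helper predict_bp() and the IQ values 100 and 120 are illustrative choices, not part of the original analysis.

```r
# Coefficients reported in the regression output above
b0 <- 13.18280   # intercept
b1 <- -0.06792   # slope for IQ

# Hypothetical helper: predicted BP score for a given IQ
predict_bp <- function(iq) b0 + b1 * iq

predict_bp(c(100, 120))   # predicted BP at IQ = 100 and IQ = 120

# Links to the correlation test (values reported above)
r       <- -0.3690615
t_slope <- -3.809
r^2         # matches Multiple R-squared up to rounding
t_slope^2   # matches the F-statistic up to rounding
```

With the data available, predict(lm(BP ~ IQ, data = dta), newdata = data.frame(IQ = c(100, 120))) would give the same predictions without retyping the coefficients.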