The IQ scores and behavior problem scores of children at age 5 were compared according to whether their mothers had suffered an episode of postnatal depression.
Read in the data
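# stringsAsFactors = TRUE keeps Dep a factor in R >= 4.0.0, matching the output below
dta <- read.table("D:/201802/data_management/w3/IQ_Beh.txt", header = TRUE, row.names = 1, stringsAsFactors = TRUE)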
str(dta)
'data.frame': 94 obs. of 3 variables:
$ Dep: Factor w/ 2 levels "D","N": 2 2 2 2 1 2 2 2 2 2 ...
$ IQ : int 103 124 124 104 96 92 124 99 92 116 ...
$ BP : int 4 12 9 3 3 3 6 4 3 9 ...
head(dta)
  Dep  IQ BP
1   N 103  4
2   N 124 12
3   N 124  9
4   N 104  3
5   D  96  3
6   N  92  3
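The plotting code further down passes `dta$Dep` to `col`, which relies on `Dep` being a factor as shown by `str()`. Since R 4.0.0, `read.table()` leaves strings as character unless `stringsAsFactors = TRUE` is given. A minimal sketch of the difference, using only the six rows printed above as inline text (the names `txt`, `demo_chr` and `demo_fct` are hypothetical and stand in for the full file):

```r
# Six rows copied from the head(dta) output; not the full data set
txt <- "Dep IQ BP
1 N 103 4
2 N 124 12
3 N 124 9
4 N 104 3
5 D 96 3
6 N 92 3"

# The header has one fewer field than the data rows,
# so read.table() uses the first column as row names
demo_chr <- read.table(text = txt, header = TRUE)
demo_fct <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

class(demo_chr$Dep)   # "character" in R >= 4.0.0, "factor" in older versions
class(demo_fct$Dep)   # "factor" in either case
levels(demo_fct$Dep)  # levels are sorted alphabetically: D before N
```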
class(dta)
[1] "data.frame"
dim(dta)
[1] 94 3
names(dta)
[1] "Dep" "IQ" "BP"
is.vector(dta$BP)
[1] TRUE
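`is.vector()` is TRUE only for a vector that carries no attributes other than names, which is why it holds for the integer column `BP`. A short sketch of how this differs for a factor column and for single-bracket indexing, assuming a small stand-in data frame built from the six rows of `head(dta)` (the name `demo` is hypothetical):

```r
# Six rows copied from head(dta); `demo` is a stand-in for dta
demo <- data.frame(Dep = factor(c("N", "N", "N", "N", "D", "N")),
                   IQ  = c(103L, 124L, 124L, 104L, 96L, 92L),
                   BP  = c(4L, 12L, 9L, 3L, 3L, 3L))

is.vector(demo$BP)                # plain integer vector
is.vector(demo$Dep)               # a factor has class and levels attributes
is.data.frame(demo["BP"])         # single brackets keep the data frame
identical(demo$BP, demo[["BP"]])  # $ and [[ ]] extract the same column
```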
dta[1, ]
  Dep  IQ BP
1   N 103  4
dta[1:3, "IQ"]
[1] 103 124 124
tail(dta[order(dta$BP), ])
   Dep  IQ BP
16   N  89 11
58   N 117 11
66   N 126 11
2    N 124 12
73   D  99 13
12   D  22 17
tail(dta[order(-dta$BP), ], 4)
   Dep  IQ BP
77   N 124  1
80   N 121  1
24   N 106  0
75   N 122  0
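`order()` returns row positions rather than sorted values, and negating a numeric key reverses the direction. That is why `tail()` of the ascending sort shows the largest `BP` scores, while `tail()` of the descending sort shows the smallest. A small sketch on a stand-in data frame built from the six rows of `head(dta)` (the name `demo` is hypothetical):

```r
# Six rows copied from head(dta); `demo` is a stand-in for dta
demo <- data.frame(Dep = factor(c("N", "N", "N", "N", "D", "N")),
                   IQ  = c(103L, 124L, 124L, 104L, 96L, 92L),
                   BP  = c(4L, 12L, 9L, 3L, 3L, 3L))

idx <- order(demo$BP)   # row positions, smallest BP first
demo[idx, ]

# The row with the largest BP, reached from either direction
top_asc  <- tail(demo[order(demo$BP), ], 1)
top_desc <- head(demo[order(-demo$BP), ], 1)

# decreasing = TRUE gives the same sorted values as negating the key
same_values <- identical(demo$BP[order(-demo$BP)],
                         demo$BP[order(demo$BP, decreasing = TRUE)])
```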
Plots
# histogram of IQ
with(dta, hist(IQ, xlab = "IQ", main = ""))
# boxplot of behavior problem score by depression status
boxplot(BP ~ Dep, data = dta,
        xlab = "Depression",
        ylab = "Behavior problem score")
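# scatter plot of IQ against behavior problem score,
# colored by depression status (factor level 1 = D, level 2 = N)
plot(IQ ~ BP, data = dta, pch = 20, col = dta$Dep,
     xlab = "Behavior problem score", ylab = "IQ")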
grid()
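# separate regression lines of BP on IQ for each depression group
# empty plot first, then label each point with its group (D or N)
plot(BP ~ IQ, data = dta, type = "n",
     ylab = "Behavior problem score", xlab = "IQ")
text(dta$IQ, dta$BP, labels = dta$Dep, cex = 0.5)
# solid line: depressed group (D)
abline(lm(BP ~ IQ, data = dta, subset = Dep == "D"))
# dashed line: non-depressed group (N)
abline(lm(BP ~ IQ, data = dta, subset = Dep == "N"), lty = 2)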
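The two subset fits above are equivalent to a single model with a group-specific intercept and slope, which also gives a direct test of whether the slopes differ. The sketch below illustrates this on simulated data only, because the original file is not available here. The data frame `sim` and every value in it are made up for illustration and are not the IQ_Beh data.

```r
# Simulated data for illustration only; NOT the IQ_Beh data
set.seed(1)
sim <- data.frame(Dep = factor(rep(c("D", "N"), times = c(15, 45))),
                  IQ  = round(rnorm(60, mean = 105, sd = 15)))
sim$BP <- pmax(0, round(10 - 0.05 * sim$IQ + rnorm(60, sd = 2)))

# Separate fits, as in the two abline() calls
fit_D <- lm(BP ~ IQ, data = sim, subset = Dep == "D")
fit_N <- lm(BP ~ IQ, data = sim, subset = Dep == "N")

# One model with an IQ-by-group interaction (D is the reference level)
fit_int <- lm(BP ~ IQ * Dep, data = sim)
b <- coef(fit_int)

# Group slopes recovered from the interaction model
slope_D <- unname(b["IQ"])
slope_N <- unname(b["IQ"] + b["IQ:DepN"])

all.equal(slope_D, unname(coef(fit_D)["IQ"]))
all.equal(slope_N, unname(coef(fit_N)["IQ"]))
```

In `summary(fit_int)`, the row for `IQ:DepN` tests whether the two slopes differ. The same formula could be applied to `dta` in place of `sim`.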
# MANOVA: do BP and IQ jointly differ by depression status?
summary(manova(cbind(BP, IQ) ~ Dep, data = dta))
          Df   Pillai approx F num Df den Df  Pr(>F)  
Dep        1 0.072039   3.5322      2     91 0.03331 *
Residuals 92                                          
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
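With a one-degree-of-freedom term such as `Dep`, the approximate F reported for Pillai's trace V reduces to V / (1 - V) multiplied by den Df / num Df. A quick arithmetic check using only the numbers printed above (the variable names are illustrative):

```r
# Values copied from the MANOVA output
V      <- 0.072039  # Pillai's trace
num_df <- 2         # number of response variables
den_df <- 91

# F approximation for a single-df hypothesis
F_approx <- V / (1 - V) * den_df / num_df
round(F_approx, 4)   # compare with the reported approx F of 3.5322

# Upper-tail probability of the F distribution
p_value <- pf(F_approx, num_df, den_df, lower.tail = FALSE)
round(p_value, 5)    # compare with the reported Pr(>F) of 0.03331
```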
# simple linear regression of IQ on behavior problem score
summary(lm(IQ ~ BP, data = dta))
Call:
lm(formula = IQ ~ BP, data = dta)
Residuals:
    Min      1Q  Median      3Q     Max 
-66.153  -9.223   0.773   8.788  28.789 
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 122.2425     3.4115  35.832  < 2e-16 ***
BP           -2.0053     0.5265  -3.809 0.000252 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 16.21 on 92 degrees of freedom
Multiple R-squared: 0.1362, Adjusted R-squared: 0.1268
F-statistic: 14.51 on 1 and 92 DF, p-value: 0.0002518
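The pieces of this summary are tied together by simple identities: with a single predictor, the F statistic is the square of the slope's t value, and R-squared equals F / (F + residual df). The sketch below checks these from the printed numbers and also computes the residual for observation 12 (IQ 22, BP 17, as listed in the sorted output earlier). Its value is close to the minimum residual, which suggests that this child is the most extreme point in the fit. Whether an IQ of 22 is a recording error cannot be judged from the output alone.

```r
# Values copied from the regression output
b0     <- 122.2425   # intercept
b1     <- -2.0053    # slope for BP
t_BP   <- -3.809
F_rep  <- 14.51
df_res <- 92

t_BP^2                         # compare with the F-statistic, 14.51
r2 <- F_rep / (F_rep + df_res) # compare with Multiple R-squared, 0.1362
r2

# Fitted IQ at BP = 0 and BP = 10: roughly a 2-point drop per BP point
b0 + b1 * c(0, 10)

# Residual for observation 12 (IQ = 22, BP = 17)
resid_12 <- 22 - (b0 + b1 * 17)
resid_12                       # compare with the minimum residual, -66.153
```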