IQ and behavior problem

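# stringsAsFactors = TRUE keeps Dep a factor on R >= 4.0, matching the output below
dta <- read.table("IQ_Beh.txt", header = TRUE, row.names = 1, stringsAsFactors = TRUE)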

Check the structure of the data and the type of each variable.

str(dta)
## 'data.frame':    94 obs. of  3 variables:
##  $ Dep: Factor w/ 2 levels "D","N": 2 2 2 2 1 2 2 2 2 2 ...
##  $ IQ : int  103 124 124 104 96 92 124 99 92 116 ...
##  $ BP : int  4 12 9 3 3 3 6 4 3 9 ...

View the first six rows.

head(dta)
##   Dep  IQ BP
## 1   N 103  4
## 2   N 124 12
## 3   N 124  9
## 4   N 104  3
## 5   D  96  3
## 6   N  92  3

Check the class of the object.

class(dta)
## [1] "data.frame"

Show the number of cases (rows) and variables (columns).

dim(dta)
## [1] 94  3

Show the variable names.

names(dta)
## [1] "Dep" "IQ"  "BP"

Check whether the BP column is a plain vector.

is.vector(dta$BP)
## [1] TRUE
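
The check above returns TRUE because BP is a plain integer column. As a minimal sketch of how `is.vector()` behaves on different column types, the block below uses a small hypothetical data frame `toy` (made up for illustration, not the real `dta`). A factor carries `levels` and `class` attributes, so `is.vector()` returns FALSE for it; `is.numeric()` and `is.factor()` are usually the more informative checks.

```r
# Hypothetical toy data frame with the same column types as dta
toy <- data.frame(Dep = factor(c("N", "D", "N")),
                  IQ  = c(103L, 96L, 124L),
                  BP  = c(4L, 3L, 12L))

is.vector(toy$BP)    # TRUE: a plain integer vector with no extra attributes
is.vector(toy$Dep)   # FALSE: a factor has levels and class attributes
is.numeric(toy$BP)   # TRUE
is.factor(toy$Dep)   # TRUE
```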

View the first row.

dta[1,]
##   Dep  IQ BP
## 1   N 103  4

View IQ for the first three rows.

dta[1:3,"IQ"]
## [1] 103 124 124

View the last six rows after sorting by BP in ascending order (the six highest BP scores).

tail(dta[order(dta$BP), ])
##    Dep  IQ BP
## 16   N  89 11
## 58   N 117 11
## 66   N 126 11
## 2    N 124 12
## 73   D  99 13
## 12   D  22 17

View the last four rows after sorting by BP in descending order (the four lowest BP scores).

tail(dta[order(-dta$BP), ], 4)
##    Dep  IQ BP
## 77   N 124  1
## 80   N 121  1
## 24   N 106  0
## 75   N 122  0
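
The two sorts above rely on `order()` returning row indices. The sketch below shows two variations on a small hypothetical data frame `toy` (values made up for illustration): `decreasing = TRUE` as an alternative to negating the key, and a second key to break ties.

```r
# Hypothetical toy data; not the real dta
toy <- data.frame(Dep = c("N", "D", "N", "D"),
                  IQ  = c(103, 96, 124, 99),
                  BP  = c(4, 3, 12, 3))

# Descending BP; for a numeric key this matches order(-toy$BP)
by_bp_desc <- toy[order(toy$BP, decreasing = TRUE), ]
by_bp_desc$BP   # 12 4 3 3

# Ascending BP, with ties broken by descending IQ
by_bp_iq <- toy[order(toy$BP, -toy$IQ), ]
by_bp_iq$IQ     # 99 96 103 124
```

Negating a key only works for numeric columns; `decreasing = TRUE` also works for character keys.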

Histogram

with(dta, hist(IQ, xlab = "IQ", main = ""))

Box plot

with(dta, boxplot(BP ~ Dep, xlab = "Depression", ylab = "Behavior problem score"))

Scatter plot

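# Color points by group; as.integer(factor(Dep)) also works if Dep was read as character
with(dta, plot(IQ ~ BP, pch = 20, col = as.integer(factor(Dep)),
     xlab = "Behavior problem score", ylab = "IQ"))
grid()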

Regression lines

with(dta, plot(BP ~ IQ, type = "n", xlab = "IQ", ylab = "Behavior problem"))
text(dta$IQ, dta$BP, labels = dta$Dep, cex = 0.5)
abline(lm(BP ~ IQ, data = dta, subset = Dep == "D"))
abline(lm(BP ~ IQ, data = dta, subset = Dep == "N"), lty = 2)
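
The two group-specific lines can also be obtained from a single model with an IQ by Dep interaction. The block below is a sketch of that equivalence on simulated data, since the real file is not reproduced here. The data frame `sim`, the group sizes, and the generating coefficients are all arbitrary assumptions, so the numbers say nothing about the real sample.

```r
# Simulated stand-in for dta; all values are arbitrary
set.seed(2)
sim <- data.frame(Dep = factor(rep(c("D", "N"), times = c(20, 74))),
                  IQ  = round(rnorm(94, mean = 108, sd = 14)))
sim$BP <- 12 - 0.06 * sim$IQ + rnorm(94, sd = 3)

# Separate fits, as in the plot above
fit_d <- lm(BP ~ IQ, data = sim, subset = Dep == "D")
fit_n <- lm(BP ~ IQ, data = sim, subset = Dep == "N")

# One model with group-specific intercepts and slopes
fit_int <- lm(BP ~ IQ * Dep, data = sim)
b <- coef(fit_int)

slope_d <- unname(b["IQ"])                 # slope in the reference group D
slope_n <- unname(b["IQ"] + b["IQ:DepN"])  # slope in group N

# These match the slopes from the separate fits
c(D = slope_d, N = slope_n)
c(D = unname(coef(fit_d)["IQ"]), N = unname(coef(fit_n)["IQ"]))
```

In the interaction model, the `IQ:DepN` row of `summary(fit_int)` gives a test of whether the two slopes differ.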

  1. Did the two groups of children have different IQ and/or behavioral problems? Run a MANOVA test.

options(digits = 4, show.signif.stars = FALSE)
summary(manova(cbind(BP, IQ) ~ Dep, dta), test = "Wilks")
##           Df Wilks approx F num Df den Df Pr(>F)
## Dep        1 0.928     3.53      2     91  0.033
## Residuals 92

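The two groups differ significantly on IQ and BP considered jointly (Wilks' lambda = 0.928, approximate F(2, 91) = 3.53, p = 0.033). The multivariate test alone does not indicate which of the two outcomes drives the difference.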
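
A common follow-up to a significant MANOVA is one univariate ANOVA per outcome, which `summary.aov()` produces from the fitted `manova` object. The block below is a sketch on simulated data, not the real `dta`. The data frame `sim`, the group sizes, and the group means are arbitrary assumptions, so its output should not be read as a result for this sample.

```r
# Simulated stand-in for dta; group sizes and means are arbitrary
set.seed(1)
sim <- data.frame(Dep = factor(rep(c("D", "N"), times = c(20, 74))))
sim$IQ <- round(rnorm(94, mean = ifelse(sim$Dep == "D", 100, 112), sd = 14))
sim$BP <- round(pmax(0, rnorm(94, mean = ifelse(sim$Dep == "D", 7, 5), sd = 3)))

fit <- manova(cbind(BP, IQ) ~ Dep, data = sim)

# Multivariate test, as in the analysis above
summary(fit, test = "Wilks")

# One univariate ANOVA table per response (BP, then IQ)
uni <- summary.aov(fit)
uni
```

With the real data, the same two calls on `manova(cbind(BP, IQ) ~ Dep, dta)` would show whether IQ, BP, or both differ between the groups.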

  2. Was there any evidence of a relationship between IQ and behavioral problems?

summary(lm(BP ~ IQ, data = dta))
## 
## Call:
## lm(formula = BP ~ IQ, data = dta)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -5.983 -2.356 -0.411  2.121  7.240 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  13.1828     2.0018    6.59  2.8e-09
## IQ           -0.0679     0.0178   -3.81  0.00025
## 
## Residual standard error: 2.98 on 92 degrees of freedom
## Multiple R-squared:  0.136,  Adjusted R-squared:  0.127 
## F-statistic: 14.5 on 1 and 92 DF,  p-value: 0.000252

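IQ is a significant negative predictor of BP: each one-point increase in IQ is associated with a decrease of about 0.068 points in the behavior problem score (t = -3.81, p = 0.00025). IQ accounts for about 13.6% of the variance in BP.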
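
To make the size of the slope concrete, the sketch below plugs the coefficients reported in the output above into the fitted line. The IQ values 80, 100, and 120 are chosen only for illustration, and the names `b0`, `b1`, `iq`, `pred`, and `r` are introduced here.

```r
# Coefficients copied from the regression output above
b0 <- 13.1828   # intercept
b1 <- -0.0679   # slope for IQ

# Illustrative IQ values (not taken from the data)
iq <- c(80, 100, 120)

# Predicted behavior problem score at each IQ value
pred <- b0 + b1 * iq
pred         # 7.7508 6.3928 5.0348

# A 20-point difference in IQ corresponds to about 1.36 BP points
diff(pred)

# In simple regression the correlation is the signed square root of R-squared
r <- sign(b1) * sqrt(0.136)
r            # about -0.37
```

So a 20-point difference in IQ corresponds to a difference of about 1.4 points in predicted BP, and the IQ and BP correlation is about -0.37.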