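Load the data and inspect the structure and variables of the IQ_Beh file.
# stringsAsFactors = TRUE keeps Dep as a factor on R >= 4.0, matching the output shown here
dta <- read.table("IQ_Beh.txt", header = TRUE, row.names = 1,
                  stringsAsFactors = TRUE)
# Inspect the structure of the data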
str(dta)
## 'data.frame': 94 obs. of 3 variables:
## $ Dep: Factor w/ 2 levels "D","N": 2 2 2 2 1 2 2 2 2 2 ...
## $ IQ : int 103 124 124 104 96 92 124 99 92 116 ...
## $ BP : int 4 12 9 3 3 3 6 4 3 9 ...
# Show the first six rows
head(dta)
## Dep IQ BP
## 1 N 103 4
## 2 N 124 12
## 3 N 124 9
## 4 N 104 3
## 5 D 96 3
## 6 N 92 3
# Check the class of the object
class(dta)
## [1] "data.frame"
# Check the dimensions (rows, columns)
dim(dta)
## [1] 94 3
# List the variable names
names(dta)
## [1] "Dep" "IQ" "BP"
Next, inspect the individual variables more closely.
# Check whether BP is stored as a plain vector
is.vector(dta$BP)
## [1] TRUE
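As a side note on what `is.vector()` reports: it returns TRUE only for a vector that carries no attributes other than names, so a factor column such as Dep would return FALSE even though it is one-dimensional. The sketch below illustrates this on a small made-up data frame (`toy` and its values are hypothetical, not taken from IQ_Beh.txt).

```r
# Hypothetical toy data, only to illustrate is.vector() on different column types
toy <- data.frame(Dep = factor(c("N", "D", "N")),
                  BP  = c(4L, 12L, 9L))

bp_is_vector  <- is.vector(toy$BP)   # TRUE: plain integer vector, no attributes
dep_is_vector <- is.vector(toy$Dep)  # FALSE: a factor carries levels and a class
dep_is_factor <- is.factor(toy$Dep)  # TRUE

# class() per column is often a more informative check than is.vector()
col_classes <- sapply(toy, class)
print(col_classes)
```

For checking variable types, `is.numeric()`, `is.factor()`, or `class()` are usually the more direct tests.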
# Show the first row
dta[1, ]
## Dep IQ BP
## 1 N 103 4
# Show IQ for the first three rows
dta[1:3, "IQ"]
## [1] 103 124 124
# Show the six rows with the highest BP (rows sorted by ascending BP)
tail(dta[order(dta$BP), ])
## Dep IQ BP
## 16 N 89 11
## 58 N 117 11
## 66 N 126 11
## 2 N 124 12
## 73 D 99 13
## 12 D 22 17
# Show the four rows with the lowest BP (rows sorted by descending BP)
tail(dta[order(-dta$BP), ], 4)
## Dep IQ BP
## 77 N 124 1
## 80 N 121 1
## 24 N 106 0
## 75 N 122 0
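The two sorting calls above rely on `order()` returning row indices, with `tail()` then taking rows from the end of the sorted data frame. A minimal sketch on made-up data (`toy` is hypothetical) shows that taking the tail of a descending sort selects the same rows as taking the head of an ascending sort, only in reverse order.

```r
# Hypothetical toy data with distinct BP values
toy <- data.frame(id = 1:6, BP = c(4, 12, 9, 3, 0, 17))

# Descending sort, then the last two rows: the two lowest BP values
lowest_a <- tail(toy[order(-toy$BP), ], 2)

# Ascending sort, then the first two rows: the same two rows, reversed
lowest_b <- head(toy[order(toy$BP), ], 2)

# order(x, decreasing = TRUE) is an alternative to negating the variable
lowest_c <- tail(toy[order(toy$BP, decreasing = TRUE), ], 2)

print(lowest_a)
print(lowest_b)
```

When BP contains ties, the order of the tied rows can differ between the approaches, so the selected rows may not match exactly.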
Next, plot the data.
# Histogram of IQ
with(dta, hist(IQ, xlab = "IQ", main = ""))
# Boxplot of behavior problem score by depression status
boxplot(BP ~ Dep, data = dta,
        xlab = "Depression",
        ylab = "Behavior problem score")
# Scatter plot of IQ against behavior problem score, colored by depression status
plot(IQ ~ BP, data = dta, pch = 20, col = dta$Dep,
     xlab = "Behavior problem score", ylab = "IQ")
grid()
# Separate regression lines of BP on IQ for each depression group
# (solid line: Dep == "D"; dashed line: Dep == "N")
plot(BP ~ IQ, data = dta, type = "n",
     ylab = "Behavior problem score", xlab = "IQ")
text(dta$IQ, dta$BP, labels = dta$Dep, cex = 0.5)
abline(lm(BP ~ IQ, data = dta, subset = Dep == "D"))
abline(lm(BP ~ IQ, data = dta, subset = Dep == "N"), lty = 2)
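The two `abline()` calls draw one fitted line per depression group. As a rough illustration of what those lines represent, the sketch below fits the two subset models on simulated data and shows that a single model with an IQ by Dep interaction implies the same intercepts and slopes. The data frame `toy`, its sample size, and its coefficients are made up for illustration, and the interaction model is an addition here, not part of the original analysis.

```r
# Simulated (hypothetical) data, not the IQ_Beh data
set.seed(1)
n <- 40
toy <- data.frame(
  Dep = factor(rep(c("D", "N"), each = n / 2)),
  IQ  = round(rnorm(n, mean = 100, sd = 15))
)
toy$BP <- 12 - 0.06 * toy$IQ + rnorm(n, sd = 2)

# Separate fits, as drawn by the two abline() calls
fit_D <- lm(BP ~ IQ, data = toy, subset = Dep == "D")
fit_N <- lm(BP ~ IQ, data = toy, subset = Dep == "N")

# One model with an interaction; "D" is the reference level
fit_int <- lm(BP ~ IQ * Dep, data = toy)
b <- coef(fit_int)

# Intercept and slope implied for each group by the interaction model
line_D <- c(b[["(Intercept)"]], b[["IQ"]])
line_N <- c(b[["(Intercept)"]] + b[["DepN"]], b[["IQ"]] + b[["IQ:DepN"]])

print(rbind(D = line_D, N = line_N))
```

In the interaction model, the `IQ:DepN` coefficient is the difference between the two slopes, so its test indicates whether the lines are parallel.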
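(1) Question 1 asks whether students differ in IQ and BP across depression status (Dep).
# Run a preliminary MANOVA with BP and IQ as joint responses
options(digits = 4, show.signif.stars = FALSE)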
summary(manova(cbind(BP, IQ) ~ Dep, dta), test = "Wilks")
## Df Wilks approx F num Df den Df Pr(>F)
## Dep 1 0.928 3.53 2 91 0.033
## Residuals 92
# The Dep groups differ significantly on IQ and BP taken jointly
# (Wilks' lambda = 0.928, F(2, 91) = 3.53, p = 0.033)
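The approximate F in the MANOVA table can be recovered from Wilks' lambda by hand. With a single hypothesis degree of freedom (two groups), the transformation F = ((1 - lambda) / lambda) * (den Df / num Df) is exact. The sketch below uses the rounded values printed in the table, so the result agrees only to the printed precision.

```r
# Values copied from the printed MANOVA table (rounded)
wilks  <- 0.928
num_df <- 2    # number of response variables (BP, IQ)
den_df <- 91   # 94 observations - 2 groups - 2 responses + 1

# F transformation of Wilks' lambda for one hypothesis degree of freedom
f_stat <- (1 - wilks) / wilks * den_df / num_df

# Upper-tail probability of the F distribution
p_val <- pf(f_stat, num_df, den_df, lower.tail = FALSE)

round(f_stat, 2)  # 3.53, matching "approx F"
round(p_val, 3)   # 0.033, matching "Pr(>F)"
```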
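(2) Question 2 asks for evidence on the relationship between IQ and behavior problems (BP).
# Fit a linear model of BP on IQ and examine its coefficients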
summary(lm(BP ~ IQ, data = dta))
##
## Call:
## lm(formula = BP ~ IQ, data = dta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.983 -2.356 -0.411 2.121 7.240
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.1828 2.0018 6.59 2.8e-09
## IQ -0.0679 0.0178 -3.81 0.00025
##
## Residual standard error: 2.98 on 92 degrees of freedom
## Multiple R-squared: 0.136, Adjusted R-squared: 0.127
## F-statistic: 14.5 on 1 and 92 DF, p-value: 0.000252
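# IQ negatively predicts BP: each one-point increase in IQ is associated with
# a decrease of about 0.068 in BP (slope = -0.0679, p = 0.00025)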
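To make the slope concrete, the fitted line can be evaluated by hand from the printed coefficients. The sketch below copies the rounded estimates from the summary table; `predict_bp` is a hypothetical helper name introduced only for this illustration, and the results are approximate because the inputs are rounded.

```r
# Estimates copied from the printed coefficient table (rounded)
b0    <- 13.1828   # intercept
b1    <- -0.0679   # slope for IQ
se_b1 <- 0.0178    # standard error of the slope

# Hypothetical helper: predicted BP for a given IQ under the fitted line
predict_bp <- function(iq) b0 + b1 * iq

pred_100 <- predict_bp(100)                    # 13.1828 - 6.79 = 6.3928
diff_10  <- predict_bp(110) - predict_bp(100)  # -0.679 for a 10-point IQ increase
t_value  <- b1 / se_b1                         # about -3.81, as in the table

print(c(pred_100 = pred_100, diff_10 = diff_10, t_value = t_value))
```

With the fitted model object available, `predict(fit, newdata = data.frame(IQ = 100))` would give the same prediction without rounding.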