HW Exercise 0316-3
The IQ and behavior problem page has a dataset and a script of R code chunks. Generate a markdown file from the script to push the output in HTML for posting to course Moodle site. Explain what the code chunks do with your comments in the markdown file.
The IQ scores and behavioral problem scores of children at age 5 were examined depending on whether or not their mothers had suffered an episode of post-natal depression. The main questions of interest were: (1) Did the two groups of children have different IQ and/or behavioral problems? (2) Was there any evidence of a relationship between IQ and behavioral problems?
Loading the dataset
Check the data structure
Confirm the type of each variable in the data.
## 'data.frame': 94 obs. of 3 variables:
## $ Dep: Factor w/ 2 levels "D","N": 2 2 2 2 1 2 2 2 2 2 ...
## $ IQ : int 103 124 124 104 96 92 124 99 92 116 ...
## $ BP : int 4 12 9 3 3 3 6 4 3 9 ...
Preview the first six rows of the data.
## Dep IQ BP
## 1 N 103 4
## 2 N 124 12
## 3 N 124 9
## 4 N 104 3
## 5 D 96 3
## 6 N 92 3
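The loading and inspection steps above could look like the following. This is only a minimal sketch: the data file name is not shown in this report, so the example reads the six previewed rows from an inline string instead. With the real file, the text argument would be replaced by the file path.

```r
# Assumption: the actual file name is not given in this report, so the
# six rows from the preview above are read from an inline string.
dta <- read.table(text = "Dep IQ BP
N 103 4
N 124 12
N 124 9
N 104 3
D 96 3
N 92 3", header = TRUE, stringsAsFactors = TRUE)

str(dta)   # variable types: Dep is a factor, IQ and BP are integers
head(dta)  # first six rows (here, the whole excerpt)
```

On the full dataset, str() reports 94 observations rather than the 6 used here.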
The class of the data object is data frame.
## [1] "data.frame"
Check the dimensions of the data.
## [1] 94 3
The data have 94 rows and 3 columns.
Check the variable names in the data.
## [1] "Dep" "IQ" "BP"
Check whether the BP variable in the data is a vector.
## [1] TRUE
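The four checks above (class, dimensions, variable names, and whether BP is a vector) could be reproduced along these lines. The data frame here is an illustrative excerpt built from the six rows previewed earlier, not the full 94-row dataset.

```r
# Illustrative excerpt: the six rows shown in the preview above
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N")),
  IQ  = c(103L, 124L, 124L, 104L, 96L, 92L),
  BP  = c(4L, 12L, 9L, 3L, 3L, 3L)
)

class(dta)          # class of the object
dim(dta)            # number of rows, then number of columns
names(dta)          # variable names
is.vector(dta$BP)   # TRUE: an integer column carries no extra attributes
is.vector(dta$Dep)  # FALSE: a factor carries levels and class attributes
```

The contrast between BP and Dep shows why is.vector() is a check on attributes rather than on "being a column".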
View the first row of the data.
## Dep IQ BP
## 1 N 103 4
View the first three values of the IQ variable.
## [1] 103 124 124
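Row and element indexing of this kind might be written as follows. This is a sketch on the six previewed rows, and the exact indexing expressions used in the original script are an assumption.

```r
# Illustrative excerpt: the six rows shown in the preview above
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N")),
  IQ  = c(103L, 124L, 124L, 104L, 96L, 92L),
  BP  = c(4L, 12L, 9L, 3L, 3L, 3L)
)

first_row <- dta[1, ]         # row 1, all columns (still a data frame)
first_iq  <- dta[1:3, "IQ"]   # rows 1 to 3 of the IQ column (a plain vector)

first_row
first_iq
dta$IQ[1:3]                   # equivalent way to get the same three values
```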
Sort the data by BP in ascending order and view the last six rows.
## Dep IQ BP
## 16 N 89 11
## 58 N 117 11
## 66 N 126 11
## 2 N 124 12
## 73 D 99 13
## 12 D 22 17
Sort the data by BP in descending order and view the last four rows.
## Dep IQ BP
## 77 N 124 1
## 80 N 121 1
## 24 N 106 0
## 75 N 122 0
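The two sorted views above can be obtained by combining order() with tail(). The sketch below uses the six previewed rows only, so its output will not match the full-data listings; the exact calls in the original script are an assumption.

```r
# Illustrative excerpt: the six rows shown in the preview above
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N")),
  IQ  = c(103L, 124L, 124L, 104L, 96L, 92L),
  BP  = c(4L, 12L, 9L, 3L, 3L, 3L)
)

# order() returns the row indices that sort BP; indexing with them sorts the rows
asc  <- dta[order(dta$BP), ]                     # ascending BP
desc <- dta[order(dta$BP, decreasing = TRUE), ]  # descending BP

tail(asc)      # last six rows: the highest BP scores
tail(desc, 4)  # last four rows: the lowest BP scores
```

The original row numbers are kept as row names after sorting, which is why the listings above show labels such as 73 and 12.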
Data visualization
Draw a histogram of the IQ variable, with the x-axis labeled IQ and no main title.
Draw a box plot of BP grouped by Dep, with the x-axis labeled Depression and the y-axis labeled Behavior problem score.
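The histogram and box plot described above could be drawn with base graphics as sketched below. The data frame is the six-row excerpt from the preview, so the pictures will look much sparser than with the full data, and the exact arguments in the original script are an assumption.

```r
# Illustrative excerpt: the six rows shown in the preview above
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N")),
  IQ  = c(103L, 124L, 124L, 104L, 96L, 92L),
  BP  = c(4L, 12L, 9L, 3L, 3L, 3L)
)

# Histogram of IQ; main = "" suppresses the default title
hist(dta$IQ, xlab = "IQ", main = "")

# Box plot of BP for each level of Dep (D = depressed, N = not depressed)
boxplot(BP ~ Dep, data = dta,
        xlab = "Depression", ylab = "Behavior problem score")
```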
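Draw a scatter plot of IQ against BP (BP on the x-axis), with points colored by Dep, the x-axis labeled Behavior problem score, the y-axis labeled IQ, and grid lines added inside the plot frame.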
plot(IQ ~ BP, data = dta, pch = 20, col = dta$Dep,
xlab = "Behavior problem score", ylab = "IQ")
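grid()
Draw a scatter plot of BP against IQ. type = "n" sets up the axes without plotting any points; the y-axis is labeled Behavior problem score and the x-axis IQ. text() then prints each child's Dep label at its (IQ, BP) position. abline() adds the regression line of BP (y) on IQ (x), fitted separately for each Dep group; the line for the Dep == "N" group is dashed (lty = 2).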
plot(BP ~ IQ, data = dta, type = "n",
ylab = "Behavior problem score", xlab = "IQ")
text(dta$IQ, dta$BP, labels = dta$Dep, cex = 0.5)
abline(lm(BP ~ IQ, data = dta, subset = Dep == "D"))
abline(lm(BP ~ IQ, data = dta, subset = Dep == "N"), lty = 2)
- Did the two groups of children have different IQ and/or behavioral problems?
##
## Welch Two Sample t-test
##
## data: IQ by Dep
## t = -1.6374, df = 15.53, p-value = 0.1216
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -26.926586 3.490299
## sample estimates:
## mean in group D mean in group N
## 101.0667 112.7848
##
## Welch Two Sample t-test
##
## data: BP by Dep
## t = 1.4924, df = 17.14, p-value = 0.1538
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.6637017 3.8788916
## sample estimates:
## mean in group D mean in group N
## 7.000000 5.392405
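Both t-tests give p-values greater than 0.05, so the null hypothesis cannot be rejected: the two groups of children show no significant difference in IQ or in behavioral problems.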
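The two Welch tests above correspond to calls of the following form. This sketch runs on the 15 distinct rows that happen to be printed elsewhere in this report (the preview plus the sorted listings), which are mostly extreme BP values, so its numbers will differ from the full-data results and should not be interpreted.

```r
# Illustrative excerpt: the 15 distinct rows printed in this report
# (original rows 1-6, 12, 16, 24, 58, 66, 73, 75, 77, 80)
dta <- data.frame(
  Dep = factor(c("N", "N", "N", "N", "D", "N", "D", "N",
                 "N", "N", "N", "D", "N", "N", "N")),
  IQ  = c(103, 124, 124, 104, 96, 92, 22, 89,
          106, 117, 126, 99, 122, 124, 121),
  BP  = c(4, 12, 9, 3, 3, 3, 17, 11,
          0, 11, 11, 13, 0, 1, 1)
)

# t.test() uses the Welch (unequal variance) version by default
iq_test <- t.test(IQ ~ Dep, data = dta)
bp_test <- t.test(BP ~ Dep, data = dta)

iq_test
bp_test
```

Setting var.equal = TRUE would give the pooled-variance test instead; the fractional degrees of freedom in the output above indicate that the default Welch version was used.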
- Was there any evidence of a relationship between IQ and behavioral problems?
##
## Pearson's product-moment correlation
##
## data: dta$IQ and dta$BP
## t = -3.8088, df = 92, p-value = 0.0002518
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.5319037 -0.1798969
## sample estimates:
## cor
## -0.3690615
IQ is significantly negatively correlated with behavioral problems (r = -0.37, p < 0.001).
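The correlation test above corresponds to a cor.test() call on the two columns. The sketch below runs it on the 15 distinct rows printed elsewhere in this report, so the estimate will differ from the full-data value of -0.369.

```r
# Illustrative excerpt: the 15 distinct rows printed in this report
dta <- data.frame(
  IQ = c(103, 124, 124, 104, 96, 92, 22, 89,
         106, 117, 126, 99, 122, 124, 121),
  BP = c(4, 12, 9, 3, 3, 3, 17, 11,
         0, 11, 11, 13, 0, 1, 1)
)

# Pearson correlation with a t-based test of H0: correlation = 0
ct <- cor.test(dta$IQ, dta$BP)
ct

ct$parameter  # degrees of freedom: number of rows minus 2
```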
##
## Call:
## lm(formula = BP ~ IQ, data = dta)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5.9828 -2.3564 -0.4111 2.1210 7.2399
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 13.18280 2.00180 6.585 2.76e-09 ***
## IQ -0.06792 0.01783 -3.809 0.000252 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.983 on 92 degrees of freedom
## Multiple R-squared: 0.1362, Adjusted R-squared: 0.1268
## F-statistic: 14.51 on 1 and 92 DF, p-value: 0.0002518
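IQ is significantly associated with behavioral problems: each additional IQ point corresponds to an expected behavior problem score about 0.068 points lower, and IQ accounts for about 14% of the variance in BP.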
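Because this regression has a single predictor, it carries the same information as the correlation test reported earlier. A small check using only values copied from the output in this report illustrates the link: the squared correlation reproduces Multiple R-squared, and the t statistic follows from r and the 92 degrees of freedom.

```r
# Values copied from the output in this report
r  <- -0.3690615  # Pearson correlation between IQ and BP
df <- 92          # degrees of freedom (94 observations minus 2)

r_squared <- r^2
t_stat    <- r * sqrt(df) / sqrt(1 - r^2)

round(r_squared, 4)  # compare with Multiple R-squared: 0.1362
round(t_stat, 4)     # compare with the t value for IQ: -3.809
```

This is why the p-value for the IQ slope, the p-value of the F-statistic, and the p-value of the correlation test are all 0.0002518.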