Calculate means and standard deviations
help(CO2)
summary(CO2)
##      Plant             Type         Treatment       conc
##  Qn1    : 7   Quebec     :42   nonchilled:42   Min.   :  95
##  Qn2    : 7   Mississippi:42   chilled   :42   1st Qu.: 175
##  Qn3    : 7                                    Median : 350
##  Qc1    : 7                                    Mean   : 435
##  Qc3    : 7                                    3rd Qu.: 675
##  Qc2    : 7                                    Max.   :1000
##  (Other):42
##      uptake
##  Min.   : 7.70
##  1st Qu.:17.90
##  Median :28.30
##  Mean   :27.21
##  3rd Qu.:37.12
##  Max.   :45.50
##
colnames(CO2)
## [1] "Plant" "Type" "Treatment" "conc" "uptake"
# One subset per Type x Treatment combination
m1 <- subset(CO2, Type == "Quebec" & Treatment == "nonchilled")
m2 <- subset(CO2, Type == "Quebec" & Treatment == "chilled")
m3 <- subset(CO2, Type == "Mississippi" & Treatment == "nonchilled")
m4 <- subset(CO2, Type == "Mississippi" & Treatment == "chilled")
mean(m1$conc)
## [1] 435
mean(m2$conc)
## [1] 435
mean(m3$conc)
## [1] 435
mean(m4$conc)
## [1] 435
mean(m1$uptake)
## [1] 35.33333
mean(m2$uptake)
## [1] 31.75238
mean(m3$uptake)
## [1] 25.95238
mean(m4$uptake)
## [1] 15.81429
sd(m1$conc)
## [1] 301.4216
sd(m2$conc)
## [1] 301.4216
sd(m3$conc)
## [1] 301.4216
sd(m4$conc)
## [1] 301.4216
sd(m1$uptake)
## [1] 9.596371
sd(m2$uptake)
## [1] 9.644823
sd(m3$uptake)
## [1] 7.402136
sd(m4$uptake)
## [1] 4.058976
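The per-group calls above can also be collapsed into a single call. The following is a minimal sketch, assuming the built-in CO2 data frame; the name group_stats is introduced here only for illustration.

```r
# Mean and SD of uptake for each Type x Treatment combination
group_stats <- aggregate(
  uptake ~ Type + Treatment,
  data = CO2,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)
group_stats
```

Each row of group_stats corresponds to one of the four subsets m1 to m4, so the values can be checked against the individual mean() and sd() results.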
Perform one-way ANOVA tests of Type and Treatment on uptake. Because the mean of conc is the same (435) in all four groups, it is meaningless to test it.
trt=lm(uptake~Treatment, CO2)
anova(trt)
## Analysis of Variance Table
##
## Response: uptake
##           Df Sum Sq Mean Sq F value   Pr(>F)
## Treatment  1  988.1  988.11  9.2931 0.003096 **
## Residuals 82 8718.9  106.33
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
type=lm(uptake~Type,data=CO2)
anova(type)
## Analysis of Variance Table
##
## Response: uptake
##           Df Sum Sq Mean Sq F value    Pr(>F)
## Type       1 3365.5  3365.5  43.519 3.835e-09 ***
## Residuals 82 6341.4    77.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since both p-values are below 0.05, we reject the null hypothesis in each test, which means that both Type and Treatment make a difference in uptake.
Perform a two-way ANOVA test
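With only two levels per factor, a one-way ANOVA is equivalent to a pooled-variance two-sample t-test: the squared t statistic equals the F value. A short sketch of that check for Treatment, assuming the built-in CO2 data; the name tt_trt is illustrative.

```r
# Pooled-variance t-test of uptake between the two Treatment levels
tt_trt <- t.test(uptake ~ Treatment, data = CO2, var.equal = TRUE)

# Squared t statistic; should match the F value in the Treatment ANOVA table
unname(tt_trt$statistic)^2

# p-value; should match Pr(>F) in the Treatment ANOVA table
tt_trt$p.value
```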
tt=lm(uptake~Type*Treatment,data=CO2)
anova(tt)
## Analysis of Variance Table
##
## Response: uptake
##                Df Sum Sq Mean Sq F value    Pr(>F)
## Type            1 3365.5  3365.5 52.5086 2.378e-10 ***
## Treatment       1  988.1   988.1 15.4164 0.0001817 ***
## Type:Treatment  1  225.7   225.7  3.5218 0.0642128 .
## Residuals      80 5127.6    64.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We conclude that both Type and Treatment make a difference in uptake individually. However, the interaction term is not significant at the 0.05 level (p = 0.0642), so we cannot conclude that the interaction between Type and Treatment makes a difference in uptake.
Let's take a look at the frequency counts for different combinations of variables in mtcars
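One way to look at the interaction more directly is to inspect the four cell means and to compare the additive model against the model with the interaction term. This is a sketch under the assumption that the built-in CO2 data is used; cell_means, additive, full and cmp are illustrative names.

```r
# Mean uptake for each Type x Treatment cell
cell_means <- with(CO2, tapply(uptake, list(Type, Treatment), mean))
cell_means

# Additive model vs. model that also includes the interaction
additive <- lm(uptake ~ Type + Treatment, data = CO2)
full     <- lm(uptake ~ Type * Treatment, data = CO2)

# F test for the single extra term (Type:Treatment)
cmp <- anova(additive, full)
cmp
```

The F test in cmp assesses the same Type:Treatment term as the last row of the two-way table, so its p-value should agree with the one reported there.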
carvsam<- table(mtcars$vs,mtcars$am)
carvsam
##
##      0  1
##   0 12  6
##   1  7  7
cargc <-table(mtcars$gear,mtcars$carb)
cargc
##
##      1  2  3  4  6  8
##   3  3  4  3  5  0  0
##   4  4  4  0  4  0  0
##   5  0  2  0  1  1  1
carcg<- table(mtcars$cyl, mtcars$gear)
carcg
##
##       3  4  5
##   4   1  8  2
##   6   2  4  1
##   8  12  0  2
Our null hypothesis is that the two variables in each table are independent of each other
Perform the chi-square test
chisq.test(carvsam)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: carvsam
## X-squared = 0.34754, df = 1, p-value = 0.5555
chisq.test(cargc)
## Warning in chisq.test(cargc): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: cargc
## X-squared = 16.518, df = 10, p-value = 0.08573
chisq.test(carcg)
## Warning in chisq.test(carcg): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: carcg
## X-squared = 18.036, df = 4, p-value = 0.001214
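As a result, we fail to reject independence for vs and am (p = 0.5555) and for gear and carb (p = 0.08573), although this does not prove that they are independent. cyl and gear appear to be dependent (p = 0.001214), but we cannot draw that conclusion with enough confidence because the sample size is too small: R warns that the chi-squared approximation may be incorrect for the gear/carb and cyl/gear tables.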
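The warning from chisq.test() usually indicates small expected cell counts. As a possible follow-up, and not part of the analysis above, the sketch below inspects the expected counts for the cyl/gear table and runs Fisher's exact test, which does not rely on the chi-squared approximation. The names expected and exact are illustrative.

```r
# Rebuild the cyl x gear table so the block is self-contained
carcg <- table(mtcars$cyl, mtcars$gear)

# Expected counts under independence; small values explain the warning
expected <- suppressWarnings(chisq.test(carcg))$expected
expected

# Fisher's exact test as a small-sample alternative
exact <- fisher.test(carcg)
exact$p.value
```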