The assignment consists of two parts. Create an R Markdown script (.Rmd) and generate an HTML output for the code and text. Please remember to use code chunks, text, and other components of reproducible research as required.
# To get definitions of the columns, type help(CO2)
str(CO2)
## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 84 obs. of 5 variables:
## $ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
## $ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
## $ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
## $ conc : num 95 175 250 350 500 675 1000 95 175 250 ...
## $ uptake : num 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
## - attr(*, "formula")=Class 'formula' length 3 uptake ~ conc | Plant
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "outer")=Class 'formula' length 2 ~Treatment * Type
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "labels")=List of 2
## ..$ x: chr "Ambient carbon dioxide concentration"
## ..$ y: chr "CO2 uptake rate"
## - attr(*, "units")=List of 2
## ..$ x: chr "(uL/L)"
## ..$ y: chr "(umol/m^2 s)"
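library(dplyr)  # provides %>%, group_by(), summarise(), ungroup()
# Mean and standard deviation of conc and uptake for each Type x Treatment group
CO2_summary = CO2 %>% ungroup() %>% group_by(Type, Treatment) %>%
  summarise(mean_conc = mean(conc),
            std_conc = sd(conc),
            mean_uptake = mean(uptake),
            std_uptake = sd(uptake)) %>% ungroup()
CO2_summary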
## Source: local data frame [4 x 6]
##
## Type Treatment mean_conc std_conc mean_uptake std_uptake
## (fctr) (fctr) (dbl) (dbl) (dbl) (dbl)
## 1 Quebec nonchilled 435 301.4216 35.33333 9.596371
## 2 Quebec chilled 435 301.4216 31.75238 9.644823
## 3 Mississippi nonchilled 435 301.4216 25.95238 7.402136
## 4 Mississippi chilled 435 301.4216 15.81429 4.058976
# Perform one-way ANOVA on uptake (for a two-level factor this matches an equal-variance two-sample t test)
fit_Type=lm(uptake~Type, data=CO2)
anova(fit_Type)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 1 3365.5 3365.5 43.519 3.835e-09 ***
## Residuals 82 6341.4 77.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
fit_Treatment=lm(uptake~Treatment, data=CO2)
anova(fit_Treatment)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Treatment 1 988.1 988.11 9.2931 0.003096 **
## Residuals 82 8718.9 106.33
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the test for Type, the p-value is 3.8346861 × 10^{-9}. Therefore, we reject the null hypothesis of equal mean uptake and conclude that the origin of the plant made a difference to the uptake rates.
In the test for Treatment, the p-value is 0.0030957. Therefore, we reject the null hypothesis of equal mean uptake and conclude that the treatment type made a difference to the uptake rates.
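Because Type has only two levels, the one-way ANOVA on Type can be cross-checked against an equal-variance two-sample t test: the F statistic should equal the squared t statistic, and the p-values should agree. The sketch below is only an illustration of that check; the names `co2_df`, `tt_type`, `aov_type`, `t_squared`, and `f_value` are introduced here and are not part of the original analysis.

```r
# CO2 ships with R (datasets package); keep only the plain data.frame class.
co2_df <- as.data.frame(CO2)

# Equal-variance two-sample t test of uptake between the two plant origins.
tt_type <- t.test(uptake ~ Type, data = co2_df, var.equal = TRUE)

# One-way ANOVA on the same factor, as in the analysis above.
aov_type <- anova(lm(uptake ~ Type, data = co2_df))

# For a two-level factor, F = t^2 and the two p-values coincide.
t_squared <- unname(tt_type$statistic)^2
f_value   <- aov_type[["F value"]][1]
all.equal(t_squared, f_value)
all.equal(tt_type$p.value, aov_type[["Pr(>F)"]][1])
```

Both comparisons should return TRUE up to numerical tolerance. The same check applies to Treatment by swapping the factor in both formulas.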
fit_Type_Treatment=lm(uptake~Type*Treatment, data=CO2)
anova(fit_Type_Treatment)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 1 3365.5 3365.5 52.5086 2.378e-10 ***
## Treatment 1 988.1 988.1 15.4164 0.0001817 ***
## Type:Treatment 1 225.7 225.7 3.5218 0.0642128 .
## Residuals 80 5127.6 64.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# A
tb_vs_am=with(mtcars, table(vs,am))
# B
tb_gear_carb=with(mtcars, table(gear, carb))
# C
tb_cyl_gear=with(mtcars, table(cyl, gear))
tb_vs_am
## am
## vs 0 1
## 0 12 6
## 1 7 7
tb_gear_carb
## carb
## gear 1 2 3 4 6 8
## 3 3 4 3 5 0 0
## 4 4 4 0 4 0 0
## 5 0 2 0 1 1 1
tb_cyl_gear
## gear
## cyl 3 4 5
## 4 1 8 2
## 6 2 4 1
## 8 12 0 2
My guess is that none of these pairs of variables are dependent on each other.
chisq.test(tb_vs_am)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: tb_vs_am
## X-squared = 0.3475, df = 1, p-value = 0.5555
chisq.test(tb_gear_carb)
## Warning in chisq.test(tb_gear_carb): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: tb_gear_carb
## X-squared = 16.5181, df = 10, p-value = 0.08573
chisq.test(tb_cyl_gear)
## Warning in chisq.test(tb_cyl_gear): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: tb_cyl_gear
## X-squared = 18.0364, df = 4, p-value = 0.001214
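The two warnings above are issued because some expected cell counts are small, which makes the chi-squared approximation unreliable. One possible follow-up, sketched here as an assumption rather than as part of the original analysis, is to inspect the expected counts and to request a Monte Carlo p-value through the `simulate.p.value` argument of `chisq.test()`. The seed and the number of replicates `B` are arbitrary choices, and the names `expected_gear_carb`, `sim_gear_carb`, and `sim_cyl_gear` are introduced only for this illustration.

```r
# Rebuild the tables from the built-in mtcars data so the sketch is self-contained.
tb_gear_carb <- with(mtcars, table(gear, carb))
tb_cyl_gear  <- with(mtcars, table(cyl, gear))

# Expected counts under independence; small values trigger the warning.
expected_gear_carb <- suppressWarnings(chisq.test(tb_gear_carb))$expected
sum(expected_gear_carb < 5)  # number of cells with expected count below 5

# Monte Carlo p-values do not rely on the large-sample chi-squared approximation.
set.seed(1)  # arbitrary seed, used only so the simulation is reproducible
sim_gear_carb <- chisq.test(tb_gear_carb, simulate.p.value = TRUE, B = 10000)
sim_cyl_gear  <- chisq.test(tb_cyl_gear,  simulate.p.value = TRUE, B = 10000)
sim_gear_carb$p.value
sim_cyl_gear$p.value
```

The simulated p-values will differ slightly from run to run unless the seed is fixed, so they are best read alongside the asymptotic results rather than as replacements for them.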