The assignment consists of two parts. Create an R Markdown script (.Rmd) and generate an HTML output containing the code and text. Use code chunks, text, and the other components of reproducible research as required.
str(CO2)
## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 84 obs. of 5 variables:
## $ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
## $ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
## $ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
## $ conc : num 95 175 250 350 500 675 1000 95 175 250 ...
## $ uptake : num 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
## - attr(*, "formula")=Class 'formula' language uptake ~ conc | Plant
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "outer")=Class 'formula' language ~Treatment * Type
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "labels")=List of 2
## ..$ x: chr "Ambient carbon dioxide concentration"
## ..$ y: chr "CO2 uptake rate"
## - attr(*, "units")=List of 2
## ..$ x: chr "(uL/L)"
## ..$ y: chr "(umol/m^2 s)"
summary(CO2)
## Plant Type Treatment conc
## Qn1 : 7 Quebec :42 nonchilled:42 Min. : 95
## Qn2 : 7 Mississippi:42 chilled :42 1st Qu.: 175
## Qn3 : 7 Median : 350
## Qc1 : 7 Mean : 435
## Qc3 : 7 3rd Qu.: 675
## Qc2 : 7 Max. :1000
## (Other):42
## uptake
## Min. : 7.70
## 1st Qu.:17.90
## Median :28.30
## Mean :27.21
## 3rd Qu.:37.12
## Max. :45.50
##
colnames(CO2)
## [1] "Plant" "Type" "Treatment" "conc" "uptake"
aggregate(CO2[, "uptake"], list(CO2[, "Type"], CO2[, "Treatment"]), mean)
## Group.1 Group.2 x
## 1 Quebec nonchilled 35.33333
## 2 Mississippi nonchilled 25.95238
## 3 Quebec chilled 31.75238
## 4 Mississippi chilled 15.81429
aggregate(CO2[, "uptake"], list(CO2[, "Type"], CO2[, "Treatment"]), sd)
## Group.1 Group.2 x
## 1 Quebec nonchilled 9.596371
## 2 Mississippi nonchilled 7.402136
## 3 Quebec chilled 9.644823
## 4 Mississippi chilled 4.058976
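The two `aggregate()` calls above return generic column names (`Group.1`, `Group.2`, `x`). As a minimal sketch, the formula interface of `aggregate()` keeps the original variable names, and the two results can be merged into one table. The object names `group_means`, `group_sds`, and `group_summary` are illustrative and not part of the original script.

```r
# Formula interface: grouping columns keep their names (Type, Treatment)
group_means <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = mean)
group_sds   <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = sd)

# Combine mean and sd into one table; suffixes distinguish the two uptake columns
group_summary <- merge(group_means, group_sds,
                       by = c("Type", "Treatment"),
                       suffixes = c("_mean", "_sd"))
group_summary
```

This gives one row per Type and Treatment combination with `uptake_mean` and `uptake_sd` side by side, which is easier to read than two separate tables.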
fit_Type=lm(uptake~Type, data=CO2)
anova(fit_Type)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 1 3365.5 3365.5 43.519 3.835e-09 ***
## Residuals 82 6341.4 77.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
fit_Treatment=lm(uptake~Treatment, data=CO2)
anova(fit_Treatment)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Treatment 1 988.1 988.11 9.2931 0.003096 **
## Residuals 82 8718.9 106.33
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
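The p-values in the two ANOVA tables above can also be pulled out of the `anova()` result directly, which avoids copying them by hand from scientific notation. This is a small sketch; the names `p_Type`, `p_Treatment`, and `alpha` are illustrative, and the 0.05 significance level is an assumption.

```r
fit_Type      <- lm(uptake ~ Type, data = CO2)
fit_Treatment <- lm(uptake ~ Treatment, data = CO2)

# anova() returns a data frame; the p-value is in the "Pr(>F)" column,
# first row (the second row, Residuals, is NA)
p_Type      <- anova(fit_Type)[["Pr(>F)"]][1]
p_Treatment <- anova(fit_Treatment)[["Pr(>F)"]][1]

alpha <- 0.05  # assumed significance level
p_values <- c(Type = p_Type, Treatment = p_Treatment)

format(p_values, digits = 4)  # print in a readable form
p_values < alpha              # TRUE means the null hypothesis is rejected
```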
- The p-value for Type is 3.835e-09, far below 0.05. The null hypothesis is therefore rejected: the uptake rate depends on plant origin.
- In the test for Treatment, the p-value is 0.0030957, so the null hypothesis is rejected: the uptake rate depends on the treatment type.

## 4. Perform a two-way test for Type and Treatment
fit_Type_Treatment=lm(uptake~Type*Treatment, data=CO2)
anova(fit_Type_Treatment)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 1 3365.5 3365.5 52.5086 2.378e-10 ***
## Treatment 1 988.1 988.1 15.4164 0.0001817 ***
## Type:Treatment 1 225.7 225.7 3.5218 0.0642128 .
## Residuals 80 5127.6 64.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
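One way to look at the interaction term on its own is to compare the additive model with the interaction model. The sketch below uses `anova()` on two nested fits; the names `fit_additive`, `fit_interaction`, and `p_interaction` are illustrative and not part of the original script.

```r
# Additive model: Type and Treatment effects only
fit_additive <- lm(uptake ~ Type + Treatment, data = CO2)

# Full model: adds the Type:Treatment interaction
fit_interaction <- lm(uptake ~ Type * Treatment, data = CO2)

# F test for whether the interaction term improves the fit
comparison <- anova(fit_additive, fit_interaction)
comparison

# p-value of the interaction, second row of the comparison table
p_interaction <- comparison[["Pr(>F)"]][2]
p_interaction
```

Because the interaction is the last term in the sequential table above, this comparison tests the same hypothesis as the `Type:Treatment` row.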
- Because the p-values for both Type and Treatment are very low, the null hypothesis is rejected for each: both Type and Treatment affect the uptake rate.
- Because the p-value for the interaction term is 0.064, above 0.05, we fail to reject the null hypothesis and conclude that there is no significant interaction between Type and Treatment in their effect on the uptake rate.

### 2. Use the mtcars dataset in R

Use the table() function with the following combinations:

1. The variables vs and am
VsTransmission=with(mtcars, table(vs,am))
VsTransmission
## am
## vs 0 1
## 0 12 6
## 1 7 7
2. The variables gear and carb
GearCarburetors=with(mtcars, table(gear, carb))
GearCarburetors
## carb
## gear 1 2 3 4 6 8
## 3 3 4 3 5 0 0
## 4 4 4 0 4 0 0
## 5 0 2 0 1 1 1
3. The variables cyl and gear
CylindersGear=with(mtcars, table(cyl, gear))
CylindersGear
## gear
## cyl 3 4 5
## 4 1 8 2
## 6 2 4 1
## 8 12 0 2
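Raw counts like those in the tables above can be easier to read with totals and proportions attached. As a brief sketch using the vs and am table, `addmargins()` appends row and column sums and `prop.table()` converts counts to proportions. The names `with_totals` and `row_props` are illustrative.

```r
VsTransmission <- with(mtcars, table(vs, am))

# Append row and column totals (labelled "Sum")
with_totals <- addmargins(VsTransmission)
with_totals

# Proportion of each am value within each level of vs (rows sum to 1)
row_props <- prop.table(VsTransmission, margin = 1)
round(row_props, 3)
```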
chisq.test(VsTransmission)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: VsTransmission
## X-squared = 0.34754, df = 1, p-value = 0.5555
chisq.test(GearCarburetors)
## Warning in chisq.test(GearCarburetors): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: GearCarburetors
## X-squared = 16.518, df = 10, p-value = 0.08573
chisq.test(CylindersGear)
## Warning in chisq.test(CylindersGear): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: CylindersGear
## X-squared = 18.036, df = 4, p-value = 0.001214
- Because the p-values for VsTransmission (0.5555) and GearCarburetors (0.08573) are higher than 0.05, we fail to reject the null hypothesis: there is no evidence that vs and am, or gear and carb, are associated.
- The p-value for the cyl and gear table is 0.0012141, so the null hypothesis is rejected and we conclude that cyl and gear are dependent on each other. However, R warns that the chi-squared approximation may be incorrect for this table and for the gear and carb table, because several cells have small expected counts; these results should be read with caution and could change with a larger sample.
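The "Chi-squared approximation may be incorrect" warnings above are raised when expected cell counts are small. A minimal sketch for the cyl and gear table: inspect the expected counts, then run Fisher's exact test as a cross-check. Fisher's test is not part of the original assignment; it is shown here only as one common alternative for sparse tables, and the names `expected_counts` and `fisher_result` are illustrative.

```r
CylindersGear <- with(mtcars, table(cyl, gear))

# Expected counts under independence; the warning is suppressed here
# because it is exactly what we are investigating
expected_counts <- suppressWarnings(chisq.test(CylindersGear))$expected
round(expected_counts, 2)

# Number of cells with an expected count below 5
sum(expected_counts < 5)

# Fisher's exact test does not rely on the chi-squared approximation
fisher_result <- fisher.test(CylindersGear)
fisher_result$p.value
```

If the exact test agrees with the chi-squared result, the conclusion about cyl and gear rests on firmer ground than the approximate test alone.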