str(CO2)
## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 84 obs. of 5 variables:
## $ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
## $ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
## $ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
## $ conc : num 95 175 250 350 500 675 1000 95 175 250 ...
## $ uptake : num 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
## - attr(*, "formula")=Class 'formula' language uptake ~ conc | Plant
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "outer")=Class 'formula' language ~Treatment * Type
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "labels")=List of 2
## ..$ x: chr "Ambient carbon dioxide concentration"
## ..$ y: chr "CO2 uptake rate"
## - attr(*, "units")=List of 2
## ..$ x: chr "(uL/L)"
## ..$ y: chr "(umol/m^2 s)"
summary(CO2)
## Plant Type Treatment conc
## Qn1 : 7 Quebec :42 nonchilled:42 Min. : 95
## Qn2 : 7 Mississippi:42 chilled :42 1st Qu.: 175
## Qn3 : 7 Median : 350
## Qc1 : 7 Mean : 435
## Qc3 : 7 3rd Qu.: 675
## Qc2 : 7 Max. :1000
## (Other):42
## uptake
## Min. : 7.70
## 1st Qu.:17.90
## Median :28.30
## Mean :27.21
## 3rd Qu.:37.12
## Max. :45.50
##
colnames(CO2)
## [1] "Plant" "Type" "Treatment" "conc" "uptake"
help("CO2")
2. Calculate means and standard deviations for the 4 groups broken down by Type and Treatment
aggregate(CO2[, "uptake"], list(CO2[, "Type"], CO2[, "Treatment"]), mean)
## Group.1 Group.2 x
## 1 Quebec nonchilled 35.33333
## 2 Mississippi nonchilled 25.95238
## 3 Quebec chilled 31.75238
## 4 Mississippi chilled 15.81429
aggregate(CO2[, "uptake"], list(CO2[, "Type"], CO2[, "Treatment"]), sd)
## Group.1 Group.2 x
## 1 Quebec nonchilled 9.596371
## 2 Mississippi nonchilled 7.402136
## 3 Quebec chilled 9.644823
## 4 Mississippi chilled 4.058976
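The two aggregate() calls above can also be combined into one. The following is a minimal sketch, not the required solution: it uses the formula interface of aggregate() so the grouping columns are labelled Type and Treatment instead of Group.1 and Group.2. The name group_stats is chosen here for illustration only.

```r
# Mean and SD of uptake for each Type x Treatment group in a single call.
# as.data.frame() drops the groupedData classes; the values are unchanged.
group_stats <- aggregate(
  uptake ~ Type + Treatment,
  data = as.data.frame(CO2),
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# group_stats$uptake is a two-column matrix with columns "mean" and "sd"
group_stats
```

The rows come out in the same order as in the tables above (Type varies fastest).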
3. Perform one-way tests twice: once for Type and once for Treatment
fit_Type <- lm(uptake ~ Type, data = CO2)
anova(fit_Type)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 1 3365.5 3365.5 43.519 3.835e-09 ***
## Residuals 82 6341.4 77.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
fit_Treatment <- lm(uptake ~ Treatment, data = CO2)
anova(fit_Treatment)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Treatment 1 988.1 988.11 9.2931 0.003096 **
## Residuals 82 8718.9 106.33
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the test for Type, the p-value is 3.835e-09, so the null hypothesis is rejected: uptake rate depends on plant origin.
In the test for Treatment, the p-value is 0.0030957, so the null hypothesis is rejected: uptake rate depends on the treatment.
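To avoid misreading scientific notation in the printed table, the p-values can be pulled out of the anova objects directly. This is a small sketch under the assumption that the models are fitted as above; p_Type, p_Treatment and t_Type are names introduced here for illustration. Because each factor has only two levels, an equal-variance t-test should give the same p-value as the one-way F test, which the last line checks.

```r
# Refit the two one-way models so the block runs on its own
fit_Type <- lm(uptake ~ Type, data = CO2)
fit_Treatment <- lm(uptake ~ Treatment, data = CO2)

# The first row of the anova table holds the factor; the second is Residuals
p_Type <- anova(fit_Type)[["Pr(>F)"]][1]
p_Treatment <- anova(fit_Treatment)[["Pr(>F)"]][1]
c(Type = p_Type, Treatment = p_Treatment)

# Cross-check: pooled-variance t-test on a two-level factor (F = t^2)
t_Type <- t.test(uptake ~ Type, data = CO2, var.equal = TRUE)
all.equal(t_Type$p.value, p_Type, tolerance = 1e-6)
```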
4. Perform a two-way test for Type and Treatment
fit_Type_Treatment <- lm(uptake ~ Type * Treatment, data = CO2)
anova(fit_Type_Treatment)
## Analysis of Variance Table
##
## Response: uptake
## Df Sum Sq Mean Sq F value Pr(>F)
## Type 1 3365.5 3365.5 52.5086 2.378e-10 ***
## Treatment 1 988.1 988.1 15.4164 0.0001817 ***
## Type:Treatment 1 225.7 225.7 3.5218 0.0642128 .
## Residuals 80 5127.6 64.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
As the p-values for both Type and Treatment are very low, the null hypothesis is rejected for each main effect: both Type and Treatment affect the uptake rate. Because the p-value for the interaction term is 0.064, which is above 0.05, we fail to reject the null hypothesis for the interaction and conclude that there is no significant Type-by-Treatment interaction effect on uptake rate.
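The interaction result can be cross-checked by comparing the additive model with the interaction model. This is only a sketch; fit_additive, fit_interaction and comparison are names introduced here. The F test in the second row of the comparison should match the Type:Treatment row of the table above, since the interaction is the only term that differs between the two models.

```r
# Model without and with the Type:Treatment interaction
fit_additive <- lm(uptake ~ Type + Treatment, data = CO2)
fit_interaction <- lm(uptake ~ Type * Treatment, data = CO2)

# F test of whether adding the interaction improves the fit
comparison <- anova(fit_additive, fit_interaction)
comparison
```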
VsTransmission <- with(mtcars, table(vs, am))
VsTransmission
## am
## vs 0 1
## 0 12 6
## 1 7 7
2. The variables gear and carb
GearCarburetors <- with(mtcars, table(gear, carb))
GearCarburetors
## carb
## gear 1 2 3 4 6 8
## 3 3 4 3 5 0 0
## 4 4 4 0 4 0 0
## 5 0 2 0 1 1 1
3. The variables cyl and gear
CylindersGear <- with(mtcars, table(cyl, gear))
CylindersGear
## gear
## cyl 3 4 5
## 4 1 8 2
## 6 2 4 1
## 8 12 0 2
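Raw counts are easier to read alongside totals and row proportions. The sketch below is an optional addition, not part of the assignment: it uses addmargins() and prop.table() on the cyl by gear table to show how the gear distribution differs across cylinder counts.

```r
# Rebuild the table so the block runs on its own
CylindersGear <- with(mtcars, table(cyl, gear))

# Counts with row and column totals added
addmargins(CylindersGear)

# Share of each gear count within every cylinder group (each row sums to 1)
row_props <- prop.table(CylindersGear, margin = 1)
round(row_props, 2)
```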
2. Perform a Chi-Squared analysis on the mtcars dataset for each of the three cases above
chisq.test(VsTransmission)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: VsTransmission
## X-squared = 0.34754, df = 1, p-value = 0.5555
chisq.test(GearCarburetors)
## Warning in chisq.test(GearCarburetors): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: GearCarburetors
## X-squared = 16.518, df = 10, p-value = 0.08573
chisq.test(CylindersGear)
## Warning in chisq.test(CylindersGear): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: CylindersGear
## X-squared = 18.036, df = 4, p-value = 0.001214
As the p-values for VsTransmission (0.5555) and GearCarburetors (0.08573) are both higher than 0.05, we fail to reject the null hypothesis and conclude that there is no evidence of an association between vs and am, or between gear and carb.
The p-value for the cyl and gear table is 0.0012141, so the null hypothesis is rejected and we conclude that cyl and gear are dependent on each other. However, the warning above arises from small cell counts, so this result should be treated with caution; a larger sample would make the chi-squared approximation more reliable.
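The "Chi-squared approximation may be incorrect" warning can be investigated by looking at the expected counts under independence. The sketch below does that for the cyl by gear table and then runs Fisher's exact test as an alternative that does not rely on the large-sample approximation. It is a supplementary check under these assumptions, not a replacement for the analysis above; expected and fisher_result are names introduced here.

```r
# Rebuild the table so the block runs on its own
CylindersGear <- with(mtcars, table(cyl, gear))

# Expected counts under independence; the warning is suppressed on purpose
# because these counts are exactly what it is about
expected <- suppressWarnings(chisq.test(CylindersGear))$expected
round(expected, 2)

# Number of cells with an expected count below 5 (a common rule of thumb)
sum(expected < 5)

# Fisher's exact test does not depend on the chi-squared approximation
fisher_result <- fisher.test(CylindersGear)
fisher_result$p.value
```

The same check can be applied to GearCarburetors, which triggered the same warning.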