Bharat Kulkarni 11/12/2017 ANLY 510 91
Assignment #3
The assignment consists of 2 parts. Create an R Markdown script (.Rmd) and generate HTML output for the code and text. Remember to use code chunks, text, and the other components of reproducible research as required.
1. Use the CO2 dataset in R
   1) To get definitions of the columns, type help(CO2)
   2) Calculate means & standard deviations for 4 groups broken down by Type and Treatment
   3) Perform one-way tests twice: once for Type and once for Treatment
   4) Perform a two-way test for Type and Treatment
2. Use the mtcars dataset in R
   1) Use the table() function with the following combinations
      1. The variables vs and am
      2. The variables gear and carb
      3. The variables cyl and gear
      4. For each of the three cases above, guess what the results of a Chi-Squared analysis will be
      5. Ignore warnings for low values in the cells
   2) Perform a Chi-Squared analysis on the mtcars dataset for each of the three cases above
Question 1: Use the CO2 dataset in R
1) To get definitions of the columns, type help(CO2)
str(CO2)
## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 84 obs. of 5 variables:
## $ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
## $ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
## $ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
## $ conc : num 95 175 250 350 500 675 1000 95 175 250 ...
## $ uptake : num 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
## - attr(*, "formula")=Class 'formula' language uptake ~ conc | Plant
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "outer")=Class 'formula' language ~Treatment * Type
## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
## - attr(*, "labels")=List of 2
## ..$ x: chr "Ambient carbon dioxide concentration"
## ..$ y: chr "CO2 uptake rate"
## - attr(*, "units")=List of 2
## ..$ x: chr "(uL/L)"
## ..$ y: chr "(umol/m^2 s)"
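2) Calculate means & standard deviations for 4 groups broken down by Type and Treatment
## Mean uptake for each Type x Treatment group
stats.mean <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = mean)
colnames(stats.mean) <- c("Type", "Treatment", "uptake.mean") ## Change column names
## Standard deviation of uptake for each Type x Treatment group
stats.sd <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = sd)
colnames(stats.sd) <- c("Type", "Treatment", "uptake.sd") ## Change column names
## Merge the two summaries on their shared Type and Treatment columns
stats <- merge(stats.mean, stats.sd)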
stats
##          Type  Treatment uptake.mean uptake.sd
## 1 Mississippi    chilled    15.81429  4.058976
## 2 Mississippi nonchilled    25.95238  7.402136
## 3      Quebec    chilled    31.75238  9.644823
## 4      Quebec nonchilled    35.33333  9.596371
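The same four-group summary can also be produced with a single aggregate() call. The following is a minimal sketch rather than the method used above; the name stats_alt is a hypothetical one introduced only for this illustration.

```r
# Sketch: mean and SD per Type x Treatment group in one aggregate() call.
# stats_alt is a hypothetical name, not part of the original script.
stats_alt <- aggregate(
  uptake ~ Type + Treatment,
  data = CO2,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# aggregate() returns the two statistics as one matrix column;
# do.call(data.frame, ...) flattens it into uptake.mean and uptake.sd.
stats_alt <- do.call(data.frame, stats_alt)
stats_alt
```

This avoids the separate merge() step, at the cost of the slightly less obvious flattening line.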
3) Perform one-way tests twice: once for Type and once for Treatment
require(car) ## Provides leveneTest()
## Loading required package: car
## Warning: package 'car' was built under R version 3.3.3
leveneTest(uptake ~ Type, data = CO2)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  0.1704 0.6808
##       82
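## Levene's Test result: p-value = .6808 > .05, so we fail to reject the null hypothesis of equal variances across Type. Using var.equal = TRUE in the one-way test is therefore reasonable.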
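To make the Levene statistic less of a black box, here is a rough sketch of the same test computed without the car package. With center = median, the test amounts to a one-way ANOVA on each observation's absolute deviation from its group median. The names abs_dev and levene_by_hand are hypothetical and used only here.

```r
# Sketch: Levene's test (center = median) for uptake by Type, by hand.
# abs_dev and levene_by_hand are illustrative names, not from the original script.

# Absolute deviation of each observation from the median of its Type group
abs_dev <- with(CO2, abs(uptake - ave(uptake, Type, FUN = median)))

# One-way ANOVA on those deviations; the F test for Type is the Levene statistic
levene_by_hand <- anova(lm(abs_dev ~ CO2$Type))
levene_by_hand
```

The F value and p-value in the first row should agree with the leveneTest() output above.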
oneway.test(uptake ~ Type, data = CO2, var.equal = TRUE) ## One-way test for Type
##
## One-way analysis of means
##
## data: uptake and Type
## F = 43.519, num df = 1, denom df = 82, p-value = 3.835e-09
leveneTest(uptake ~ Treatment, data = CO2)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  1.2999 0.2576
##       82
## Levene's Test result: p-value = .2576 > .05, so we fail to reject the null hypothesis of equal variances across Treatment.
oneway.test(uptake ~ Treatment, data = CO2, var.equal = TRUE) ## One-way test for Treatment
##
## One-way analysis of means
##
## data: uptake and Treatment
## F = 9.2931, num df = 1, denom df = 82, p-value = 0.003096
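Because Type and Treatment each have only two levels, each one-way test with var.equal = TRUE is equivalent to a pooled-variance two-sample t-test, with F equal to t squared. The sketch below checks this for Type; f_type and t_type are hypothetical names introduced for the comparison.

```r
# Sketch: with two groups, the equal-variance one-way F equals the squared
# pooled-variance t statistic. f_type and t_type are illustrative names.
f_type <- oneway.test(uptake ~ Type, data = CO2, var.equal = TRUE)
t_type <- t.test(uptake ~ Type, data = CO2, var.equal = TRUE)

# Put the two statistics side by side
c(F = unname(f_type$statistic),
  t_squared = unname(t_type$statistic)^2)

# The p-values agree as well
c(oneway = f_type$p.value, t_test = t_type$p.value)
```

The same comparison works for Treatment by swapping the factor in both formulas.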
4) Perform a two-way test for Type and Treatment
CO2.2wayANOVA <- aov(uptake ~ Type * Treatment, data = CO2) ## Two-way ANOVA with interaction
CO2.2wayANOVA
## Call:
## aov(formula = uptake ~ Type * Treatment, data = CO2)
##
## Terms:
##                     Type Treatment Type:Treatment Residuals
## Sum of Squares  3365.534   988.114        225.730  5127.597
## Deg. of Freedom        1         1              1        80
##
## Residual standard error: 8.005933
## Estimated effects may be unbalanced
summary(CO2.2wayANOVA)
##                Df Sum Sq Mean Sq F value   Pr(>F)
## Type            1   3366    3366  52.509 2.38e-10 ***
## Treatment       1    988     988  15.416 0.000182 ***
## Type:Treatment  1    226     226   3.522 0.064213 .
## Residuals      80   5128      64
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
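In the summary above, both main effects are significant at the .05 level, while the Type:Treatment interaction is not (p = 0.064). As a hedged illustration of what that interaction term measures, the sketch below computes the difference between the chilling effect in Quebec plants and in Mississippi plants from the four cell means. It assumes the design is balanced (84 observations split evenly over 4 cells); all variable names are hypothetical.

```r
# Sketch: the Type:Treatment interaction as a "difference of differences".
# All names below are illustrative; a balanced 2 x 2 design is assumed.
cell_means <- with(CO2, tapply(uptake, list(Type, Treatment), mean))

# Drop in mean uptake caused by chilling, within each Type
chill_drop_quebec <-
  cell_means["Quebec", "nonchilled"] - cell_means["Quebec", "chilled"]
chill_drop_mississippi <-
  cell_means["Mississippi", "nonchilled"] - cell_means["Mississippi", "chilled"]

# Interaction contrast: how much the chilling effect differs between Types
interaction_contrast <- chill_drop_quebec - chill_drop_mississippi

# In a balanced 2 x 2 design, SS(interaction) = n * contrast^2 / 4
n_per_cell <- nrow(CO2) / 4
ss_interaction <- n_per_cell * interaction_contrast^2 / 4
c(contrast = interaction_contrast, ss_interaction = ss_interaction)
```

The resulting sum of squares should reproduce the Type:Treatment entry in the aov() output.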
Question 2: Use the mtcars dataset in R
1) Use the table() function with the following combinations
1. The variables vs and am
attach(mtcars)
vs_am <- table(vs, am)
vs_am
##    am
## vs   0  1
##   0 12  6
##   1  7  7
2. The variables gear and carb
gear_carb <- table(gear, carb) ## Contingency table for gear and carb
gear_carb
##     carb
## gear 1 2 3 4 6 8
##    3 3 4 3 5 0 0
##    4 4 4 0 4 0 0
##    5 0 2 0 1 1 1
3. The variables cyl and gear
cyl_gear <- table(cyl, gear) ## Contingency table for cyl and gear
cyl_gear
##    gear
## cyl  3  4  5
##   4  1  8  2
##   6  2  4  1
##   8 12  0  2
4. For each of the three cases above guess what the results of a Chi-Squared analysis will be
Answer:
1. vs (Engine type) and am (Transmission): No significant association
2. gear (# forward gears) and carb (# carburetors): Significant association
3. cyl (# cylinders) and gear (# forward gears): Significant association
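One way to inform guesses like these is to compare the observed counts with the counts expected if the two variables were independent. The sketch below does this for vs and am; it rebuilds the table from mtcars so that it runs on its own, and the name expected is a hypothetical one.

```r
# Sketch: expected counts under independence for the vs x am table.
# The table is rebuilt here so the example is self-contained.
vs_am <- with(mtcars, table(vs, am))

# Expected count in each cell = (row total * column total) / grand total
expected <- outer(rowSums(vs_am), colSums(vs_am)) / sum(vs_am)
expected

# Observed minus expected: small gaps suggest a weak association
vs_am - expected
```

The same two lines applied to the gear/carb and cyl/gear tables show much larger gaps in some cells, for example the 12 eight-cylinder cars with 3 gears.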
5. Ignore warnings for low values in the cells
2) Perform a Chi-Squared analysis on the mtcars dataset for each of the three cases above
chisq.test(vs_am)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: vs_am
## X-squared = 0.34754, df = 1, p-value = 0.5555
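The output header shows that chisq.test() applied Yates' continuity correction, which it does by default for 2 x 2 tables. As a small sketch of how much the correction matters here, the test can be rerun with correct = FALSE. The names with_yates and without_yates are hypothetical, and the table is rebuilt from mtcars so the example stands alone.

```r
# Sketch: effect of Yates' continuity correction on the vs x am test.
# with_yates and without_yates are illustrative names.
vs_am <- with(mtcars, table(vs, am))

with_yates <- chisq.test(vs_am)                      # default for 2 x 2 tables
without_yates <- chisq.test(vs_am, correct = FALSE)  # plain Pearson statistic

# Test statistics side by side
c(with_yates = unname(with_yates$statistic),
  without_yates = unname(without_yates$statistic))

# p-values side by side
c(with_yates = with_yates$p.value,
  without_yates = without_yates$p.value)
```

Either way the p-value stays above .05, so the conclusion of no significant association between vs and am does not depend on the correction.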
chisq.test(gear_carb)
## Warning in chisq.test(gear_carb): Chi-squared approximation may be
## incorrect
##
## Pearson's Chi-squared test
##
## data: gear_carb
## X-squared = 16.518, df = 10, p-value = 0.08573
chisq.test(cyl_gear)
## Warning in chisq.test(cyl_gear): Chi-squared approximation may be incorrect
##
## Pearson's Chi-squared test
##
## data: cyl_gear
## X-squared = 18.036, df = 4, p-value = 0.001214
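The assignment says to ignore the low-count warnings, and the results above do so. For readers who want a cross-check anyway, the sketch below shows two standard options from base R for the cyl and gear table: Fisher's exact test and a Monte Carlo p-value from chisq.test(). The seed, the number of replicates B, and all variable names are arbitrary choices made for this illustration.

```r
# Sketch: cross-checking the cyl x gear result when expected counts are small.
# Names, seed, and B are illustrative choices, not part of the original analysis.
cyl_gear <- with(mtcars, table(cyl, gear))

# How many cells have an expected count below 5? (source of the warning)
expected <- suppressWarnings(chisq.test(cyl_gear))$expected
sum(expected < 5)

# Option 1: Fisher's exact test, which does not rely on the approximation
fisher_result <- fisher.test(cyl_gear)

# Option 2: chi-squared statistic with a simulated (Monte Carlo) p-value
set.seed(510)
sim_result <- chisq.test(cyl_gear, simulate.p.value = TRUE, B = 10000)

c(fisher = fisher_result$p.value, simulated = sim_result$p.value)
```

If both p-values fall below .05, they support the same conclusion as the approximate test: cyl and gear are associated.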