R Markdown

The assignment consists of two parts. Create an R Markdown script (.Rmd) and generate an HTML output for the code and text. Remember to use code chunks, text, and the other components of reproducible research as required.

1. Use the CO2 dataset in R

  1. To get definitions of the columns, type help(CO2)
  1. Calculate means & standard deviations for 4 groups broken down by Type and Treatment
stats.mean <- aggregate(uptake ~ Type + Treatment, data = CO2, FUN = mean)
colnames(stats.mean) <- c("Type", "Treatment", "uptake.mean")
stats.sd <-  aggregate(uptake ~ Type + Treatment, data = CO2, FUN = sd)
colnames(stats.sd) <- c("Type", "Treatment", "uptake.sd")
stats <- merge(stats.mean, stats.sd)
stats
##          Type  Treatment uptake.mean uptake.sd
## 1 Mississippi    chilled    15.81429  4.058976
## 2 Mississippi nonchilled    25.95238  7.402136
## 3      Quebec    chilled    31.75238  9.644823
## 4      Quebec nonchilled    35.33333  9.596371
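
The two `aggregate()` calls and the `merge()` above can also be collapsed into a single call. The following is a minimal sketch of that alternative, not the approach used in this report; the name `stats_both` is introduced here purely for illustration.

```r
# Compute the mean and standard deviation per group in one aggregate() call.
# `stats_both` is a hypothetical name, not part of the original analysis.
stats_both <- aggregate(
  uptake ~ Type + Treatment,
  data = CO2,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# aggregate() returns the two statistics as a single matrix column;
# do.call(data.frame, ...) flattens it into uptake.mean and uptake.sd.
stats_both <- do.call(data.frame, stats_both)
stats_both
```

The row order may differ from the merged table above, since `merge()` sorts by the key columns while `aggregate()` follows the factor levels.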
  1. Perform one-way tests twice: once for Type and once for Treatment

For Type:

library(car)
## Warning: package 'car' was built under R version 3.4.4
## Loading required package: carData
## Warning: package 'carData' was built under R version 3.4.4
leveneTest(uptake ~ Type, data = CO2)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  0.1704 0.6808
##       82
oneway.test(uptake ~ Type, data = CO2, var.equal = TRUE)
## 
##  One-way analysis of means
## 
## data:  uptake and Type
## F = 43.519, num df = 1, denom df = 82, p-value = 3.835e-09

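Levene's test (p = 0.68) gives no evidence of unequal variances between the two groups, so the equal-variance one-way test is used. Its p-value (3.8e-09) indicates a significant difference in average CO2 uptake rate between Quebec plants and Mississippi plants.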

For Treatment:

leveneTest(uptake ~ Treatment, data = CO2)
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value Pr(>F)
## group  1  1.2999 0.2576
##       82
oneway.test(uptake ~ Treatment, data = CO2, var.equal = TRUE)
## 
##  One-way analysis of means
## 
## data:  uptake and Treatment
## F = 9.2931, num df = 1, denom df = 82, p-value = 0.003096

Levene's test (p = 0.26) again gives no evidence of unequal variances. The p-value of the one-way test (0.003) shows a significant difference in average CO2 uptake rate between chilled and nonchilled plants.
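
Because Treatment has only two levels, the equal-variance one-way test should agree with a pooled two-sample t test, with F equal to the square of t. The sketch below checks this on the same data; the object names `tt` and `ow` are introduced only for this illustration.

```r
# Pooled-variance t test and equal-variance one-way test on the same factor.
tt <- t.test(uptake ~ Treatment, data = CO2, var.equal = TRUE)
ow <- oneway.test(uptake ~ Treatment, data = CO2, var.equal = TRUE)

# With two groups, the F statistic equals the squared t statistic,
# and the two p-values coincide.
c(t_squared = unname(tt$statistic)^2, F = unname(ow$statistic))
all.equal(tt$p.value, ow$p.value)
```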

  1. Perform a two-way test for Type and Treatment
CO2.2wayANOVA <- aov(uptake ~ Type * Treatment, data = CO2)
CO2.2wayANOVA
## Call:
##    aov(formula = uptake ~ Type * Treatment, data = CO2)
## 
## Terms:
##                     Type Treatment Type:Treatment Residuals
## Sum of Squares  3365.534   988.114        225.730  5127.597
## Deg. of Freedom        1         1              1        80
## 
## Residual standard error: 8.005933
## Estimated effects may be unbalanced
summary(CO2.2wayANOVA)
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## Type            1   3366    3366  52.509 2.38e-10 ***
## Treatment       1    988     988  15.416 0.000182 ***
## Type:Treatment  1    226     226   3.522 0.064213 .  
## Residuals      80   5128      64                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

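From the two-way ANOVA, the p-value for the interaction between Type (origin of plant) and Treatment (chilled vs. nonchilled) is 0.064. This is above the conventional 0.05 threshold, so the interaction is not significant at that level; it is at most marginal evidence (significant only at the 0.10 level) that the effect of chilling on average CO2 uptake rate differs by plant origin. The main effects of Type and Treatment are both significant (p < 0.001).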
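
To see what the interaction term is measuring, it can help to compare the chilling effect within each plant origin. The following is a small sketch using base R only; `cell_means` and `chill_effect` are illustrative names, not objects from the analysis above.

```r
# Mean uptake for each Type x Treatment cell (rows: Type, columns: Treatment).
cell_means <- with(CO2, tapply(uptake, list(Type, Treatment), mean))
cell_means

# Drop in mean uptake due to chilling, computed separately per origin.
# The interaction term tests whether these two drops differ.
chill_effect <- cell_means[, "nonchilled"] - cell_means[, "chilled"]
chill_effect
```

Based on the group means reported earlier, the drop is roughly 3.6 for Quebec and 10.1 for Mississippi; the interaction p-value of 0.064 says this gap is suggestive but not conclusive at the 0.05 level.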

2. Use the mtcars dataset in R

  1. Use the table() function with the following combinations

  2. The variables vs and am

attach(mtcars)
vs_am <- table(vs, am)
vs_am
##    am
## vs   0  1
##   0 12  6
##   1  7  7
  1. The variables gear and carb
gear_carb <- table(gear, carb)
gear_carb
##     carb
## gear 1 2 3 4 6 8
##    3 3 4 3 5 0 0
##    4 4 4 0 4 0 0
##    5 0 2 0 1 1 1
  1. The variables cyl and gear
cyl_gear <- table(cyl, gear)
cyl_gear
##    gear
## cyl  3  4  5
##   4  1  8  2
##   6  2  4  1
##   8 12  0  2
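
The tables above rely on `attach(mtcars)`, which places the columns on the search path. As a hedged alternative, the same table can be built with `with()`, and margins or row proportions can be added for easier reading. The name `cyl_gear_alt` is hypothetical.

```r
# Build the cyl x gear table without attach(); with() scopes the lookup.
cyl_gear_alt <- with(mtcars, table(cyl, gear))

# Add row and column totals.
addmargins(cyl_gear_alt)

# Share of each gear count within each cylinder class (rows sum to 1).
round(prop.table(cyl_gear_alt, margin = 1), 2)
```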
  1. For each of the three cases above guess what the results of a Chi-Squared analysis will be
    1. Engine type and transmission: no significant difference
    2. Forward gears and carburetors: significant difference
    3. Cylinders and forward gears: significant difference
  2. Ignore warnings for low values in the cells

  3. Perform a Chi-Squared analysis on the mtcars dataset for each of the three cases above

chisq.test(vs_am)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  vs_am
## X-squared = 0.34754, df = 1, p-value = 0.5555

The null hypothesis of the Chi-squared test is that engine type (V-shaped or straight) and transmission (automatic or manual) are independent. The p-value is 0.5555 > 0.05, so we fail to reject the null hypothesis: the data give no evidence of an association between the two variables.
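
For a 2x2 table, `chisq.test()` applies Yates' continuity correction by default, which is why the output header mentions it. The sketch below reproduces the statistic by hand from the cell counts, as a check on the reported value; the variable names are illustrative, and the table is rebuilt with `mtcars$` so the block is self-contained.

```r
# Rebuild the 2x2 table of engine shape (vs) by transmission (am).
vs_am <- table(vs = mtcars$vs, am = mtcars$am)

n <- sum(vs_am)
# |ad - bc| for the 2x2 table.
cross_diff <- abs(vs_am[1, 1] * vs_am[2, 2] - vs_am[1, 2] * vs_am[2, 1])
# Product of all four marginal totals.
denom <- prod(rowSums(vs_am)) * prod(colSums(vs_am))

# Pearson statistic without and with Yates' correction.
x2_uncorrected <- n * cross_diff^2 / denom
x2_yates <- n * (cross_diff - n / 2)^2 / denom

c(uncorrected = x2_uncorrected, yates = x2_yates)

# The same two values from chisq.test() itself.
chisq.test(vs_am, correct = FALSE)$statistic
chisq.test(vs_am)$statistic
```

The corrected value matches the X-squared of 0.34754 reported above.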

chisq.test(gear_carb)
## Warning in chisq.test(gear_carb): Chi-squared approximation may be
## incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  gear_carb
## X-squared = 16.518, df = 10, p-value = 0.08573

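The null hypothesis of the Chi-squared test is that the number of forward gears and the number of carburetors are independent. The p-value is 0.086 > 0.05, so we fail to reject the null hypothesis at the 5% level: the data do not provide sufficient evidence that the two variables are associated. Because of the low cell counts, R also warns that the Chi-squared approximation may be inaccurate for this table.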

chisq.test(cyl_gear)
## Warning in chisq.test(cyl_gear): Chi-squared approximation may be incorrect
## 
##  Pearson's Chi-squared test
## 
## data:  cyl_gear
## X-squared = 18.036, df = 4, p-value = 0.001214

The null hypothesis of the Chi-squared test is that the number of cylinders and the number of forward gears are independent. The p-value is 0.001 < 0.05, so we reject the null hypothesis and conclude that the two variables are not independent.
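
The "approximation may be incorrect" warnings arise when expected cell counts are small (a common rule of thumb is below 5). Although the assignment says to ignore them, the sketch below shows one way to inspect the expected counts and to obtain a Monte Carlo p-value that does not rely on the Chi-squared approximation. This is an optional check under stated assumptions, not part of the required analysis; the seed and `B = 10000` are arbitrary choices.

```r
# Rebuild the gear x carb table so the block is self-contained.
gear_carb <- table(gear = mtcars$gear, carb = mtcars$carb)

# Expected counts under independence; warnings suppressed on purpose here.
expected <- suppressWarnings(chisq.test(gear_carb))$expected
round(expected, 2)

# Number of cells with an expected count below 5.
sum(expected < 5)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df_manual <- (nrow(gear_carb) - 1) * (ncol(gear_carb) - 1)
df_manual

# Monte Carlo p-value (seed and B are arbitrary choices).
set.seed(1)
sim <- chisq.test(gear_carb, simulate.p.value = TRUE, B = 10000)
sim$p.value
```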