ASSIGNMENT3

Assignment #3 - t-tests and chi-squared

The assignment consists of 2 parts. Create an R Markdown script (.Rmd) and generate an HTML output for the code and text. Remember to use code chunks, text, and other components of reproducible research as required.

Use the CO2 dataset in R
- To get definitions of the columns type help(CO2)
- Calculate means & standard deviations for 4 groups broken down by Type and Treatment
- Perform one-way tests twice: once for Type and once for Treatment
- Perform a two-way test for Type and Treatment

Use the mtcars dataset in R
- Use the table() function with the following combinations
  - The variables vs and am
  - The variables gear and carb
  - The variables cyl and gear
- For each of the three cases above guess what the results of a Chi-Squared analysis will be
  - Ignore warnings for low values in the cells
- Perform a Chi-Squared analysis on the mtcars dataset for each of the three cases above

1. To get definitions of the columns type help(CO2)

str(CO2)

## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame':   84 obs. of  5 variables:
##  $ Plant    : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
##  $ Type     : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
##  $ conc     : num  95 175 250 350 500 675 1000 95 175 250 ...
##  $ uptake   : num  16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
##  - attr(*, "formula")=Class 'formula'  language uptake ~ conc | Plant
##   .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv> 
##  - attr(*, "outer")=Class 'formula'  language ~Treatment * Type
##   .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv> 
##  - attr(*, "labels")=List of 2
##   ..$ x: chr "Ambient carbon dioxide concentration"
##   ..$ y: chr "CO2 uptake rate"
##  - attr(*, "units")=List of 2
##   ..$ x: chr "(uL/L)"
##   ..$ y: chr "(umol/m^2 s)"

summary(CO2)

##      Plant             Type         Treatment       conc     
##  Qn1    : 7   Quebec     :42   nonchilled:42   Min.   :  95  
##  Qn2    : 7   Mississippi:42   chilled   :42   1st Qu.: 175  
##  Qn3    : 7                                    Median : 350  
##  Qc1    : 7                                    Mean   : 435  
##  Qc3    : 7                                    3rd Qu.: 675  
##  Qc2    : 7                                    Max.   :1000  
##  (Other):42                                                  
##      uptake     
##  Min.   : 7.70  
##  1st Qu.:17.90  
##  Median :28.30  
##  Mean   :27.21  
##  3rd Qu.:37.12  
##  Max.   :45.50  
##

colnames(CO2)

## [1] "Plant"     "Type"      "Treatment" "conc"      "uptake"

2. Calculate means & standard deviations for 4 groups broken down by Type and Treatment

aggregate(CO2[, "uptake"], list(CO2[, "Type"], CO2[, "Treatment"]), mean)

##       Group.1    Group.2        x
## 1      Quebec nonchilled 35.33333
## 2 Mississippi nonchilled 25.95238
## 3      Quebec    chilled 31.75238
## 4 Mississippi    chilled 15.81429

aggregate(CO2[, "uptake"], list(CO2[, "Type"], CO2[, "Treatment"]), sd)

##       Group.1    Group.2        x
## 1      Quebec nonchilled 9.596371
## 2 Mississippi nonchilled 7.402136
## 3      Quebec    chilled 9.644823
## 4 Mississippi    chilled 4.058976
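The two aggregate() calls above can also be combined into one. The following is a minimal sketch using the formula interface of aggregate(); the name group_stats is chosen here for illustration and is not part of the original script.

```r
# Mean and SD of uptake for each Type x Treatment group in a single call.
# as.data.frame() drops the groupedData classes so plain data-frame methods apply.
group_stats <- aggregate(
  uptake ~ Type + Treatment,
  data = as.data.frame(CO2),
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)
group_stats
```

The uptake column of the result is a two-column matrix, so group_stats$uptake[, "mean"] and group_stats$uptake[, "sd"] pull out the individual statistics.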

3. Perform one-way tests twice: once for Type and once for Treatment

fit_Type=lm(uptake~Type, data=CO2)
anova(fit_Type)

## Analysis of Variance Table
## 
## Response: uptake
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## Type       1 3365.5  3365.5  43.519 3.835e-09 ***
## Residuals 82 6341.4    77.3                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

fit_Treatment=lm(uptake~Treatment, data=CO2)
anova(fit_Treatment)

## Analysis of Variance Table
## 
## Response: uptake
##           Df Sum Sq Mean Sq F value   Pr(>F)   
## Treatment  1  988.1  988.11  9.2931 0.003096 **
## Residuals 82 8718.9  106.33                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
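Because Type and Treatment each have only two levels, each one-way ANOVA above is equivalent to a pooled-variance two-sample t-test, which ties the analysis back to the t-tests named in the assignment title. The sketch below checks this for Type; the names tt_type and f_type are introduced here for illustration only.

```r
# Pooled-variance (Student) t-test of uptake by Type.
tt_type <- t.test(uptake ~ Type, data = CO2, var.equal = TRUE)

# F statistic from the one-way ANOVA for the same comparison.
f_type <- anova(lm(uptake ~ Type, data = CO2))[["F value"]][1]

# For a two-level factor, the squared t statistic equals F,
# and the two tests give the same p-value.
c(t_squared = unname(tt_type$statistic)^2, F = f_type)
tt_type$p.value
```

Note that t.test() defaults to the Welch test (var.equal = FALSE), which does not assume equal variances and would give a slightly different result from the ANOVA.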

- In the test for Type, the p-value is 3.835e-09, far below 0.05. The null hypothesis is therefore rejected: the uptake rate depends on plant origin.

- In the test for Treatment, the p-value is 0.0030957. The null hypothesis is rejected: the uptake rate depends on treatment type.

4. Perform a two-way test for Type and Treatment

fit_Type_Treatment=lm(uptake~Type*Treatment, data=CO2)
anova(fit_Type_Treatment)

## Analysis of Variance Table
## 
## Response: uptake
##                Df Sum Sq Mean Sq F value    Pr(>F)    
## Type            1 3365.5  3365.5 52.5086 2.378e-10 ***
## Treatment       1  988.1   988.1 15.4164 0.0001817 ***
## Type:Treatment  1  225.7   225.7  3.5218 0.0642128 .  
## Residuals      80 5127.6    64.1                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
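To see what the Type:Treatment interaction term is measuring, it can help to compare the effect of chilling within each Type. The following is a small sketch based on the group means; cell_means and chill_drop are names introduced here for illustration.

```r
# 2 x 2 table of mean uptake: rows are Type, columns are Treatment.
cell_means <- with(CO2, tapply(uptake, list(Type, Treatment), mean))
cell_means

# Drop in mean uptake from nonchilled to chilled, within each Type.
chill_drop <- cell_means[, "nonchilled"] - cell_means[, "chilled"]
chill_drop
```

If there were no interaction, the two drops would be about equal. Here the drop is larger for Mississippi plants than for Quebec plants, and the interaction F-test asks whether that difference is larger than chance variation would explain.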

- Because the p-values for both Type and Treatment are very low, the null hypothesis is rejected for each main effect: both Type and Treatment affect the uptake rate.
- Because the p-value for the interaction term is 0.064, which is above 0.05, we fail to reject the null hypothesis of no interaction: there is no significant evidence that the effect of Treatment on uptake depends on Type.

2. Use the mtcars dataset in R

Use the table() function with the following combinations

1. The variables vs and am

VsTransmission=with(mtcars, table(vs,am))
VsTransmission

##    am
## vs   0  1
##   0 12  6
##   1  7  7

2. The variables gear and carb

GearCarburetors=with(mtcars, table(gear, carb))
GearCarburetors

##     carb
## gear 1 2 3 4 6 8
##    3 3 4 3 5 0 0
##    4 4 4 0 4 0 0
##    5 0 2 0 1 1 1

3. The variables cyl and gear

CylindersGear=with(mtcars, table(cyl, gear))
CylindersGear

##    gear
## cyl  3  4  5
##   4  1  8  2
##   6  2  4  1
##   8 12  0  2
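One way to form a guess about each Chi-Squared result before running the test is to compare the observed counts with the counts expected if the two variables were independent. The sketch below assumes a hypothetical helper, expected_counts(), which is not part of the original script.

```r
# Expected counts under independence:
# (row total * column total) / grand total, for every cell.
expected_counts <- function(tab) {
  outer(rowSums(tab), colSums(tab)) / sum(tab)
}

vs_am <- with(mtcars, table(vs, am))
round(expected_counts(vs_am), 2)

cyl_gear <- with(mtcars, table(cyl, gear))
round(expected_counts(cyl_gear), 2)
```

When the observed table sits close to the expected one, as for vs and am, a large p-value is the natural guess. When some cells are far from expectation, as with the 12 eight-cylinder cars with 3 gears, a small p-value is more likely.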

2. Perform a Chi-Squared analysis on the mtcars dataset for each of the three cases above

chisq.test(VsTransmission)

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  VsTransmission
## X-squared = 0.34754, df = 1, p-value = 0.5555

chisq.test(GearCarburetors)

## Warning in chisq.test(GearCarburetors): Chi-squared approximation may be
## incorrect

## 
##  Pearson's Chi-squared test
## 
## data:  GearCarburetors
## X-squared = 16.518, df = 10, p-value = 0.08573

chisq.test(CylindersGear)

## Warning in chisq.test(CylindersGear): Chi-squared approximation may be
## incorrect

## 
##  Pearson's Chi-squared test
## 
## data:  CylindersGear
## X-squared = 18.036, df = 4, p-value = 0.001214

- The p-values for VsTransmission (0.5555) and GearCarburetors (0.08573) are both higher than 0.05, so we fail to reject the null hypothesis: there is no significant evidence that vs and am, or gear and carb, are associated.
- The p-value for the cylinder vs gear table is 0.0012141, so the null hypothesis is rejected and we conclude that cyl and gear are not independent. Because several cells have low counts (hence the warnings), the chi-squared approximation may be unreliable, and the conclusion should be rechecked with a larger sample.
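The assignment says to ignore the low-count warnings, but as an optional check, chisq.test() can compute a Monte Carlo p-value that does not rely on the chi-squared approximation. The following is a sketch for the cyl and gear table; the seed and the number of replicates B are arbitrary choices made here, not values from the original analysis.

```r
cyl_gear <- with(mtcars, table(cyl, gear))

# Smallest expected cell count; expected counts below 5 trigger the warning.
min(suppressWarnings(chisq.test(cyl_gear))$expected)

# Monte Carlo p-value based on simulated tables with the same margins.
set.seed(1)  # arbitrary seed, used only to make the simulation repeatable
sim_test <- chisq.test(cyl_gear, simulate.p.value = TRUE, B = 10000)
sim_test$p.value
```

If the simulated p-value stays below 0.05, the conclusion that cyl and gear are associated does not depend on the large-sample approximation.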