Exercise 1

To begin, we set the working directory and read in the CSV.

Then recode sex from the numeric codes 1 and 2 to the labels "male" and "female":

normtemp$sex = factor(normtemp$sex,
                      levels = c(1, 2),
                      labels = c("male", "female"))
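A quick way to confirm the recoding is to tabulate the result. The sketch below uses a short made-up vector of codes (`sex_codes` is a hypothetical name, not a column from the data) and shows that any code outside the declared levels silently becomes NA, which is worth checking for before testing.

```r
# Hypothetical codes for illustration; 3 is deliberately not a valid code.
sex_codes <- c(1, 2, 2, 1, 3)

sex <- factor(sex_codes,
              levels = c(1, 2),
              labels = c("male", "female"))

# Counts per label; useNA = "ifany" also shows codes that became NA.
table(sex, useNA = "ifany")
```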

State the null and alternative hypotheses:

µ_1 = mean male body temperature, µ_2 = mean female body temperature

Null hypothesis (H_0): µ_1 = µ_2

Alternative hypothesis (H_1): µ_2 > µ_1

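Then run a two-sample t-test (R defaults to the Welch version, which does not assume equal variances, and to a two-sided alternative):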

t.test(temp ~ normtemp$sex, data = normtemp)
## 
##  Welch Two Sample t-test
## 
## data:  temp by normtemp$sex
## t = -2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means between group male and group female is not equal to 0
## 95 percent confidence interval:
##  -0.53964856 -0.03881298
## sample estimates:
##   mean in group male mean in group female 
##             98.10462             98.39385

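The two-sided p-value (0.024) is below 0.05, so we reject the null hypothesis of equal means. The test as run is two-sided, while our alternative is one-sided. Because the t statistic is negative (male mean 98.10 < female mean 98.39), the difference lies in the direction of the alternative, so the one-sided p-value is half the two-sided value, about 0.012. Either way, the data support the conclusion that women have a higher mean body temperature (µ_2 > µ_1).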
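Since the alternative stated for this exercise is one-sided, the test can also be run with a one-sided alternative directly. The following is a minimal sketch on made-up temperatures (the `toy` data frame and its values are illustrative assumptions, not the normtemp data). Because "male" is the first factor level, t.test() estimates the male mean minus the female mean, so µ_2 > µ_1 corresponds to alternative = "less".

```r
# Made-up temperatures for illustration only; NOT the normtemp data.
toy <- data.frame(
  temp = c(97.8, 98.0, 98.1, 98.3, 97.9, 98.2,   # male
           98.2, 98.4, 98.5, 98.6, 98.3, 98.7),  # female
  sex  = factor(rep(c("male", "female"), each = 6),
                levels = c("male", "female"))
)

# Two-sided Welch test (the default).
two_sided <- t.test(temp ~ sex, data = toy)

# One-sided Welch test: H_1 is mean(male) - mean(female) < 0.
one_sided <- t.test(temp ~ sex, data = toy, alternative = "less")

# When t is negative, the one-sided p-value is half the two-sided one.
two_sided$statistic
two_sided$p.value
one_sided$p.value
```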

Exercise 2

Read the CSV into R:

gap = read.csv("/Users/annapeterson/Desktop/Classes/GEOG6000/Lab02/gapC.csv")

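The ANOVA compares mean life expectancy across the continent groups, so the hypotheses concern all group means at once rather than a single difference between two means. The 6 degrees of freedom for continent in the ANOVA table imply 7 groups.

Null hypothesis (H_0): µ_1 = µ_2 = ... = µ_7 (all continent means are equal)

Alternative hypothesis (H_1): at least one continent mean differs from the others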

Make a boxplot of life expectancy by continent:

boxplot(lifeexpectancy ~ continent,
        data = gap,
        main = "Life Expectancy per Continent",
        xlab = "Continent",
        ylab = "Life expectancy")

Run a one-way ANOVA and report the F statistic and p-value:

# Named gap_aov rather than anova to avoid masking R's anova() function
gap_aov = aov(gap$lifeexpectancy ~ gap$continent, data = gap)
summary(gap_aov)
##                Df Sum Sq Mean Sq F value Pr(>F)    
## gap$continent   6   9757  1626.2   37.57 <2e-16 ***
## Residuals     165   7141    43.3                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness

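The p-value (< 2e-16) is far below 0.05, so we reject the null hypothesis: mean life expectancy differs between continents. The F-test only shows that at least one continent mean differs from the others; it does not identify which ones.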
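To pull the F statistic and p-value out of a fitted model programmatically, and to see which pairs of groups differ, something like the following could be used. It is a sketch on made-up data (`toy`, `group`, and `y` are illustrative names, not columns of gapC.csv). TukeyHSD() is one common follow-up to a significant one-way ANOVA; it is not something the analysis above ran.

```r
# Made-up data for illustration only: three groups of four values each.
toy <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 4)),
  y     = c(60, 62, 61, 63,
            70, 72, 71, 73,
            80, 82, 81, 83)
)

fit <- aov(y ~ group, data = toy)

# summary() returns a list; its first element is the ANOVA table.
tab     <- summary(fit)[[1]]
f_value <- tab[["F value"]][1]   # row 1 is the grouping term
p_value <- tab[["Pr(>F)"]][1]
f_value
p_value

# Pairwise comparisons with Tukey's honest significant differences.
tukey <- TukeyHSD(fit)
tukey$group
```

The first-row degrees of freedom in the table equal the number of groups minus one, which gives a quick check that the grouping variable was read as intended.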