To begin we set a working directory and read in the CSV.
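The setup step is not shown in the report, so the following is only a minimal sketch of what it might look like. The folder and the file name `normtemp.csv` are assumptions, and the two-row file is made up so that the snippet runs on its own; substitute the real lab folder and data file.

```r
# Hypothetical location: tempdir() stands in for the real lab folder.
lab_dir <- tempdir()
csv_path <- file.path(lab_dir, "normtemp.csv")  # assumed file name

# Made-up two-row file so the sketch is self-contained (NOT the lab data)
write.csv(data.frame(temp = c(98.6, 98.2), sex = c(1, 2)),
          csv_path, row.names = FALSE)

# Set the working directory, then read the CSV relative to it
setwd(lab_dir)
normtemp <- read.csv("normtemp.csv")

# Inspect column names and types before recoding
str(normtemp)
```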
Then recode sex from 1 and 2 to male and female:
normtemp$sex = factor(normtemp$sex,
                      levels = c(1, 2),
                      labels = c("male", "female"))
Establish our null and alternative hypotheses:
µ_1 = mean male body temperature, µ_2 = mean female body temperature
Null hypothesis (h_0): µ_1 = µ_2
Alternative hypothesis (h_1): µ_2 > µ_1
Then run a two-sample t-test (by default R runs the Welch version, which does not assume equal variances):
t.test(temp ~ normtemp$sex, normtemp)
##
## Welch Two Sample t-test
##
## data: temp by normtemp$sex
## t = -2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means between group male and group female is not equal to 0
## 95 percent confidence interval:
## -0.53964856 -0.03881298
## sample estimates:
## mean in group male mean in group female
## 98.10462 98.39385
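The p-value (0.02394) is below 0.05, so we reject the null hypothesis at the 5% significance level. The sample mean for females (98.39) is higher than for males (98.10), and the 95% confidence interval for the male minus female difference (-0.54 to -0.04) lies entirely below zero, which supports µ_2 > µ_1. Note that the test as run is two-sided, whereas our alternative hypothesis is one-sided.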
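Because the alternative hypothesis is directional, a one-sided test is the closer match. The sketch below shows how that could be requested with the `alternative` argument of `t.test`. It uses a made-up data frame `demo` (not the normtemp measurements), so its p-values are illustrative only. With the formula interface the difference is taken as first level minus second level (male minus female), so µ_2 > µ_1 corresponds to `alternative = "less"`.

```r
# Made-up illustration data (NOT the normtemp measurements):
# 65 evenly spaced values per group, female group shifted up by 0.3.
spread <- seq(-1, 1, length.out = 65)
demo <- data.frame(
  temp = c(98.1 + spread, 98.4 + spread),
  sex  = factor(rep(c(1, 2), each = 65),
                levels = c(1, 2),
                labels = c("male", "female"))
)

# Two-sided Welch test (the default)
two_sided <- t.test(temp ~ sex, data = demo)

# One-sided Welch test: H_1 is mean(male) - mean(female) < 0
one_sided <- t.test(temp ~ sex, data = demo, alternative = "less")

two_sided$p.value
one_sided$p.value
```

When the t statistic falls in the hypothesized direction, as it does in the report (t = -2.2854), the one-sided p-value is half the two-sided one, so the conclusion would not change.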
# Exercise 2
Read the CSV into R:
gap = read.csv("/Users/annapeterson/Desktop/Classes/GEOG6000/Lab02/gapC.csv")
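We test whether mean life expectancy is the same on every continent. The data contain 7 continent groups (the ANOVA table below reports 6 degrees of freedom for continent), so the hypotheses are:
Null hypothesis (h_0): µ_1 = µ_2 = ... = µ_7 (all continent means are equal)
Alternative hypothesis (h_1): at least one continent mean differs from the others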
Make a boxplot of life expectancy by continent:
boxplot(gap$lifeexpectancy ~ gap$continent,
        data = gap,
        main = "Life Expectancy per Continent",
        xlab = "Continent",
        ylab = "Life expectancy")
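Run a one-way ANOVA and report the F-statistic and p-value (the result is stored as gap_aov so that it does not mask R's built-in anova function):
gap_aov = aov(gap$lifeexpectancy ~ gap$continent, data = gap)
summary(gap_aov)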
## Df Sum Sq Mean Sq F value Pr(>F)
## gap$continent 6 9757 1626.2 37.57 <2e-16 ***
## Residuals 165 7141 43.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness
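The p-value is very low (< 2e-16), so we reject the null hypothesis: mean life expectancy is not the same on every continent. The ANOVA shows only that at least one continent mean differs; it does not show which ones.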
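As a sanity check, the F value can be reproduced from the sums of squares and degrees of freedom printed in the table. The sketch below uses only those printed (already rounded) numbers, so small rounding differences are to be expected; the variable names are chosen here for illustration.

```r
# Values copied from the ANOVA table above (rounded in the printout)
ss_between <- 9757
df_between <- 6
ss_within  <- 7141
df_within  <- 165

# Mean squares are sums of squares divided by their degrees of freedom
ms_between <- ss_between / df_between   # about 1626.2
ms_within  <- ss_within / df_within     # about 43.3

# F is the ratio of between-group to within-group mean squares
f_value <- ms_between / ms_within       # about 37.57

# Upper-tail probability under the F distribution with (6, 165) df
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

f_value
p_value
```

The ratio agrees with the reported F value of 37.57, and the tail probability is far below 0.05, consistent with the `<2e-16` shown in the table.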