EXERCISE 1
The null hypothesis is that mean body temperature is the same for males and females. The alternative hypothesis is that mean body temperature differs between males and females.
Read in data
normtemp = read.csv("/Users/jessieeastburn/Documents/Fall 2021/GEOG 6000/datafiles/normtemp.csv")
names(normtemp)
## [1] "X" "temp" "sex" "weight"
Split the temperature values into two groups. First, convert sex to a factor so that codes 1 and 2 are labeled male and female, respectively
normtemp$fsex <- factor(normtemp$sex,
                        levels = c(1, 2),
                        labels = c("male", "female"))
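As a quick check of the recoding, the same factor() call can be tried on a small made-up vector of codes. The vector `codes` below is hypothetical and is not taken from normtemp.csv.

```r
# Hypothetical sex codes, for illustration only (not from normtemp.csv)
codes <- c(1, 2, 2, 1)

# Same recoding as above: code 1 is labeled "male", code 2 "female"
fcodes <- factor(codes,
                 levels = c(1, 2),
                 labels = c("male", "female"))

# Count how many observations fall in each group
counts <- table(fcodes)
print(counts)
```

Running table(normtemp$fsex) on the real data would be one way to confirm the group sizes before splitting.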
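Split into male and female subsets
# Subset by the factor rather than by row position, so the split does not depend on row order
normtempmale = normtemp[normtemp$fsex == "male", ]
normtempfemale = normtemp[normtemp$fsex == "female", ]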
Run the t.test
t.test(normtempfemale$temp, normtempmale$temp, alternative = 'two.sided')
##
## Welch Two Sample t-test
##
## data: normtempfemale$temp and normtempmale$temp
## t = 2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.03881298 0.53964856
## sample estimates:
## mean of x mean of y
## 98.39385 98.10462
The t-statistic is 2.2854 and the p-value is 0.02394. Is there a basis for a difference in body temperature between men and women? Yes. Because the p-value is less than 0.05, we can reject the null hypothesis of no difference at the 5% significance level. The data therefore support a difference in mean body temperature between men and women: in this sample the female mean is 98.39 and the male mean is 98.10, and the 95% confidence interval for the difference (0.039 to 0.540) does not include 0.
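The reported p-value and confidence interval can be roughly reproduced from the numbers printed in the t.test output. This is a minimal sketch that uses the rounded values shown above rather than the raw data, so small rounding differences are expected. The variable names are illustrative only.

```r
# Values copied from the t.test output above (rounded as printed)
t_stat <- 2.2854
df     <- 127.51
mean_f <- 98.39385   # mean of x (female)
mean_m <- 98.10462   # mean of y (male)

# Two-sided p-value: probability of a |t| at least this large under the null
p_value <- 2 * pt(-abs(t_stat), df)

# Standard error recovered from the reported t (t = difference / SE)
diff_means <- mean_f - mean_m
se         <- diff_means / t_stat

# 95% confidence interval for the difference in means
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se

print(round(p_value, 5))
print(round(ci, 4))
```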
EXERCISE 2
Read in gapC.csv
gapc = read.csv("/Users/jessieeastburn/Documents/Fall 2021/GEOG 6000/datafiles/gapC.csv")
The null hypothesis is that mean life expectancy is the same across continents. The alternative hypothesis is that mean life expectancy differs for at least one continent.
Create a boxplot of life expectancy by continent
newgapc = gapc[,c("lifeexpectancy","continent")]
boxplot(lifeexpectancy~continent, data=newgapc, xlab="Continent", ylab = "Life Expectancy", main = "Life Expectancy Per Continent")
Carry out the ANOVA and give the F-statistic and the p-value obtained
# Store the fit under a name that does not mask the built-in anova() function
lifeexp_aov <- aov(lifeexpectancy ~ continent, data = newgapc)
summary(lifeexp_aov)
##              Df Sum Sq Mean Sq F value Pr(>F)
## continent     6   9757  1626.2   37.57 <2e-16 ***
## Residuals   165   7141    43.3
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness
The F-statistic is 37.57 and the p-value is less than 2e-16. Because the p-value is below the significance level of 0.05, we can reject the null hypothesis and conclude that mean life expectancy differs between at least two continents.
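The F-statistic can be recovered from the sums of squares and degrees of freedom in the ANOVA table, since F is the ratio of the between-group mean square to the residual mean square. This is a minimal sketch using the rounded values printed above. The variable names are illustrative only.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_between <- 9757   # Sum Sq for continent
ss_within  <- 7141   # Sum Sq for Residuals
df_between <- 6
df_within  <- 165

# Mean squares are sums of squares divided by their degrees of freedom
ms_between <- ss_between / df_between
ms_within  <- ss_within / df_within

# F is the ratio of between-group to within-group variability
f_stat <- ms_between / ms_within

# Upper-tail probability of the F distribution gives the p-value
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

print(round(f_stat, 2))
print(p_value < 2e-16)
```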