GEOG 6000 Lab 2: Inference

EXERCISE 1

The null hypothesis is that mean body temperature is the same for males and females. The alternative hypothesis is that mean body temperature differs between males and females.

Read in data

normtemp = read.csv("/Users/jessieeastburn/Documents/Fall 2021/GEOG 6000/datafiles/normtemp.csv")
names(normtemp)

## [1] "X"      "temp"   "sex"    "weight"

Split the temperature values into two groups. First, convert sex to a factor so that codes 1 and 2 are labeled male and female, respectively

normtemp$fsex <- factor(normtemp$sex, 
                       levels = c(1, 2),
                       labels = c("male", "female"))

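Split into male and female data frames

# Select rows by factor label so the split does not depend on row order
normtempmale = normtemp[normtemp$fsex == "male", ]
normtempfemale = normtemp[normtemp$fsex == "female", ]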
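Selecting rows by factor label gives the same groups however the rows are ordered. A minimal sketch on a small made-up data frame (the `toy` values are illustrative assumptions, not taken from normtemp.csv):

```r
# Hypothetical toy data (not from normtemp.csv); rows are deliberately unsorted
toy <- data.frame(
  temp = c(98.6, 98.2, 97.9, 98.8, 98.4, 98.0),
  sex  = c(2, 1, 1, 2, 2, 1)
)

# Map the numeric codes to labels, mirroring the factor() call used for normtemp
toy$fsex <- factor(toy$sex, levels = c(1, 2), labels = c("male", "female"))

# Select rows by label rather than by row position
toy_male   <- toy[toy$fsex == "male", ]
toy_female <- toy[toy$fsex == "female", ]

# split() returns both temperature vectors at once as a named list
temp_by_sex <- split(toy$temp, toy$fsex)
print(temp_by_sex)
```

Either form can be passed to t.test(); split() is convenient when only the temperature vectors are needed.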

Run the t.test

t.test(normtempfemale$temp, normtempmale$temp, alternative = 'two.sided')

## 
##  Welch Two Sample t-test
## 
## data:  normtempfemale$temp and normtempmale$temp
## t = 2.2854, df = 127.51, p-value = 0.02394
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.03881298 0.53964856
## sample estimates:
## mean of x mean of y 
##  98.39385  98.10462

The t-statistic is 2.2854 and the p value is 0.02394. Is there a basis for difference in body temperature between men and women? Yes. Because the p value is less than 0.05, we reject the null hypothesis of no difference at the 5% significance level. The data therefore support the alternative hypothesis that mean body temperature differs between men and women: the female sample mean (98.39) is higher than the male sample mean (98.10), and the 95 percent confidence interval for the difference (0.04 to 0.54) does not include 0.
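To show where the t-statistic and the fractional degrees of freedom come from, here is a minimal sketch of the Welch calculation done by hand and checked against t.test(). The helper name `welch_t` and the simulated vectors `x` and `y` are assumptions for illustration only; they are not the normtemp data, so the numbers will differ from the output above.

```r
# Hypothetical helper: Welch two-sample t-test computed step by step
welch_t <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  vx <- var(x) / nx   # squared standard error of mean(x)
  vy <- var(y) / ny   # squared standard error of mean(y)
  t_stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  # Welch-Satterthwaite approximation to the degrees of freedom
  df <- (vx + vy)^2 / (vx^2 / (nx - 1) + vy^2 / (ny - 1))
  # Two-sided p value from the t distribution
  p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
  c(t = t_stat, df = df, p = p_value)
}

# Simulated data (assumed means and standard deviation, not the lab data)
set.seed(1)
x <- rnorm(65, mean = 98.4, sd = 0.7)
y <- rnorm(65, mean = 98.1, sd = 0.7)

by_hand <- welch_t(x, y)
builtin <- t.test(x, y, alternative = "two.sided")
print(by_hand)
print(builtin)
```

The two results should agree, which is why the reported df (127.51) is not a whole number: the Welch test does not assume equal variances in the two groups.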

EXERCISE 2

Read in gapC.csv

gapc = read.csv("/Users/jessieeastburn/Documents/Fall 2021/GEOG 6000/datafiles/gapC.csv")

The null hypothesis is that mean life expectancy is the same on every continent. The alternative hypothesis is that mean life expectancy differs for at least one continent.

Create a boxplot of life expectancy by continent

newgapc = gapc[,c("lifeexpectancy","continent")]
boxplot(lifeexpectancy~continent, data=newgapc, xlab="Continent", ylab = "Life Expectancy", main = "Life Expectancy Per Continent")

Carry out the ANOVA and give the F-statistic and the p-value obtained

# One-way ANOVA of life expectancy on continent
lifeexp_aov <- aov(lifeexpectancy ~ continent, data = newgapc)
summary(lifeexp_aov)

##              Df Sum Sq Mean Sq F value Pr(>F)    
## continent     6   9757  1626.2   37.57 <2e-16 ***
## Residuals   165   7141    43.3                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1 observation deleted due to missingness

The F statistic is 37.57 and the p value is less than 2e-16. Because the p value is far below the significance level of 0.05, we reject the null hypothesis and conclude that mean life expectancy varies across continents. The ANOVA shows only that at least one continent's mean differs from the others; it does not identify which ones.
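As a check on the table, the F statistic can be recomputed from the sums of squares and degrees of freedom that summary() reports. This is a sketch using the rounded values printed above, so the result matches the reported F only to about two decimal places; the variable names are illustrative.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_continent <- 9757
df_continent <- 6
ss_resid     <- 7141
df_resid     <- 165

# Mean squares are sums of squares divided by their degrees of freedom
ms_continent <- ss_continent / df_continent
ms_resid     <- ss_resid / df_resid

# F is the ratio of between-group to within-group mean squares
f_stat <- ms_continent / ms_resid

# Upper-tail probability of the F distribution gives the p value
p_value <- pf(f_stat, df_continent, df_resid, lower.tail = FALSE)

print(round(f_stat, 2))
print(p_value < 2e-16)
```

A large F means the variation between continent means is large relative to the variation within continents, which is what drives the very small p value.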

Jessie Eastburn

9/10/2021