t tests in R

One of the most common statistical tests is the t test, which is used to determine whether the means of two groups differ significantly. It assumes the data in each group are approximately normally distributed.

Let’s create two random vectors and test whether their means are the same.

What’s the Null Hypothesis? The Alternate Hypothesis?
x = rnorm(10) # Creates a normally distributed vector with 10 observations. Defaults are mean = 0 and sd = 1
y = rnorm(10)

t.test(x,y)
## 
##  Welch Two Sample t-test
## 
## data:  x and y
## t = -0.6704, df = 13.39, p-value = 0.514
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.242119  0.652459
## sample estimates:
##   mean of x   mean of y 
## -0.27075102  0.02407915

Look at the t test output. Are these results what we expected?

Both vectors were drawn from the same distribution, so their true means are equal. Does the p value look right?
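One way to build intuition for that question is to repeat the experiment many times. The sketch below is not part of the original walkthrough: the seed and the number of repetitions (2000) are arbitrary choices for illustration. It draws two same-mean samples over and over and records how often the p value falls below .05.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Repeat the "two random vectors" experiment many times and keep each p value
p_values <- replicate(2000, t.test(rnorm(10), rnorm(10))$p.value)

# Share of runs that would be called "significant" at the 5% level
false_positive_rate <- mean(p_values < 0.05)
false_positive_rate
```

Because the null hypothesis is true in every run, only a small share of the p values should land below .05, so a large p value like the one above is what we would usually expect.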

Types of t tests

- one sample t test

- two sample t test

One Sample t test

Comparing a sample mean to a known or hypothesized mean

What a one-sample testing question would look like:

The population mean of the so-and-so last year was XXX, and you took a sample of 25 this year whose average was YYY with a standard deviation of ZZZ. Is this significantly different from last year’s data? Test at the 5% significance level.

For a one sample t test, you need a numeric vector of the data you want to test, the mean to test against (mu), and a decision on whether the test is one-sided or two-sided.

Here is the function call needed to perform a one-sided t test:

TestingVector = rnorm(25,mean = 23, sd = 3) # A sample of just 25 with a mean of 23 and an SD of 3
t.test(TestingVector, mu = 27, alternative = "less") # Testing against a population mean of 27 and a one sided test
## 
##  One Sample t-test
## 
## data:  TestingVector
## t = -5.645, df = 24, p-value = 4.103e-06
## alternative hypothesis: true mean is less than 27
## 95 percent confidence interval:
##      -Inf 24.31014
## sample estimates:
## mean of x 
##  23.14035
# Of course you would use your own data, but here I just created a fake vector of data using the rnorm() function
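To see what t.test() is doing here, the statistic can be recomputed by hand. This is a minimal sketch, assuming the same simulated setup as above; the seed is an arbitrary addition, so the numbers will not match the output shown. It uses the standard one sample formula, (sample mean - mu) / (sd / sqrt(n)), with n - 1 degrees of freedom.

```r
set.seed(123)  # arbitrary seed; results will differ from the output above
TestingVector <- rnorm(25, mean = 23, sd = 3)

n  <- length(TestingVector)
mu <- 27  # the population mean we test against

# t statistic: distance between sample mean and mu, in standard-error units
t_manual <- (mean(TestingVector) - mu) / (sd(TestingVector) / sqrt(n))

# One-sided ("less") p value: lower tail of the t distribution with n - 1 df
p_manual <- pt(t_manual, df = n - 1)

result <- t.test(TestingVector, mu = mu, alternative = "less")

# Both comparisons should return TRUE
all.equal(unname(result$statistic), t_manual)
all.equal(result$p.value, p_manual)
```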

Unpaired Two sample t test

Used to compare the means of two independent groups

Was there a difference between Treatment and Control? Men vs. Women? etc.

There are a couple of things you need to check beforehand:

Test the equality of variances with an F test and normality with a Shapiro-Wilk test

women_weight = c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men_weight = c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4) 

# Test the normality - using a Shapiro-Wilk test
shapiro.test(men_weight)
## 
##  Shapiro-Wilk normality test
## 
## data:  men_weight
## W = 0.86425, p-value = 0.1066
shapiro.test(women_weight)
## 
##  Shapiro-Wilk normality test
## 
## data:  women_weight
## W = 0.94266, p-value = 0.6101

Both p-values are greater than .05, so neither weight distribution is significantly different from a normal distribution.

Next, let’s test the variability and analyze the F statistic.

var.test(men_weight, women_weight)
## 
##  F test to compare two variances
## 
## data:  men_weight and women_weight
## F = 0.36134, num df = 8, denom df = 8, p-value = 0.1714
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.08150656 1.60191315
## sample estimates:
## ratio of variances 
##          0.3613398

The p value is above .05, meaning there is no significant difference between the variances of the two groups, so we can set var.equal = TRUE in the t test.
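The F statistic reported by var.test() is the ratio of the two sample variances, and the two-sided p value comes from the F distribution. The sketch below recomputes both by hand from the weight vectors; the helper names (f_manual, p_manual, and so on) are made up for illustration.

```r
women_weight <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men_weight   <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4)

# F statistic: variance of the first group divided by variance of the second
f_manual <- var(men_weight) / var(women_weight)

# Degrees of freedom are n - 1 for each group
df_men   <- length(men_weight) - 1
df_women <- length(women_weight) - 1

# Two-sided p value: twice the smaller tail probability
lower_tail <- pf(f_manual, df_men, df_women)
p_manual   <- 2 * min(lower_tail, 1 - lower_tail)

f_test <- var.test(men_weight, women_weight)

# Both comparisons should return TRUE
all.equal(unname(f_test$statistic), f_manual)
all.equal(f_test$p.value, p_manual)
```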

Finally, run a two sample t test

t.test(men_weight, women_weight, var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  men_weight and women_weight
## t = 2.7842, df = 16, p-value = 0.01327
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   4.029759 29.748019
## sample estimates:
## mean of x mean of y 
##  68.98889  52.10000
The p value (0.013) is below .05, so we can say that there is a significant difference between the mean weights of the two independent samples.
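Real data usually arrives in a data frame with one column for the measurement and one for the group, rather than as two separate vectors. As a sketch, the same test can be run through the formula interface of t.test(); the data frame and its column names (weights, weight, group) are assumptions made for this example.

```r
women_weight <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8, 48.5)
men_weight   <- c(67.8, 60, 63.4, 76, 89.4, 73.3, 67.3, 61.3, 62.4)

# Long format: one row per person, with a column identifying the group
weights <- data.frame(
  weight = c(men_weight, women_weight),
  group  = factor(rep(c("men", "women"), each = 9), levels = c("men", "women"))
)

# Formula interface: measurement ~ grouping variable
formula_result <- t.test(weight ~ group, data = weights, var.equal = TRUE)

# Two-vector interface, as used above
vector_result <- t.test(men_weight, women_weight, var.equal = TRUE)

# Should return TRUE: both calls run the same test
all.equal(unname(formula_result$statistic), unname(vector_result$statistic))
```

Leaving out var.equal = TRUE makes t.test() fall back to its default, the Welch test, which does not assume equal variances.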

Paired t test

Here’s a secret: the final type of t test we are looking at, the paired t test, is really just a one sample t test in disguise! It tests whether the mean of the differences within each pair is zero.

The p value is low and we can reject the null, meaning the groups are different.
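The "one sample t test in disguise" point can be checked directly. The sketch below uses made-up before and after measurements (simulated with rnorm() and an arbitrary seed, not real data) and compares a paired test, requested with paired = TRUE in t.test(), against a one sample test on the differences.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical data: the same 12 subjects measured twice
before <- rnorm(12, mean = 70, sd = 5)
after  <- before + rnorm(12, mean = -1, sd = 2)

# Paired test on the two measurements
paired_result <- t.test(before, after, paired = TRUE)

# One sample test on the within-subject differences against a mean of 0
one_sample_result <- t.test(before - after, mu = 0)

# Both comparisons should return TRUE
all.equal(unname(paired_result$statistic), unname(one_sample_result$statistic))
all.equal(paired_result$p.value, one_sample_result$p.value)
```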