Chapter 4. Samples & Student’s T-Test

(1) Z-test

jordan_1984_85 <- read.csv(file = "Data Files/jordan_1984_85.csv", stringsAsFactors = FALSE)
attach(jordan_1984_85)
scores
##  [1] 16 21 37 25 17 25 33 27 45 27 16 34 35 23 30 13 22 20 20 20 21 20 27
## [24] 21 34 14 18 34 32 45 21 25 22 42 36 23 24 35 27 25 38 29 22 45 26 38
## [47] 31 41 23 49 17 26 16 26 38 28 24 21 37 37 33 26 28 21 32 16 27 32 31
## [70] 38 20 26 38 35 38 31 25 40 33 22 28 29
sample_scores_1 <- sample(x = scores, size = 30); sample_scores_2 <- sample(x = scores, size = 30)

sample_mean_1 <- mean(sample_scores_1); sample_mean_2 <- mean(sample_scores_2)

population_mean <- mean(scores);

SE_1 <- sqrt(var(sample_scores_1)/length(sample_scores_1)); SE_2 <- sqrt(var(sample_scores_2)/length(sample_scores_2))

z_1 <- (sample_mean_1 - population_mean) / SE_1; z_2 <- (sample_mean_2 - population_mean) / SE_2

jordan_z_test_1 <- data.frame(
                             category = c("sample_1", "sample_2"), 
                             mean_sample = c(sample_mean_1, sample_mean_2), 
                             mean_ppl = c(population_mean, population_mean), 
                             Standard_error = c(SE_1, SE_2), 
                             Z_test = c(z_1, z_2)
                           )
jordan_z_test_1
##   category mean_sample mean_ppl Standard_error     Z_test
## 1 sample_1    27.70000 28.20732       1.432906 -0.3540478
## 2 sample_2    27.43333 28.20732       1.332773 -0.5807319
  • Now you can see that each sample produces a different z-value. More importantly, the two samples have slightly different means; this variation is called sampling error. However, the distribution of sample means has a mean close to the population mean, so for the z-test the mean of the sampling distribution is approximately equal to the population mean. As the sample size increases, the standard error decreases. Let’s see the table below.
sample_scores_3 <- sample(x = scores, size = 30); sample_scores_4 <- sample(x = scores, size = 45); sample_scores_5 <- sample(x = scores, size = 60)

sample_mean_3 <- mean(sample_scores_3); sample_mean_4 <- mean(sample_scores_4); sample_mean_5 <- mean(sample_scores_5)

population_mean <- mean(scores);

SE_3 <- sqrt(var(sample_scores_3)/length(sample_scores_3)); SE_4 <- sqrt(var(sample_scores_4)/length(sample_scores_4)); SE_5 <- sqrt(var(sample_scores_5)/length(sample_scores_5))

z_3 <- (sample_mean_3 - population_mean) / SE_3; z_4 <- (sample_mean_4 - population_mean) / SE_4; z_5 <- (sample_mean_5 - population_mean) / SE_5

jordan_z_test_2 <- data.frame(
                             category = c("sample_3", "sample_4", "sample_5"),
                             sample_size = c(30,45,60),
                             mean_sample = c(sample_mean_3, sample_mean_4, sample_mean_5), 
                             mean_ppl = c(population_mean, population_mean, population_mean), 
                             Standard_error = c(SE_3, SE_4, SE_5), 
                             Z_test = c(z_3, z_4, z_5)
                           )
jordan_z_test_2
##   category sample_size mean_sample mean_ppl Standard_error     Z_test
## 1 sample_3          30    26.76667 28.20732       1.538534 -0.9363784
## 2 sample_4          45    29.75556 28.20732       1.172563  1.3203880
## 3 sample_5          60    28.36667 28.20732       1.022064  0.1559097
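  • This behavior can also be checked directly by simulation. The sketch below is our own illustration (the seed and the number of replications are arbitrary choices); it reuses the scores vector and population_mean defined above.
# Illustrative sketch: draw many samples and inspect the distribution of their means
set.seed(1)  # arbitrary seed, only for reproducibility

# The average of 1,000 sample means (n = 30) sits close to the population mean
many_means_30 <- replicate(1000, mean(sample(x = scores, size = 30)))
mean(many_means_30)
population_mean

# The spread of the sample means (i.e., the standard error) shrinks as n grows
sd(many_means_30)
sd(replicate(1000, mean(sample(x = scores, size = 60))))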

(2) T-test (One-Sample)

  • Keywords: compare a sample mean to the population mean when the population standard deviation is unknown.
  • The one-sample t-test is typically used when the sample is small (fewer than about 30 observations) and the population standard deviation is unknown. Then why do we need it? The t-test is used to test hypotheses about the mean of the population from which a sample is drawn. In this case, the null hypothesis is that the population mean equals M0, and the alternative hypothesis is that it does not equal M0.
  • Now, let’s perform a one-sample t-test with the t.test() function.
single_t_value <- (mean(sample_scores_1) - mean(jordan_1984_85$scores))/ sqrt(var(sample_scores_1)/length(sample_scores_1))

single_t_value
## [1] -0.3540478
t.test(sample_scores_1, mu = 30, alternative = "less", conf.level = 0.95)
## 
##  One Sample t-test
## 
## data:  sample_scores_1
## t = -1.6051, df = 29, p-value = 0.05965
## alternative hypothesis: true mean is less than 30
## 95 percent confidence interval:
##      -Inf 30.13469
## sample estimates:
## mean of x 
##      27.7
  • From the output, we can see that the mean of Jordan’s scores in sample_scores_1 is 27.7. The one-sided 95% confidence interval tells us that the mean score is likely to be less than 30.13. The p-value of 0.05965 tells us that if Jordan’s true mean score were 30, the probability of selecting a sample with a mean less than or equal to this one would be approximately 6%.

  • Since the p-value is not less than the significance level of 0.05, we cannot reject the null hypothesis that the mean score is equal to 30. This means there is no evidence that Jordan’s true scoring average is below 30.
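  • If you want to reuse these numbers programmatically, the t.test() result can be stored as an object and its components accessed directly. The short sketch below is our own illustration (the object name one_sample_result is not from the original).
# Store the test result and extract its components (illustrative sketch)
one_sample_result <- t.test(sample_scores_1, mu = 30, alternative = "less", conf.level = 0.95)

one_sample_result$p.value   # the p-value reported above
one_sample_result$conf.int  # the one-sided 95% confidence interval

# Simple decision rule at alpha = 0.05
one_sample_result$p.value < 0.05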

(3) T-test (Dependent)

  • The dependent t-test compares two means from related samples drawn from the same population (e.g., the same subjects measured twice).
  • The observed value is the sample mean of the difference scores (e.g., each subject’s post-training score minus pre-training score).

  • The expected value is the population mean of the difference scores (zero under the null hypothesis).

  • The SE is the standard error of the mean difference.

  • The dependent test is popularly known as the pre-test vs. post-test design on the same group of subjects.

  • Let’s look at the data and a graph below.

wm <- read.csv(file = "Data Files/wm.csv", stringsAsFactors = FALSE)

wm_t <- subset(wm, wm$train == 1)

# summary statistics
library(psych)
describe(wm_t)
## Warning: NAs introduced by coercion
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning -Inf
##       vars  n  mean   sd median trimmed  mad min  max range skew kurtosis
## cond*    1 80   NaN   NA     NA     NaN   NA Inf -Inf  -Inf   NA       NA
## pre      2 80 10.03 1.37     10   10.03 1.48   8   12     4 0.10    -1.24
## post     3 80 13.51 2.54     14   13.50 2.97   7   19    12 0.00    -0.24
## gain     4 80  3.49 2.15      3    3.41 1.48  -1    9    10 0.34    -0.25
## train    5 80  1.00 0.00      1    1.00 0.00   1    1     0  NaN      NaN
##         se
## cond*   NA
## pre   0.15
## post  0.28
## gain  0.24
## train 0.00
  • From the output, researchers can see each variable’s vars (index), n, mean, sd, median, trimmed mean, mad, min, max, range, skew, kurtosis, and se.
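  • If you prefer to stay in base R, a few of these columns (n, mean, sd, median) can be cross-checked without the psych package; the snippet below is only an optional sketch of such a check.
# Optional cross-check of the numeric columns with base R (illustrative sketch)
sapply(wm_t[, c("pre", "post", "gain")],
       function(x) c(n = length(x), mean = mean(x), sd = sd(x), median = median(x)))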
# Create a boxplot with pre- and post-training groups 
boxplot(wm_t$pre, wm_t$post, main = "Boxplot",
        xlab = "Pre- and Post-Training", ylab = "Intelligence Score", 
        col = c("red", "green"))

  • In statistics, the main question is: “did this happen by chance or not?” Let us find out.

  • Conducting a dependent t-test, also known as a paired t-test, requires the following steps:
  • Define null and alternative hypotheses
  • Decide significance level α
  • Compute observed t-value
  • Find critical value
  • Compare observed value to critical value

  • In our case, the null hypothesis is that training has no effect, i.e. the mean of the difference (gain) scores is zero.

# Define the sample size
n <- nrow(wm_t)

# Mean of the difference scores
mean_diff <- sum(wm_t$gain) / n # mean(wm_t$gain)

# Standard deviation of the difference scores
sd_diff <- sqrt(sum((mean_diff - wm_t$gain)^2) / (n-1)) 

# Observed t-value
t_obs <- mean_diff / (sd_diff / sqrt(n))

t_obs
## [1] 14.49238
# Compute the critical value
t_crit <- qt(0.975, df = 79)

# Print the critical value
t_crit
## [1] 1.99045
# Print the observed t-value to compare 
t_obs
## [1] 14.49238
# Compute Cohen's d
cohens_d <- mean_diff / sd_diff

# View Cohen's d
cohens_d
## [1] 1.620297
  • Now we have both values, so let’s compare them. The observed t-value is 14.49238 and the critical value is 1.99045. The observed t-value is much larger than the critical value, which tells us that the difference is significant at a significance level of 0.05.

  • The value of cohens_d is 1.620297. A Cohen’s d of 1.62 means that the intelligence scores of our subjects changed by 1.62 standard deviations, which is a very large effect.
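  • Before verifying the result with t.test(), we can also convert the observed t-value into a p-value ourselves using the t distribution with n - 1 = 79 degrees of freedom; the lines below are our own sketch reusing the objects defined above.
# Two-tailed p-value implied by the observed t-value (illustrative sketch)
2 * pt(-abs(t_obs), df = n - 1)

# Decision at the 0.05 level: is the observed t-value beyond the critical value?
t_obs > t_crit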

# Apply the t.test function
t.test(wm_t$post, wm_t$pre, paired = TRUE)
## 
##  Paired t-test
## 
## data:  wm_t$post and wm_t$pre
## t = 14.492, df = 79, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  3.008511 3.966489
## sample estimates:
## mean of the differences 
##                  3.4875
# Calculate Cohen's d
# install.packages("lsr")
library(lsr)
cohensD(wm_t$post, wm_t$pre, method = "paired")
## [1] 1.620297
  • Question: compute the t-value under the following conditions (a worked sketch follows the list).
  • N = 100
  • Mean of the difference scores (Observed - Expected) = 10
  • Standard Error = 5
  • t-value = Mean of the difference scores (Observed - Expected) / Standard Error
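  • One possible worked answer, written as a short sketch (the variable names below are our own; N is only used for the reference critical value):
# Worked sketch for the question above
N <- 100
mean_diff_q <- 10   # mean of the difference scores (Observed - Expected)
se_q <- 5           # standard error of the mean difference

t_value <- mean_diff_q / se_q
t_value              # 10 / 5 = 2

# Reference: two-tailed critical value at alpha = 0.05 with df = N - 1,
# which is roughly 1.98, so the observed t-value of 2 would exceed it
qt(0.975, df = N - 1)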