Double check degrees of freedom vs. sample size in equations
Motulsky 2nd Ed, Chapter 30, page 220, Table 30.1. Maximal relaxation of muscle strips of old and young rat bladders stimulated with high concentrations of norepinephrine (Frazier et al. 2006). Response variable is %E.max.
This is not a standard R data format; it is used here only for illustration and for displaying the data.
# The data
df.rats <- data.frame(old = c(20.8,2.8,50.0,33.3,29.4,38.9, 29.4,52.6,14.3),
young = c(45.5,55.0, 60.7, 61.5, 61.1, 65.5,42.9,37.5, NA))
#Means
## Calculate means
mean.old <- with(df.rats, mean(old))
mean.young <- with(df.rats, mean(young, na.rm = TRUE))
##put means into a vector
means.rats <- c(mean.old,mean.young)
## add means to dataframe
df.rats2 <- rbind(df.rats, means.rats)
##name the row that the means are in
row.names(df.rats2)[dim(df.rats2)[1]] <- "means"
#Standard deviation
##calc sd and add it to the dataframe
sd.old <- with(df.rats, sd(old))
sd.young <- with(df.rats, sd(young, na.rm = TRUE))
sd.rats <- c(sd.old,sd.young)
df.rats2 <- rbind(df.rats2, sd.rats)
row.names(df.rats2)[dim(df.rats2)[1]] <- "SD"
#Sample size
n.old <- 9
n.young <- 8
n.rats <- c(n.old,n.young)
df.rats2 <- rbind(df.rats2, n.rats)
row.names(df.rats2)[dim(df.rats2)[1]] <- "N"
#Difference between means
## Calculate difference between the means
diff.means <- df.rats2["means",2]-
df.rats2["means",1]
## Add difference to dataframe
df.rats2 <- rbind(df.rats2, c(diff.means,NA))
##Label row
row.names(df.rats2)[dim(df.rats2)[1]] <- "Diff. means"
#pander package makes nice tables
library(pander)
pander(df.rats2)
&nbsp; | old | young |
---|---|---|
1 | 20.8 | 45.5 |
2 | 2.8 | 55 |
3 | 50 | 60.7 |
4 | 33.3 | 61.5 |
5 | 29.4 | 61.1 |
6 | 38.9 | 65.5 |
7 | 29.4 | 42.9 |
8 | 52.6 | 37.5 |
9 | 14.3 | NA |
means | 30.17 | 53.71 |
SD | 16.09 | 10.36 |
N | 9 | 8 |
Diff. means | 23.55 | NA |
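The summary rows above can also be computed in one pass rather than one `rbind()` at a time. This is a minimal sketch, assuming the same `df.rats` layout as above; the helper `summarise.col` and the object `summary.rats` are names made up for illustration and are not part of the original workflow. It counts N from the non-missing values instead of hard-coding it.

```r
# Same data as above; the NA pads the shorter 'young' column
df.rats <- data.frame(
  old   = c(20.8, 2.8, 50.0, 33.3, 29.4, 38.9, 29.4, 52.6, 14.3),
  young = c(45.5, 55.0, 60.7, 61.5, 61.1, 65.5, 42.9, 37.5, NA))

# Hypothetical helper: mean, SD and N of one column, ignoring NAs
summarise.col <- function(x) {
  c(means = mean(x, na.rm = TRUE),
    SD    = sd(x, na.rm = TRUE),
    N     = sum(!is.na(x)))
}

# sapply() returns a matrix with one column per group
summary.rats <- sapply(df.rats, summarise.col)
summary.rats
```

The result is a matrix rather than a data frame, so it would need `as.data.frame()` before being combined with the raw data by `rbind()`.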
Steps
I’ll make some functions that carry out these steps.
(Need to double-check this; I am not very good at code for equations yet.)
In words, in terms of variance: \[var_p = {\frac{df_1*var_1 + df_2*var_2}{df_1 + df_2}}\]
In words, in terms of standard deviation (SD): \[var_p = {\frac{df_1*SD^2_1 + df_2*SD^2_2}{df_1 + df_2}}\]
In symbols, in terms of the variance (sigma^2): \[\sigma^2_p = {\frac{df_1*\sigma^2_1 + df_2*\sigma^2_2}{df_1 + df_2}}\]
Give this function the two degrees of freedom and the two standard deviations, and it returns the pooled variance.
#Formula for POOLED variance
## Note the formula squares each SD to get the variance
var.pooled <- function(df1,df2,SD1,SD2){
(df1*SD1^2 + df2*SD2^2)/(df1+df2)
}
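One way to see what the function does: the pooled variance is a weighted average of the two variances, with the degrees of freedom as the weights. The sketch below uses made-up numbers (not the rat data) to show this; `equal.df` and `unequal.df` are illustrative names only.

```r
# Pooled variance, as defined above
var.pooled <- function(df1, df2, SD1, SD2){
  (df1*SD1^2 + df2*SD2^2)/(df1 + df2)
}

# Made-up values: SDs of 2 and 4, i.e. variances of 4 and 16

# Equal df: the result is the plain average of the two variances
equal.df <- var.pooled(df1 = 5, df2 = 5, SD1 = 2, SD2 = 4)

# Unequal df: the result is pulled toward the group with more df
unequal.df <- var.pooled(df1 = 15, df2 = 5, SD1 = 2, SD2 = 4)

equal.df    # (5*4 + 5*16)/10 = 10
unequal.df  # (15*4 + 5*16)/20 = 7
```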
### Calculate pooled variance using function
Using data from our data frame df.rats2
# Calculate pooled variance
## extract N and SD for easy access
## Note: despite its name, dfs holds the sample sizes (N), not the degrees of freedom
dfs <- df.rats2["N", ]
SDs <- df.rats2["SD",]
dfs
## old young
## N 9 8
SDs
## old young
## SD 16.09464 10.36373
## Apply function for pooled variance
### Note: df = sample size - 1
var.pool <-var.pooled(df1 = dfs[1]-1,
df2 = dfs[2]-1,
SD1 = SDs[1],
SD2 = SDs[2])
## Add pooled variance to dataframe
df.rats2["var.pooled",] <- c(var.pool,NA)
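As a hand check on the value stored in `var.pool`, the numbers printed above can be put into the pooled variance formula directly. The degrees of freedom are 9 - 1 = 8 and 8 - 1 = 7, and the SDs are taken as printed (16.09464 and 10.36373), so the intermediate values are rounded:

\[var_p = {\frac{8*16.09464^2 + 7*10.36373^2}{8 + 7}} \approx {\frac{2072.3 + 751.8}{15}} \approx 188.3\]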
\[SE_d = \sqrt{{var_p}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
\[SE_d = \sqrt{{\sigma_p^2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
This function takes the pooled variance and the two sample sizes.
# Standard error of difference
## Note that this uses sample size, NOT degrees of freedom (df)
SE.diff <- function(var.pool, n1,n2){
sqrt(var.pool*(1/n1 + 1/n2))
}
#Apply function
## Note: uses sample size, NOT df
se.dif <- SE.diff(var.pool,
n1 = dfs[1],
n2 = dfs[2])
#Update dataframe
df.rats2["SE.diff",] <- c(se.dif, NA)
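The hand calculation can be cross-checked against R's built-in `t.test()`. This is a sketch, not part of the original workflow: it assumes that `var.equal = TRUE` (the pooled-variance t-test) is the right match for the formulas above, and `tt` and `se.check` are illustrative names. Because the t statistic is the difference between the means divided by its standard error, the standard error can be recovered from the test result.

```r
# Same data as above, without the NA padding
old   <- c(20.8, 2.8, 50.0, 33.3, 29.4, 38.9, 29.4, 52.6, 14.3)
young <- c(45.5, 55.0, 60.7, 61.5, 61.1, 65.5, 42.9, 37.5)

# var.equal = TRUE requests the pooled-variance (Student) t-test
tt <- t.test(young, old, var.equal = TRUE)

# t = (difference between means) / (SE of the difference),
# so the SE is the difference divided by t
se.check <- unname((mean(young) - mean(old)) / tt$statistic)

se.check      # should be close to the hand-calculated 6.667
tt$parameter  # degrees of freedom: (9 - 1) + (8 - 1) = 15
```

If `se.check` disagreed with `se.dif`, the first thing to look at would be whether sample size and degrees of freedom were swapped in one of the two functions.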
Table 30.2: Results of unpaired t-test. Motulsky Chapter 30, page 221.
pander(round(df.rats2,3))
&nbsp; | old | young |
---|---|---|
1 | 20.8 | 45.5 |
2 | 2.8 | 55 |
3 | 50 | 60.7 |
4 | 33.3 | 61.5 |
5 | 29.4 | 61.1 |
6 | 38.9 | 65.5 |
7 | 29.4 | 42.9 |
8 | 52.6 | 37.5 |
9 | 14.3 | NA |
means | 30.17 | 53.71 |
SD | 16.09 | 10.36 |
N | 9 | 8 |
Diff. means | 23.55 | NA |
var.pooled | 188.3 | NA |
SE.diff | 6.667 | NA |