This is a run-through of the Student’s t-distribution in R. It is part of my R training, which I publish on my site: https://dataz4s.com/
Let’s graph the density of a t-distribution using the dt() function. We will create a vector of quantiles, apply the dt() function to this vector and then plot the result:
# Creating the vector with the x-values for dt function
x_dt <- seq(-5, 5, by = 0.01)
# Applying the dt() function
y_dt <- dt(x_dt, df = 3)
# Plotting
# Plotting the density against the x-values (not against the vector index)
plot(x_dt, y_dt, type = "l", main = "t-distribution density function example", las = 1)
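To get a feel for how the degrees of freedom shape the curve, the sketch below overlays t densities for a few df values on the standard normal density. The df values (1, 3, 30) and the object names are my own choices for illustration, not part of the example above.

```r
# Grid of x-values (same range as in the density example)
x_grid <- seq(-5, 5, by = 0.01)

# Standard normal density as the reference curve (dashed)
plot(x_grid, dnorm(x_grid), type = "l", lty = 2, las = 1,
     xlab = "x", ylab = "Density",
     main = "t densities vs. standard normal")

# Overlay t densities for a few illustrative degrees of freedom
df_values <- c(1, 3, 30)
for (i in seq_along(df_values)) {
  lines(x_grid, dt(x_grid, df = df_values[i]), col = i + 1)
}

legend("topright",
       legend = c("normal", paste("t, df =", df_values)),
       lty = c(2, 1, 1, 1), col = c(1, 2, 3, 4))

# Largest gap on the grid between the t density with df = 30 and the normal
max_gap_df30 <- max(abs(dt(x_grid, df = 30) - dnorm(x_grid)))
max_gap_df30
```

The lower the df, the heavier the tails; as df grows, the t density moves towards the normal curve.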
Say we calculate a t-statistic of 1.9 with 15 degrees of freedom using the t-stat formula:
\[t = \frac{\bar{x}-\mu}{s/\sqrt{n}}\]
# t-stat=1.9, df=15
# one-sided p-value
# P(t >= 1.9)
pt(q = 1.9, df = 15, lower.tail = FALSE)
## [1] 0.03841551
The one-sided p-value is about 0.038, so H0 is rejected at any significance level above 0.038 (for example at 0.05), but not at a stricter level such as 0.01.
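As a sketch of where such a t-statistic could come from, the code below computes it from raw data with the formula above and cross-checks the result against the built-in t.test() function. The sample values and the hypothesized mean mu0 are made up for illustration.

```r
# Hypothetical sample of 16 observations (made-up numbers)
x <- c(5.1, 4.9, 5.6, 5.8, 5.2, 4.7, 5.9, 5.4,
       5.0, 5.7, 5.3, 4.8, 5.5, 6.0, 5.2, 5.6)
mu0 <- 5  # hypothesized mean under H0 (also an assumption)

# t = (x_bar - mu) / (s / sqrt(n)), with n - 1 degrees of freedom
n_obs <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n_obs))
deg_free <- n_obs - 1

# One-sided p-value, P(T >= t_stat)
p_one_sided <- pt(q = t_stat, df = deg_free, lower.tail = FALSE)

# Cross-check against the built-in one-sample t-test
check <- t.test(x, mu = mu0, alternative = "greater")
all.equal(unname(check$statistic), t_stat)
all.equal(check$p.value, p_one_sided)
```

Both comparisons should agree, since t.test() with alternative = "greater" reports the same upper-tail probability.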
Let’s run a two-tailed test with the example above: t-stat=1.9, df=15
# two-sided p-value
# By adding the two tails
pt(q = 1.9, df = 15, lower.tail = FALSE) + pt(q = -1.9, df = 15, lower.tail = TRUE)
## [1] 0.07683103
# By doubling one tail
pt(q = 1.9, df = 15, lower.tail = FALSE) * 2
## [1] 0.07683103
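Because the t-distribution is symmetric around zero, the doubling approach can be wrapped in a small helper that also handles negative t-statistics. The function name two_sided_p is hypothetical, it is not part of base R.

```r
# Hypothetical helper: two-sided p-value for a t-statistic.
# abs() makes sure we always double the tail beyond |t|.
two_sided_p <- function(t_stat, df) {
  2 * pt(q = abs(t_stat), df = df, lower.tail = FALSE)
}

two_sided_p(1.9, df = 15)
two_sided_p(-1.9, df = 15)  # same value: the sign of t does not matter
```

Doubling one tail without abs() would give a value above 1 for a negative t-statistic, which is why the helper takes the absolute value first.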
Let’s see how we can graph the t cumulative distribution function (CDF). We will start by creating a vector. Then we apply the pt() function and plot it:
# Creating the vector with the x-values for the pt() function
x_pt <- seq(-5, 5, by = 0.01)
# Applying the pt() function
y_pt <- pt(x_pt, df = 3)
# Plotting
# Plotting the CDF against the x-values (not against the vector index)
plot(x_pt, y_pt, type = "l", main = "t-distribution cumulative function example", las = 1)
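The CDF is the area under the density up to a given point. As a rough numerical check of that relationship, the sketch below integrates dt() with integrate() and compares the result with pt(). The evaluation point 1.9 and df = 15 are reused from the p-value example; the tolerance is my own choice.

```r
# Area under the density from -Inf up to 1.9, for df = 15
area <- integrate(dt, lower = -Inf, upper = 1.9, df = 15)$value

# The CDF evaluated at the same point
cdf_value <- pt(q = 1.9, df = 15)

# The two should agree up to numerical integration error
all.equal(area, cdf_value, tolerance = 1e-4)
```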
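Let’s find the t-value for a 95% confidence interval, with 2.5% in each tail and df=15. The qt() function returns the lower critical value; by symmetry, the upper one is the same number with a positive sign: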
# find t for 95% confidence interval
# value of t with 2.5% in the lower tail
qt(p = 0.025, df = 15, lower.tail = TRUE)
## [1] -2.13145
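As a sketch of how this critical value is used, the code below builds a 95% confidence interval for a mean. The sample summary (x_bar, s, n_obs) consists of made-up numbers, chosen only so that df = 15.

```r
# Upper critical value: 97.5% of the distribution lies below it
t_crit <- qt(p = 0.975, df = 15)

# By symmetry it is the negative of the lower-tail value
all.equal(t_crit, -qt(p = 0.025, df = 15))

# Hypothetical sample summary (made-up numbers), n = 16 so that df = 15
x_bar <- 10.2
s <- 1.8
n_obs <- 16

# 95% confidence interval: x_bar +/- t_crit * s / sqrt(n)
margin <- t_crit * s / sqrt(n_obs)
c(lower = x_bar - margin, upper = x_bar + margin)
```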
To graph the quantile function, we specify a vector of probabilities with the seq() function, apply the qt() function to it and plot the result:
# Specifying the x-values (probabilities between 0 and 1)
x_qt <- seq(0, 1, by = 0.01)
# Applying the qt() function
y_qt <- qt(x_qt, df = 3)
# Plotting
# Plotting the quantiles against the probabilities (not against the vector index)
plot(x_qt, y_qt, main = "t quantile function example", las = 1)
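The quantile function is the inverse of the CDF: feeding the output of qt() back into pt() should return the original probabilities. A minimal check, with a probability grid of my own choosing that stays away from 0 and 1 (where the quantiles are infinite):

```r
# Probabilities strictly between 0 and 1
p_grid <- seq(0.01, 0.99, by = 0.01)

# Probabilities -> quantiles -> back to probabilities
q_grid <- qt(p_grid, df = 3)
round_trip <- pt(q_grid, df = 3)

all.equal(round_trip, p_grid)
```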
Let’s use the rt() function to generate random values. First, we will set a seed for reproducibility and specify the sample size n that we wish to simulate:
# Setting seed for reproducibility
set.seed(91929)
# Setting sample size
n <- 10000
# Using rt() to draw n t-distributed values
y_rt <- rt(n, df = 3)
# Plotting a histogram of y_rt
hist(y_rt, breaks = 100, main = "Randomly drawn t density")
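To see how closely the simulated values follow the theoretical distribution, the sketch below redraws the histogram on the density scale, overlays the dt() curve and compares a few empirical quantiles with qt(). The object names, the plotting window of -10 to 10 and the chosen probabilities are my own assumptions.

```r
# Same seed, sample size and df as in the rt() example
set.seed(91929)
y_sim <- rt(10000, df = 3)

# Histogram on the density scale, zoomed in on the central part
hist(y_sim, breaks = 100, freq = FALSE, xlim = c(-10, 10),
     main = "Simulated t values vs. theoretical density")

# Overlay the theoretical density for df = 3
curve(dt(x, df = 3), add = TRUE, col = "red", lwd = 2)

# Compare empirical and theoretical quantiles side by side
probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
comparison <- data.frame(
  prob        = probs,
  empirical   = unname(quantile(y_sim, probs)),
  theoretical = qt(probs, df = 3)
)
comparison
```

With 10,000 draws the two quantile columns should be close, though not identical, since the empirical values carry sampling noise.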
Thanks to Joachim for your Statistics Globe site. This page is partly a run through of your t-distribution page: https://statisticsglobe.com/student-t-distribution-in-r-dt-pt-qt-rt