Imagine you want to know whether a new teaching method improves
student performance compared to the traditional one. You collect data
from two groups—one taught with the new method and the other with the
old one. But how can you tell if any difference in their scores is real
or just random chance? That’s where hypothesis testing comes in.
two-sample t-test
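# Simulate two groups of 100 scores with different means and spreads
set.seed(125)
group1 <- rnorm(100, mean = 24, sd = 3)
group2 <- rnorm(100, mean = 43, sd = 2.4)
# Welch two-sample t-test (R's default does not assume equal variances)
t.test(group1, group2)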
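# Boxplots of CO2 uptake for each treatment group (built-in CO2 dataset)
plot(uptake ~ Treatment, data = CO2)
# Same test with the formula interface: response ~ grouping factor
t.test(uptake ~ Treatment, data = CO2)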
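To make the output of `t.test()` less of a black box, the sketch below recomputes the Welch statistic, its approximate degrees of freedom, and the two-sided p-value by hand for the `CO2` data. The variable names (`a`, `b`, `va`, `vb`, `t_stat`, `df`, `p_value`) are illustrative choices, not part of the original example.

```r
# Split uptake by treatment (CO2 ships with base R's datasets package)
a <- CO2$uptake[CO2$Treatment == "nonchilled"]
b <- CO2$uptake[CO2$Treatment == "chilled"]

# Per-group variance of the sample mean
va <- var(a) / length(a)
vb <- var(b) / length(b)

# Welch t statistic: difference in means over its standard error
t_stat <- (mean(a) - mean(b)) / sqrt(va + vb)

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (va + vb)^2 / (va^2 / (length(a) - 1) + vb^2 / (length(b) - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p = p_value)
```

These three numbers should agree with the `t`, `df`, and `p-value` lines printed by `t.test(uptake ~ Treatment, data = CO2)`.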
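# Four observations for a small worked regression example
x <- c(1, 1, 2, 2)
y <- c(1, 2, 2, 3)
m <- cbind(x, y)
plot(y ~ x, xlim = c(0, 2), ylim = c(0, 4))
# Add the products and mean deviations used in the least-squares formulas
m <- cbind(m, xy = x * y, x2 = x * x, res_x = x - mean(x), res_y = y - mean(y))
# Fit y = beta_0 + beta_1 * x by ordinary least squares and print the summary
lm1 <- lm(y ~ x)
summary(lm1)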
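The coefficient estimates that `summary(lm1)` reports can also be obtained by hand from the mean deviations of `x` and `y`. The following is a minimal sketch using the standard simple-regression formulas; the names `sxy`, `sxx`, `b0`, and `b1` are introduced here for illustration only.

```r
x <- c(1, 1, 2, 2)
y <- c(1, 2, 2, 3)

# Sums of cross-products and squares of the deviations from the means
sxy <- sum((x - mean(x)) * (y - mean(y)))
sxx <- sum((x - mean(x))^2)

# Least-squares slope and intercept
b1 <- sxy / sxx
b0 <- mean(y) - b1 * mean(x)

c(intercept = b0, slope = b1)
```

For these four points both sums equal 1, so the slope is 1 and the intercept is 2 - 1 * 1.5 = 0.5, matching the fitted model.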
- Residuals: the differences between the observed values and
the values predicted by the model.
- Intercept: The estimate is 0.5, meaning that when \(x=0\), the model predicts \(y\) to be 0.5.
- Coefficient of \(x\): This is the estimated value of \(\beta_1\), the slope. The estimate is 1, meaning that for
every one-unit increase in \(x\),
\(y\) is expected to increase by 1
unit. This indicates a positive linear relationship between \(x\) and \(y\); because the intercept is not zero, the relationship is not strictly proportional.
- Std. Error: quantifies the uncertainty of the coefficient
estimates. A smaller value indicates a more precise estimate.
- t value: the test statistic for the null hypothesis that the
corresponding coefficient is equal to zero (i.e., no effect). It is the
estimate divided by its standard error.
- Pr(>|t|): the probability of observing a test statistic at least as
extreme as the t value, assuming the null hypothesis (coefficient = 0)
is true.
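- Residual standard error: an estimate of the standard deviation of the
residuals, i.e., the typical size of the prediction error. It is the square
root of the residual sum of squares divided by the degrees of freedom.
- DF: the residual degrees of freedom, i.e., the number of observations
minus the number of estimated parameters (here \(4 - 2 = 2\)).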
- R-squared: This is a measure of how well the model explains the
variability in the dependent variable \(y\). Here it is 0.5.
- Adjusted R-squared: This adjusts the R-squared value to account for
the number of predictors in the model. Here it is 0.25, indicating that after
accounting for the number of predictors, only about 25% of the variability in
\(y\) is explained by the model.
- F-statistic: This tests the null hypothesis that all regression
coefficients other than the intercept are zero (i.e., no relationship between \(x\) and \(y\)).
- p-value: The p-value for the F-test is 0.2929, indicating that the
overall model is not statistically significant at typical significance
levels (e.g., 0.05).
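The quantities in the list above can be recomputed from the residuals, which may help connect each definition to the number R prints. This is a sketch under the usual simple-regression formulas; all helper names (`fit`, `rss`, `tss`, `rse`, `se_b0`, `se_b1`, and so on) are illustrative and not part of the original code.

```r
x <- c(1, 1, 2, 2)
y <- c(1, 2, 2, 3)
fit <- lm(y ~ x)

n <- length(y)
p <- length(coef(fit))             # estimated parameters: intercept and slope
df_res <- n - p                    # residual degrees of freedom

rss <- sum(residuals(fit)^2)       # residual sum of squares
tss <- sum((y - mean(y))^2)        # total sum of squares
rse <- sqrt(rss / df_res)          # residual standard error

# Standard errors of the intercept and slope
sxx <- sum((x - mean(x))^2)
se_b0 <- rse * sqrt(1 / n + mean(x)^2 / sxx)
se_b1 <- rse / sqrt(sxx)

# t values and two-sided p-values for each coefficient
t_vals <- coef(fit) / c(se_b0, se_b1)
p_vals <- 2 * pt(-abs(t_vals), df_res)

# Goodness-of-fit measures and the overall F-test
r2 <- 1 - rss / tss
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_res
f_stat <- ((tss - rss) / (p - 1)) / (rss / df_res)
f_p <- pf(f_stat, p - 1, df_res, lower.tail = FALSE)

round(c(rse = rse, r2 = r2, adj_r2 = adj_r2, f = f_stat, f_p = f_p), 4)
```

Comparing these values with `summary(fit)` is a quick way to check the interpretation of each line of the output.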