Learning outcomes
By the end of this tutorial, you should be able to:
- Select the correct statistical test for your dataset
- Run simple linear models in R
- Interpret model outputs and diagnostic plots
- Calculate F-ratios and degrees of freedom (DF)
Instructions
In your groups, answer the following questions based on your assigned
data set (see sections below). Discuss your findings in the group and
prepare a short summary for class discussion.
Groupings
Group 1 (Motor Trend Car Road Tests)
Load the ‘mtcars’ data set using this code:
{data(mtcars)}
- Is there a linear relationship between horsepower (hp) and mpg?
- Which test would you use to compare mpg across the three cylinder
groups (cyl = 4, 6, 8)?
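As a possible starting point for Group 1, the sketch below fits a simple linear model of mpg on hp, draws the standard diagnostic plots, and runs a one-way ANOVA across cylinder groups. The object names `fit_hp` and `fit_cyl` are illustrative choices, and converting `cyl` to a factor is an assumption about how you want to treat it (it is stored as a number in `mtcars`).

```r
# Minimal sketch for Group 1 (object names are illustrative)
data(mtcars)

# Simple linear regression of mpg on horsepower
fit_hp <- lm(mpg ~ hp, data = mtcars)
summary(fit_hp)   # check the hp slope, its p-value, and R-squared

# Standard diagnostic plots: residuals vs fitted, Q-Q,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(fit_hp)
par(mfrow = c(1, 1))

# One-way ANOVA of mpg across cylinder groups;
# cyl is numeric in mtcars, so treat it as a factor here
fit_cyl <- aov(mpg ~ factor(cyl), data = mtcars)
summary(fit_cyl)  # F-ratio, degrees of freedom, and p-value
```

If `cyl` is left numeric, `aov()` treats it as a single continuous predictor with 1 degree of freedom, which answers a different question from comparing group means.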
Group 2 (ToothGrowth)
Load the ‘ToothGrowth’ data set using this code:
{data(ToothGrowth)}
- Does supplement type (OJ vs VC) affect tooth length?
- Does increasing dose increase tooth length?
- Which test would you use to compare tooth length between two
supplement types?
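One way Group 2 might approach these questions is sketched below: a two-sample t-test for supplement type and a simple linear regression for dose. The names `t_supp` and `fit_dose` are illustrative, and treating `dose` as a continuous predictor is an assumption; it could also be analysed as a factor with three levels.

```r
# Minimal sketch for Group 2 (object names are illustrative)
data(ToothGrowth)

# Two-sample t-test of tooth length by supplement type (OJ vs VC);
# t.test() uses the Welch (unequal variance) version by default
t_supp <- t.test(len ~ supp, data = ToothGrowth)
t_supp

# Simple linear regression of tooth length on dose,
# treating dose as a continuous predictor (an assumption)
fit_dose <- lm(len ~ dose, data = ToothGrowth)
summary(fit_dose)  # a positive slope suggests length increases with dose
```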
Group 3 (PlantGrowth)
Load the ‘PlantGrowth’ data set using this code:
{data(PlantGrowth)}
- Which test is appropriate for comparing three groups?
- Which groups differ significantly from one another?
- Interpret the F-ratio and p-value.
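For Group 3, a one-way ANOVA followed by a post-hoc comparison is one reasonable route. The sketch below uses Tukey's HSD for the pairwise comparisons, which is a choice made here for illustration and not the only valid option; `fit_pg` is an illustrative name.

```r
# Minimal sketch for Group 3 (object name is illustrative)
data(PlantGrowth)

# One-way ANOVA of plant weight across the three groups
fit_pg <- aov(weight ~ group, data = PlantGrowth)
summary(fit_pg)    # F-ratio, degrees of freedom, and p-value

# Post-hoc pairwise comparisons with Tukey's HSD
# (one possible method; others exist)
TukeyHSD(fit_pg)   # adjusted p-values for each pair of groups
```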
Group 4 (USArrests)
Load the ‘USArrests’ data set using this code:
{data(USArrests)}
- Which test is appropriate for checking correlation?
- Is murder rate correlated with assault rate?
- How would you predict assault rate from urban population?
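Group 4 could start from something like the sketch below, which tests the Murder and Assault correlation and then fits a regression of Assault on UrbanPop. The names `cor_ma` and `fit_up` are illustrative, and the value `UrbanPop = 70` is a made-up input used only to show how `predict()` is called.

```r
# Minimal sketch for Group 4 (object names are illustrative)
data(USArrests)

# Pearson correlation test between murder and assault rates
cor_ma <- cor.test(USArrests$Murder, USArrests$Assault)
cor_ma             # correlation estimate, confidence interval, p-value

# Simple linear regression of assault rate on urban population
fit_up <- lm(Assault ~ UrbanPop, data = USArrests)
summary(fit_up)

# Predicted assault rate for a hypothetical state with 70% urban population
predict(fit_up, newdata = data.frame(UrbanPop = 70))
```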
Group 5 (Iris Data)
Load the ‘iris’ data set using this code:
{data(iris)}
- Which test compares sepal length across 3 groups?
- Can petal width predict petal length?
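A possible starting point for Group 5 is sketched below: a one-way ANOVA of sepal length across species and a simple linear regression of petal length on petal width. The names `fit_sl` and `fit_pw` are illustrative.

```r
# Minimal sketch for Group 5 (object names are illustrative)
data(iris)

# One-way ANOVA of sepal length across the three species
fit_sl <- aov(Sepal.Length ~ Species, data = iris)
summary(fit_sl)    # F-ratio, degrees of freedom, and p-value

# Simple linear regression: does petal width predict petal length?
fit_pw <- lm(Petal.Length ~ Petal.Width, data = iris)
summary(fit_pw)    # slope, R-squared, and p-value
```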
Wrap-up discussion
- How do you decide whether to use a t-test, ANOVA, correlation, or
regression?
- What do the F-ratio and degrees of freedom represent?
- Why are diagnostic plots important when interpreting models?
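To make the F-ratio and degrees of freedom concrete for the discussion, the sketch below computes them by hand for a one-way ANOVA, as the between-group mean square divided by the within-group mean square. It uses PlantGrowth purely as an example, and all variable names are illustrative.

```r
# Minimal sketch: one-way ANOVA F-ratio computed by hand
data(PlantGrowth)
y <- PlantGrowth$weight
g <- PlantGrowth$group

k <- nlevels(g)    # number of groups
n <- length(y)     # total number of observations

group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-group sum of squares: group means around the grand mean
ss_between <- sum(group_sizes * (group_means - mean(y))^2)

# Within-group sum of squares: observations around their own group mean
# (ave() returns the group mean for each observation)
ss_within <- sum((y - ave(y, g))^2)

df_between <- k - 1
df_within  <- n - k

# F-ratio: between-group mean square over within-group mean square
f_ratio <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_ratio, df_between, df_within, lower.tail = FALSE)

c(F = f_ratio, df1 = df_between, df2 = df_within, p = p_value)
```

The result should agree with `summary(aov(weight ~ group, data = PlantGrowth))`, which is a useful check that the hand calculation is set up correctly.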
Useful resources
- Linear Regression Assumptions and Diagnostics in R
- Understanding Analysis of Variance (ANOVA) and the F-test
- Degrees of freedom in R