Learning outcomes

By the end of this tutorial, you should be able to:

  1. Select the correct statistical test for your data set

  2. Run simple linear models in R

  3. Interpret model outputs and diagnostic plots

  4. Calculate F-ratios and degrees of freedom (DF)
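
As a rough illustration of outcome 4, the sketch below computes a one-way ANOVA F-ratio and its degrees of freedom by hand, then compares the result with R's own ANOVA table. It uses the built-in chickwts data set, which is not one of the assigned data sets and is chosen here only as an example; the variable names (ss_between, ss_within, and so on) are illustrative.

```r
# One-way ANOVA F-ratio computed by hand.
# chickwts is used purely for illustration; it is not an assigned data set.
data(chickwts)

y <- chickwts$weight   # response
g <- chickwts$feed     # grouping factor

k <- nlevels(g)        # number of groups
n <- length(y)         # total number of observations

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((y - group_means[as.character(g)])^2)

# Degrees of freedom
df_between <- k - 1
df_within  <- n - k

# F-ratio = mean square between / mean square within
f_ratio <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_ratio, df_between, df_within, lower.tail = FALSE)

# Compare with R's built-in ANOVA table
anova(lm(weight ~ feed, data = chickwts))
```

The hand-computed f_ratio, df_between and df_within should match the "F value" and "Df" columns of the table printed by anova().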

Instructions

In your groups, answer the following questions based on your assigned data set (see the sections below). Discuss your findings within the group and prepare a short summary for class discussion.

Groupings

Group 1 (Motor Trend Car Road Tests)

Load the ‘mtcars’ data set using this code:

{data(mtcars)}

  1. Is there a linear relationship between horsepower (hp) and mpg?
  2. Which test would you use to compare mpg across the three cylinder groups (cyl = 4, 6, 8)?
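
One possible starting point for Group 1 is sketched below. It plots mpg against hp, fits a simple linear model, and converts cyl to a factor so that it can be treated as a grouping variable. The object names fit_hp and cyl_f are illustrative, and the choice of test for question 2 is left to the group.

```r
data(mtcars)

# First look at the hp-mpg relationship: scatterplot with a fitted line
plot(mpg ~ hp, data = mtcars)
fit_hp <- lm(mpg ~ hp, data = mtcars)
abline(fit_hp)

# Slope, standard error, R-squared and overall F-test
summary(fit_hp)

# cyl is stored as a number; convert it to a factor before comparing groups
mtcars$cyl_f <- factor(mtcars$cyl)
table(mtcars$cyl_f)   # number of cars in each cylinder group
```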

Group 2 (ToothGrowth)

Load the ‘ToothGrowth’ data set using this code:

{data(ToothGrowth)}

  1. Does supplement type (OJ vs VC) affect tooth length?
  2. Does increasing dose increase tooth length?
  3. Which test would you use to compare tooth length between two supplement types?
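
A minimal sketch for Group 2, assuming a two-sample comparison is judged appropriate for supplement type and that dose is treated as a numeric predictor. The object names tt and fit_dose are illustrative.

```r
data(ToothGrowth)
str(ToothGrowth)   # len (numeric), supp (factor: OJ, VC), dose (numeric)

# Visual comparison of tooth length by supplement type
boxplot(len ~ supp, data = ToothGrowth)

# Two-sample t-test; t.test() uses the Welch (unequal variance) version by default
tt <- t.test(len ~ supp, data = ToothGrowth)
tt

# Simple linear model of tooth length on dose
fit_dose <- lm(len ~ dose, data = ToothGrowth)
summary(fit_dose)
```

Whether the Welch test or the equal-variance version (var.equal = TRUE) is the better choice depends on the spread in each group, which the boxplot can help you judge.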

Group 3 (PlantGrowth)

Load the ‘PlantGrowth’ data set using this code:

{data(PlantGrowth)}

  1. Which test is appropriate for comparing three groups?
  2. Which groups differ significantly from one another?
  3. Interpret the F-ratio and p-value.
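
The questions above could be approached along the following lines, assuming a one-way ANOVA followed by Tukey's HSD as the post-hoc comparison (other post-hoc methods exist). The object names fit_aov and tukey are illustrative.

```r
data(PlantGrowth)

# One-way ANOVA of plant weight across the treatment groups
fit_aov <- aov(weight ~ group, data = PlantGrowth)
summary(fit_aov)   # Df, Sum Sq, Mean Sq, F value, Pr(>F)

# Pairwise comparisons between groups with Tukey's HSD
tukey <- TukeyHSD(fit_aov)
tukey              # difference, confidence interval and adjusted p-value per pair
```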

Group 4 (USArrests)

Load the ‘USArrests’ data set using this code:

{data(USArrests)}

  1. Which test is appropriate for checking correlation?
  2. Is murder rate correlated with assault rate?
  3. How would you predict assault rate from urban population (UrbanPop)?
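
A sketch for Group 4, assuming Pearson's correlation is a reasonable choice and that a simple linear model is used for prediction. The UrbanPop value of 70 is a made-up example, and the object names ct, fit_urban and pred are illustrative.

```r
data(USArrests)

# Pearson correlation test between murder and assault rates
ct <- cor.test(USArrests$Murder, USArrests$Assault)
ct

# Simple linear model: assault rate as a function of urban population
fit_urban <- lm(Assault ~ UrbanPop, data = USArrests)
summary(fit_urban)

# Predicted assault rate for a hypothetical state with 70% urban population
pred <- predict(fit_urban, newdata = data.frame(UrbanPop = 70))
pred
```

If the scatterplot suggests a non-linear or outlier-heavy relationship, cor.test() also accepts method = "spearman" as a rank-based alternative.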

Group 5 (Iris Data)

Load the ‘iris’ data set using this code:

{data(iris)}

  1. Which test compares sepal length across 3 groups?
  2. Can petal width predict petal length?
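
For Group 5, the sketch below shows one way to summarise sepal length by species and to pull the F-ratio and degrees of freedom out of a fitted regression. The object names fit_petal and f_stat are illustrative, and the choice of test for question 1 is left to the group.

```r
data(iris)

# Mean sepal length for each species
aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# Simple linear model: petal length as a function of petal width
fit_petal <- lm(Petal.Length ~ Petal.Width, data = iris)

# The F-ratio and its degrees of freedom are stored in the model summary
f_stat <- summary(fit_petal)$fstatistic
f_stat   # named vector: value, numdf, dendf

# Proportion of variance in petal length explained by petal width
summary(fit_petal)$r.squared
```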

Wrap-up discussion

  1. How do you decide whether to use a t-test, ANOVA, correlation, or regression?
  2. What do the F-ratio and degrees of freedom represent?
  3. Why are diagnostic plots important when interpreting models?
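
To support the discussion of question 3, the following sketch produces the standard diagnostic plots for a fitted linear model. It uses the built-in cars data set purely as an example; the same calls apply to any lm object from the group exercises.

```r
# cars is used only for illustration; substitute your own fitted model
data(cars)
fit <- lm(dist ~ speed, data = cars)

# Show the four default diagnostic plots in a 2 x 2 grid:
# residuals vs fitted, normal Q-Q, scale-location, residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)   # restore the previous plotting layout

# Optional numerical check of residual normality
shapiro.test(residuals(fit))
```

Curvature or a funnel shape in the residuals vs fitted plot, or strong departures from the line in the Q-Q plot, suggest that the model assumptions may not hold and that the reported p-values should be read with caution.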