Exploring Linear Models for the Motor Trend Car Data Set

Alec Dara-Abrams
August 22, 2015

1974 Motor Trend Car Data Set

  • Data extracted from 1974 Motor Trend US magazine
  • 32 observations
  • 11 variables
  • mpg (Fuel economy in miles per US gallon)
  • cyl (Number of cylinders)
  • disp (Displacement in cubic inches)
  • hp (Gross horsepower)
  • drat (Rear axle ratio)
  • wt (Weight in 1000 lb)
  • qsec (Quarter-mile time in seconds)
  • vs (Engine shape: 0 = V-shaped, 1 = straight)
  • am (Transmission: 0 = automatic, 1 = manual)
  • gear (Number of forward gears)
  • carb (Number of carburetors)
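
The data set ships with base R as mtcars (the same name used in the model call later in these slides), so the counts and variable names listed above can be checked directly. A minimal sketch:

```r
# mtcars ships with base R in the datasets package
data(mtcars)

# Number of observations (rows) and variables (columns)
dims <- dim(mtcars)
print(dims)

# Variable names, in column order
print(names(mtcars))

# Compact overview of each variable's type and first few values
str(mtcars)
```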

Supporting Exploratory Data Analysis

  • Shiny app developed to support exploratory data analysis
  • User selects outcome variable
  • User selects any number of predictor variables
  • Scatter plots of single predictor on x-axis vs. outcome on y-axis
  • Compare models with different predictors using Adjusted R-squared values
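
The comparison step can also be sketched outside Shiny. The helper adj_r2 and the candidate predictor sets below are hypothetical and are not taken from the app; the sketch assumes the outcome and predictors are given as column names of mtcars and builds the model formula from them with reformulate().

```r
# Hypothetical helper (not from the app): adjusted R-squared for a linear
# model of `outcome` on any number of `predictors` from a data frame.
adj_r2 <- function(outcome, predictors, data = mtcars) {
  # reformulate() builds a formula such as mpg ~ wt + hp from strings
  model_formula <- reformulate(predictors, response = outcome)
  summary(lm(model_formula, data = data))$adj.r.squared
}

# Illustrative candidate predictor sets for the same outcome
candidates <- list(
  wt         = "wt",
  wt_hp      = c("wt", "hp"),
  wt_qsec_am = c("wt", "qsec", "am")
)

# One adjusted R-squared value per candidate model
scores <- vapply(candidates,
                 function(p) adj_r2("mpg", p),
                 numeric(1))

# Highest adjusted R-squared first
print(sort(scores, decreasing = TRUE))
```

Building the formula from strings mirrors what an app has to do when the user picks variables from a menu, since the selections arrive as character values.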

Weight vs. MPG Example

  • Illustrate predicting outcome with selected predictor
  • Outcome variable selected – miles per gallon (mpg)
  • Predictor variable selected – weight (wt)
  • The R expression below calculates this model's adjusted R-squared value
summary(lm(mpg ~ wt, data = mtcars))$adj.r.squared
[1] 0.7445939
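
The same model can be drawn as a scatter plot with the predictor on the x-axis and the outcome on the y-axis. This is a sketch using base graphics; the axis labels, title, and line colour are illustrative choices, not the app's actual output.

```r
# Fit the single-predictor model from the example
fit <- lm(mpg ~ wt, data = mtcars)

# Scatter plot: predictor on the x-axis, outcome on the y-axis
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lb)",
     ylab = "Miles per US gallon",
     main = "mpg vs. wt")

# Overlay the fitted regression line
abline(fit, col = "blue", lwd = 2)

# Intercept and slope of the fitted line
print(coef(fit))
```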

Explaining Variability in Outcome

  • Calculate Adjusted R-squared value
    • Indicates how much of the variability in the outcome the selected predictor variables explain
  • Use Adjusted R-squared value rather than R-squared value
    • Adjusted R-squared value takes into account number of estimated parameters
    • R-squared value never decreases when predictors are added, so it tends to overestimate amount of variability model explains
  • Adjusted R-squared value expressed as decimal value, normally between 0 and 1
    • 0 - predictor(s) explain none of variability in outcome
    • 1 - predictor(s) explain all of variability in outcome
    • Can fall below 0 when the model fits very poorly
    • Multiply by 100 to express as percentage of variability that can be explained
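
The adjustment can be reproduced by hand to see how the number of estimated parameters enters. This sketch assumes the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), with n observations and p predictors, and checks it against the value reported by summary() for the mpg ~ wt model.

```r
fit <- lm(mpg ~ wt, data = mtcars)
fit_summary <- summary(fit)

n <- nrow(mtcars)            # number of observations
p <- length(coef(fit)) - 1   # number of predictors, excluding the intercept

r2 <- fit_summary$r.squared

# Standard adjustment: penalise R-squared for the number of predictors
adj_manual <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Should agree with the value reported by summary()
print(all.equal(adj_manual, fit_summary$adj.r.squared))

# Express as a percentage of variability explained
print(round(100 * adj_manual, 1))
```

Because (n - 1)/(n - p - 1) is greater than 1 whenever p is at least 1, the adjusted value is always below the plain R-squared value for a model that does not fit perfectly.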