Shiny app for exploration of mtcars data set

Christopher Hay
17/12/2017

Introduction

This presentation is part of Assignment 3 for the “Developing Data Products” course on Coursera. The Shiny app I developed for this assignment is designed to assist with exploratory data analysis of the mtcars data set.

mtcars data set, and purpose of app

The “mtcars” data set (Motor Trend Car Road Tests) is supplied with base R and is often used to build example models. According to the R help file, the data were extracted from the 1974 Motor Trend US magazine and comprise fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).
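
For orientation, the data set can be inspected directly from an R session. This is a minimal sketch using only base R; it is not taken from the app's source.

```r
# mtcars ships with the datasets package, which is attached by default
data(mtcars)

dim(mtcars)      # number of rows (cars) and columns (variables)
names(mtcars)    # column names, including "mpg"
head(mtcars, 3)  # first three cars
```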

The Shiny app I have created is designed to assist with basic exploration of the mtcars data set, specifically how each variable relates to miles per gallon (mpg). The app lets users select a variable and generates a plot of that variable's relationship with miles per gallon. It also outputs the coefficients of a linear model of that relationship.
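
The overall shape of such an app can be sketched as follows. This is an illustrative skeleton, not the app's actual source: the input and output IDs ("var", "mpgPlot", "coefTable") and the title are hypothetical, and for brevity it draws a plain scatter plot for every variable.

```r
library(shiny)

# Hypothetical UI: a drop-down of predictors, a plot, and a coefficient table
ui <- fluidPage(
  titlePanel("mtcars explorer"),
  selectInput("var", "Variable:", choices = setdiff(names(mtcars), "mpg")),
  plotOutput("mpgPlot"),
  tableOutput("coefTable")
)

server <- function(input, output) {
  # Plot the chosen variable against miles per gallon
  output$mpgPlot <- renderPlot({
    plot(mtcars[[input$var]], mtcars$mpg,
         xlab = input$var, ylab = "Miles per gallon")
  })
  # Fit mpg ~ <chosen variable> and show the coefficient table
  output$coefTable <- renderTable({
    fit <- lm(reformulate(input$var, response = "mpg"), data = mtcars)
    coef(summary(fit))
  }, rownames = TRUE)
}

app <- shinyApp(ui = ui, server = server)
# To launch interactively: runApp(app)
```

Building the app object without launching it keeps the sketch safe to source in a script; calling `runApp(app)` opens it in a browser.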

How does the app work?

The app first pre-processes the data, converting some of the mtcars variables from numeric to factor (e.g. number of cylinders, or transmission type). It then presents a drop-down menu of the variables and creates either a line plot of the chosen variable's relationship with miles per gallon (if the variable is numeric) or a box plot showing the distribution of miles per gallon by that variable (if the variable is a factor).
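
The pre-processing and plot selection described above might look roughly like this in base R. The helper name `plot_vs_mpg` is hypothetical, the exact set of columns converted is an assumption (the text names only cylinders and transmission), and the numeric case is interpreted here as a scatter plot with a fitted line.

```r
# Copy the data and convert categorical columns to factors.
# Assumption: only cyl and am are converted in this sketch.
mtcars1 <- mtcars
mtcars1$cyl <- factor(mtcars1$cyl)
mtcars1$am  <- factor(mtcars1$am, labels = c("automatic", "manual"))

# Hypothetical helper: box plot for factors,
# scatter plot with a line of best fit otherwise.
plot_vs_mpg <- function(var, data = mtcars1) {
  x <- data[[var]]
  if (is.factor(x)) {
    boxplot(data$mpg ~ x, xlab = var, ylab = "Miles per gallon")
  } else {
    plot(x, data$mpg, xlab = var, ylab = "Miles per gallon")
    abline(lm(data$mpg ~ x))  # line of best fit
  }
  invisible(is.factor(x))     # report which branch was taken
}

plot_vs_mpg("cyl")  # factor: box plot
plot_vs_mpg("wt")   # numeric: scatter plot with fitted line
```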

The app then uses the “lm” function to fit a linear model of miles per gallon on the chosen variable, and outputs the estimated coefficients and their respective p-values below the graph. For a numeric variable the outputs are the intercept and slope of the line of best fit; for a categorical variable the output is the average MPG for each level of that variable. The p-value of each coefficient (rounded to 4 significant digits) is also shown.
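
A minimal sketch of the two model fits is shown here. It assumes the factor case is fitted without an intercept, which is one way to make each coefficient equal the mean MPG of a level; the app's actual formula is not shown in this presentation.

```r
mtcars1 <- mtcars
mtcars1$cyl <- factor(mtcars1$cyl)

# Numeric predictor: the coefficients are the intercept and slope
fit_num <- lm(mpg ~ wt, data = mtcars1)
coef(summary(fit_num))[, c("Estimate", "Pr(>|t|)")]

# Factor predictor: dropping the intercept (- 1) makes each coefficient
# the mean mpg of one level of the factor
fit_fac <- lm(mpg ~ cyl - 1, data = mtcars1)
res <- coef(summary(fit_fac))[, c("Estimate", "Pr(>|t|)")]
signif(res, 4)  # keep 4 significant digits
```

With an intercept, the same factor model would instead report the first level's mean plus differences from it, so removing the intercept is the simpler route to a table of per-level averages.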

Example

The example below shows the output of the R script for the number of cylinders.

[Figure: box plot of miles per gallon by number of cylinders]

             Factor Average Pr(>|t|)
mtcars1$cyl4      4   26.66        0
mtcars1$cyl6      6   19.74        0
mtcars1$cyl8      8    15.1        0