Introduction

This presentation is the reproducible pitch for the week 4 project assignment of the Developing Data Products course from Coursera (https://www.coursera.org/learn/data-products).
It describes the associated Shiny project, which predicts fuel consumption in miles per gallon (MPG) from a number of variables in a dataset of cars.
The presentation was generated using RStudio (https://www.rstudio.com) and the Slidify (http://slidify.org) framework.

Application

The project assignment was to develop a web application. The application is named Miles Per Gallon Prediction, and an instance is running at https://egruhn.shinyapps.io/Shiny_Project/.
It serves as an example of building a web application with R and the Shiny framework.
In the application, a prediction model is generated from a dataset of car parameters (horsepower, weight, number of gears, etc.). The user can freely adjust the values in the UI to simulate a hypothetical car and predict its fuel consumption in miles per gallon.
The source code for the application and this presentation can be found at https://github.com/egruhn/Developing-Data-Products-project. The application consists of 3 files: ui.R (UI), server.R (backend) and rfModel.R (Random Forest predictor).
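
The flow described above (UI values in, predicted MPG out) can be sketched in a few lines of Shiny. This is only an illustrative sketch, not the code from ui.R or server.R: the input names, slider ranges, the helper predictMpg and the object standInModel are all hypothetical, and a simple linear model on two variables stands in for the Random Forest so that the example stays short.

```r
library(shiny)

# Stand-in model for illustration only: a linear model on two predictors.
# The actual application uses a Random Forest (see rfModel.R).
standInModel <- lm(mpg ~ hp + wt, data = mtcars)

# Hypothetical helper, kept outside the server so it can be called
# without a running app.
predictMpg <- function(hp, wt) {
  newCar <- data.frame(hp = hp, wt = wt)
  unname(predict(standInModel, newdata = newCar))
}

# UI: two sliders describing a hypothetical car, plus a text output.
ui <- fluidPage(
  titlePanel("Miles Per Gallon Prediction (sketch)"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("hp", "Gross horsepower", min = 50, max = 350, value = 110),
      sliderInput("wt", "Weight (1000 lbs)", min = 1.5, max = 5.5,
                  value = 2.6, step = 0.1)
    ),
    mainPanel(textOutput("prediction"))
  )
)

# Server: recompute the prediction whenever a slider changes.
server <- function(input, output) {
  output$prediction <- renderText({
    paste("Predicted MPG:", round(predictMpg(input$hp, input$wt), 2))
  })
}

app <- shinyApp(ui = ui, server = server)
# To launch interactively: runApp(app)
```

Because renderText is reactive, the prediction is refreshed automatically each time the user moves a slider; no explicit "submit" step is needed in this sketch.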

Dataset

The dataset used by the application is Motor Trend Car Road Tests (hereafter ‘mtcars’). The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

The structure of the dataset is shown below:

str(mtcars)

## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
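
As a quick complement to the structure listing, the following sketch ranks the 10 design and performance variables by how strongly each one correlates with mpg. It uses only base R; the variable name mpgCorrelations is chosen here for illustration and is not part of the application.

```r
# Correlation of every other variable with mpg (mpg is the first column).
mpgCorrelations <- cor(mtcars)["mpg", -1]

# Sort by absolute strength, strongest first.
mpgCorrelations <- mpgCorrelations[order(abs(mpgCorrelations), decreasing = TRUE)]

print(round(mpgCorrelations, 2))
```

Correlation only captures linear, one-variable-at-a-time relationships, so it is a rough first look rather than a substitute for the model described in the next section.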

Prediction

A Random Forest prediction model is generated and trained using the ‘mtcars’ dataset. The goal of this model is to predict the fuel consumption (mpg variable) based on the rest of the variables:

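library(caret)  # provides train() and trainControl(); method = "rf" also needs the randomForest package

# 10-fold cross-validation is used to estimate model performance
customTrainControl <- trainControl(method = "cv", number = 10)

# Trains a Random Forest that predicts mpg from all other mtcars variables
carsRFModel <- function() {
  train(
    mpg ~ .,
    data = mtcars,
    method = "rf",
    trControl = customTrainControl
  )
}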

carsRFModel()

## Random Forest 
## 
## 32 samples
## 10 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 29, 28, 29, 30, 28, 28, ... 
## Resampling results across tuning parameters:
## 
##   mtry  RMSE      Rsquared   MAE     
##    2    2.388950  0.9100982  2.075167
##    6    2.338089  0.9427356  2.032172
##   10    2.317244  0.9509582  2.018181
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was mtry = 10.
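
Once trained, a model of this kind can be used to predict the consumption of a hypothetical car, which is what the application does with the values chosen in the UI. The snippet below is a minimal sketch under some assumptions: the seed, the object names model and hypotheticalCar, and the chosen parameter values are all illustrative and do not come from the application source.

```r
library(caret)

set.seed(42)  # assumption: fixed seed so repeated runs give the same model

# Same model specification as above: Random Forest with 10-fold CV.
model <- train(
  mpg ~ .,
  data = mtcars,
  method = "rf",
  trControl = trainControl(method = "cv", number = 10)
)

# Hypothetical car: start from the first row of mtcars (without mpg)
# and change a couple of parameters.
hypotheticalCar <- mtcars[1, names(mtcars) != "mpg"]
hypotheticalCar$hp <- 150
hypotheticalCar$wt <- 3.0

predictedMpg <- predict(model, newdata = hypotheticalCar)
print(predictedMpg)
```

Since a Random Forest averages the training responses in its leaves, the predicted value always stays within the range of mpg values observed in ‘mtcars’, however extreme the hypothetical parameters are.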

Developing Data Products project - Shiny Application and Reproducible Pitch