7/6/2020

Introduction

Application

  • This application is an example of building a web application with R and the Shiny framework.

  • The application trains a prediction model on a dataset of car parameters (horsepower, weight, number of gears, etc.). The user can then freely adjust the values in the UI to describe a hypothetical car and predict its fuel consumption in miles per gallon (MPG).
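
The interaction described above can be sketched as a small Shiny app. This is a minimal illustration, not the application's actual source: the helper `predictMpg`, the three chosen inputs, the slider bounds, and the use of a plain linear model (`lm`) as a stand-in for the real prediction model are all assumptions made so that the sketch starts instantly.

```r
library(shiny)

# Stand-in model (assumption): a linear regression on three predictors.
# The real application uses a Random Forest trained on all variables.
standInModel <- lm(mpg ~ hp + wt + gear, data = mtcars)

# Hypothetical helper: build a one-row data frame from the UI values
# and return the predicted MPG as a plain number.
predictMpg <- function(model, hp, wt, gear) {
  hypotheticalCar <- data.frame(hp = hp, wt = wt, gear = gear)
  unname(predict(model, newdata = hypotheticalCar))
}

# One slider per car parameter; bounds roughly cover the mtcars ranges.
ui <- fluidPage(
  titlePanel("MPG prediction (sketch)"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("hp", "Gross horsepower", min = 50, max = 340, value = 120),
      sliderInput("wt", "Weight (1000 lbs)", min = 1.5, max = 5.5,
                  value = 3, step = 0.1),
      sliderInput("gear", "Forward gears", min = 3, max = 5,
                  value = 4, step = 1)
    ),
    mainPanel(textOutput("prediction"))
  )
)

# The prediction is recomputed whenever a slider value changes.
server <- function(input, output) {
  output$prediction <- renderText({
    mpg <- predictMpg(standInModel, input$hp, input$wt, input$gear)
    sprintf("Predicted fuel consumption: %.1f MPG", mpg)
  })
}

app <- shinyApp(ui = ui, server = server)
# Start the app from an interactive session with: runApp(app)
```

Keeping the prediction logic in a separate function makes it possible to check it without launching the UI, for example with `predictMpg(standInModel, 110, 2.62, 4)`.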

Dataset

The dataset used by the application is the Motor Trend Car Road Tests dataset (from now on ‘mtcars’). The data was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).

The structure of the dataset is shown below:

str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
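
Since the user enters values for a hypothetical car, it can help to know the range each variable covers in the data. The following is a small sketch using only base R; the names `variableRanges` and `predictorNames` are illustrative and not part of the application.

```r
# Minimum and maximum of every column in the built-in mtcars dataset.
variableRanges <- sapply(mtcars, range)
rownames(variableRanges) <- c("min", "max")

# Transpose so that each variable is a row, which is easier to read.
t(variableRanges)

# Names of the predictors that remain once mpg is set aside as the target.
predictorNames <- setdiff(names(mtcars), "mpg")
length(predictorNames)
```

Ranges like these are one reasonable way to choose bounds for UI inputs, so that a simulated car stays close to the cars the model has seen.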

Prediction

A Random Forest prediction model is trained on the ‘mtcars’ dataset with the caret package, using 10-fold cross-validation. The goal of the model is to predict fuel consumption (the mpg variable) from the remaining 10 variables:

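library(caret)  # provides trainControl() and train(); method = "rf" also needs the randomForest package

# Evaluate each candidate model with 10-fold cross-validation.
customTrainControl <- trainControl(method = "cv", number = 10)

# Train a Random Forest that predicts mpg from all other mtcars columns.
carsRandomForestModelBuilder <- function() {
  return(
    train(
      mpg ~ .,
      data = mtcars,
      method = "rf",
      trControl = customTrainControl
    )
  )
}

# Build the model and print its resampling summary.
carsRandomForestModelBuilder()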
## Random Forest 
## 
## 32 samples
## 10 predictors
## 
## No pre-processing
## Resampling: Cross-Validated (10 fold) 
## Summary of sample sizes: 28, 29, 29, 28, 28, 30, ... 
## Resampling results across tuning parameters:
## 
##   mtry  RMSE      Rsquared   MAE     
##    2    2.364902  0.9390737  2.037945
##    6    2.227004  0.9623011  1.911614
##   10    2.307681  0.9635279  1.981048
## 
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was mtry = 6.
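
Once trained, the model can be asked for the fuel consumption of a car that is not in the dataset. The sketch below shows one way to do this with caret's `predict` method. The seed, the variable names `carsModel` and `hypotheticalCar`, and the parameter values of the car are assumptions chosen for illustration; the seed is not the one that produced the output above, so resampling figures will differ.

```r
library(caret)  # method = "rf" also requires the randomForest package

set.seed(2020)  # arbitrary seed (assumption), only to make reruns repeatable

# Same model specification as in the training step above.
carsModel <- train(
  mpg ~ .,
  data = mtcars,
  method = "rf",
  trControl = trainControl(method = "cv", number = 10)
)

# A hypothetical car: one row containing every predictor the model expects.
hypotheticalCar <- data.frame(
  cyl = 6, disp = 200, hp = 150, drat = 3.5, wt = 3.0,
  qsec = 17, vs = 0, am = 1, gear = 4, carb = 2
)

# Predicted fuel consumption in miles per gallon.
predictedMpg <- predict(carsModel, newdata = hypotheticalCar)
predictedMpg
```

Because a regression forest averages mpg values observed in training, the prediction always falls between the lowest and highest mpg in ‘mtcars’, however extreme the input values are.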