Aysegul Sonmez
March 25, 2018
This presentation is one half of the week 4 assignment of the Developing Data Products course from Coursera (https://www.coursera.org/learn/data-products).
It describes the second half of that assignment, a development project.
The presentation was generated using RStudio (https://www.rstudio.com) and the Slidify (http://slidify.org) framework.
The second half of the mentioned assignment was to develop a web application. The application was named MPG Prediction. An instance is up & running at https://aysegulzemnos.shinyapps.io/DataProductsShinyWebapp/.
This application is an example of building a web application with R and the Shiny framework.
In the application, a prediction model is built from a dataset of car parameters (HP, weight, gears, etc.). A user can then freely adjust the values in the UI to simulate the parameters of a hypothetical car and predict its fuel consumption in MPG.
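The prediction step described above can be sketched outside Shiny as follows. This is a minimal illustration, not the application's actual code: it assumes the caret and randomForest packages are installed, and the names `model`, `hypotheticalCar` and `predictedMpg`, the seed, and the parameter values of the car are all made up for the example.

```r
# Illustrative sketch; the names below are not taken from the application source.
library(caret)  # train(), trainControl(); method = "rf" requires randomForest

set.seed(2018)  # assumption: fixed seed so the resampling is repeatable

# Train a Random Forest on the built-in mtcars data with 10-fold cross-validation
model <- train(
  mpg ~ .,
  data = mtcars,
  method = "rf",
  trControl = trainControl(method = "cv", number = 10)
)

# A hypothetical car: one row holding every predictor the model was trained on
hypotheticalCar <- data.frame(
  cyl = 6, disp = 200, hp = 150, drat = 3.5, wt = 3.0,
  qsec = 17, vs = 0, am = 1, gear = 5, carb = 4
)

# Predicted fuel consumption (miles per gallon) for that car
predictedMpg <- predict(model, newdata = hypotheticalCar)
print(predictedMpg)
```

In a Shiny application, the values in `hypotheticalCar` would typically come from input widgets and the call to `predict()` would sit inside a reactive expression, so the prediction updates whenever the user changes a value.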
The source code for the application and for this presentation can be found at https://github.com/Aysegulzemnos/DataProductsWeek4Project. The application comprises 3 files: ui.R (UI), server.R (backend) and modelBuilding_source.R (Random Forest predictor).
The dataset used by the application is Motor Trend Car Road Tests (from now on ‘mtcars’). The data was extracted from the 1974 Motor Trend US magazine and comprises fuel consumption and 10 aspects of automobile design and performance for 32 automobiles (1973-74 models).
The data table output is generated dynamically based on the selected cylinders, transmission, displacement, horsepower and weight.
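One way such a dynamically filtered table could be produced is sketched below in base R. The helper `filterCars()` and its argument names are hypothetical and are not taken from the application's source. The ranges in the usage example are arbitrary.

```r
# Hypothetical helper: keep only the cars that match the selected values.
# cyl and am are sets of allowed values; the *Range arguments are c(min, max).
filterCars <- function(data, cyl, am, dispRange, hpRange, wtRange) {
  keep <- data$cyl %in% cyl &
    data$am %in% am &
    data$disp >= dispRange[1] & data$disp <= dispRange[2] &
    data$hp >= hpRange[1] & data$hp <= hpRange[2] &
    data$wt >= wtRange[1] & data$wt <= wtRange[2]
  data[keep, , drop = FALSE]
}

# Example: 4- or 6-cylinder cars with manual transmission (am = 1)
selected <- filterCars(
  mtcars,
  cyl = c(4, 6),
  am = 1,
  dispRange = c(70, 200),
  hpRange = c(50, 150),
  wtRange = c(1.5, 3)
)
print(selected[, c("mpg", "cyl", "am", "disp", "hp", "wt")])
```

In a Shiny server function, a call like this would usually be wrapped in a reactive expression and passed to a table output, so the table is recomputed each time a filter changes.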
Next, the dataset structure:
str(mtcars)
## 'data.frame': 32 obs. of 11 variables:
## $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
## $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
## $ disp: num 160 160 108 258 360 ...
## $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
## $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
## $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
## $ qsec: num 16.5 17 18.6 19.4 17 ...
## $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
## $ am : num 1 1 1 0 0 0 0 0 0 0 ...
## $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
## $ carb: num 4 4 1 1 2 1 4 2 2 4 ...
A Random Forest prediction model is trained on the ‘mtcars’ dataset. The goal of this model is to predict fuel consumption (the mpg variable) from the rest of the variables:
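library(caret)  # provides train() and trainControl(); method = "rf" also needs the randomForest package

# 10-fold cross-validation used to tune the model
customTrainControl <- trainControl(method = "cv", number = 10)

# Train a Random Forest that predicts mpg from all other mtcars variables
carsRandomForestModelBuilder <- function() {
  train(
    mpg ~ .,
    data = mtcars,
    method = "rf",
    trControl = customTrainControl
  )
}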
carsRandomForestModelBuilder()
## Random Forest
##
## 32 samples
## 10 predictors
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 28, 28, 29, 30, 29, 29, ...
## Resampling results across tuning parameters:
##
##   mtry  RMSE      Rsquared   MAE
##    2    2.258073  0.9279475  1.965923
##    6    2.166451  0.9102461  1.864282
##   10    2.181007  0.9143769  1.881482
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was mtry = 6.
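The selection rule stated in the output can be reproduced by hand from the resampling table. The sketch below copies the three rows shown above into a data frame (the name `results` is an assumption for illustration) and picks the row with the smallest RMSE.

```r
# Resampling results copied from the output above
results <- data.frame(
  mtry     = c(2, 6, 10),
  RMSE     = c(2.258073, 2.166451, 2.181007),
  Rsquared = c(0.9279475, 0.9102461, 0.9143769),
  MAE      = c(1.965923, 1.864282, 1.881482)
)

# The selection rule: choose the row with the smallest RMSE
best <- results[which.min(results$RMSE), ]
print(best$mtry)  # 6

# For comparison: the row with the largest Rsquared is a different one
bestByRsquared <- results[which.max(results$Rsquared), ]
print(bestByRsquared$mtry)  # 2
```

Note that the metrics do not agree here: mtry = 6 has the lowest RMSE and MAE, while mtry = 2 has the highest Rsquared. The selection follows RMSE, which is why the final model uses mtry = 6.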