Georgy Makarov
March 31, 2020
This application is part of the course project for the Coursera course Developing Data Products: http://www.coursera.org/learn/data-products/
The application predicts the class of a car with a random forest model, based on engine displacement, number of cylinders, city mileage, and highway mileage. The model was trained on the mpg dataset from the ggplot2 package: http://ggplot2.tidyverse.org/reference/mpg.html
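To make the inputs concrete, here is a minimal sketch of how the four predictors and the target could be pulled out of mpg. The object name `cars` is an illustrative assumption and does not come from the app's source.

```r
library(ggplot2)  # provides the mpg dataset

# Keep the four predictors named above plus the target column.
# `cars` is a hypothetical name used only for this sketch.
cars <- as.data.frame(mpg[, c("displ", "cyl", "cty", "hwy", "class")])

# A classification target should be a factor, not a character vector.
cars$class <- factor(cars$class)

str(cars)          # displ, cyl, cty, hwy, class
table(cars$class)  # number of cars per class
```

Converting `class` to a factor up front keeps the set of levels fixed, which matters later when predictions are compared with the true labels.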
This presentation is an R Presentation created in RStudio.
The Shiny app pitched by this presentation is at: http://georgymakarov.shinyapps.io/ddp_course_project/
The source code of the app is at: http://github.com/GeorgyMakarov/Shiny-car-predictor
Fuel efficiency decreases as engine displacement and the number of cylinders increase. The trend holds for every class, so these variables behave consistently across the dataset and can serve as predictors for classification.
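One way to check this relationship is to compute the correlation between displacement and highway mileage, first over the whole dataset and then within each class. This is only a sketch: the names `overall` and `by_class` are illustrative, and highway mileage is used here as one possible measure of fuel efficiency.

```r
library(ggplot2)  # provides the mpg dataset

cars <- as.data.frame(mpg)

# Correlation over all cars: a negative value means mileage falls
# as displacement grows.
overall <- cor(cars$displ, cars$hwy)

# The same correlation computed separately for every class.
by_class <- sapply(split(cars, cars$class),
                   function(d) cor(d$displ, d$hwy))

print(overall)
print(round(by_class, 2))
```

Classes with very few cars give unstable estimates, so the per-class values are best read as a rough indication rather than a precise measurement.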
Density plots show how the distribution of each variable shifts from one class to another.
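A density plot of this kind could be drawn as follows. This is a sketch for a single variable (highway mileage), not the app's plotting code; the transparency value and the labels are arbitrary choices.

```r
library(ggplot2)

# One density curve per class, overlaid and semi-transparent so that
# overlapping regions remain visible.
p <- ggplot(mpg, aes(x = hwy, fill = class)) +
  geom_density(alpha = 0.4) +
  labs(x = "Highway mileage (mpg)", y = "Density", fill = "Class")

print(p)
```

Replacing `hwy` with `cty`, `displ`, or `cyl` gives the corresponding plot for the other predictors.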
The model uses the random forest algorithm with 5-fold cross-validation.
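library(caret)  # provides trainControl(), train(), confusionMatrix()
# 5-fold cross-validation
control <- trainControl(method = "cv", number = 5)
set.seed(seed)  # seed is assumed to be defined elsewhere
# Random forest on all predictors in the training data
fit.rf <- train(class ~ ., data = training, method = "rf", metric = "Accuracy", trControl = control)
# Predict on the training data itself, so the accuracy below is in-sample
pred.rf <- predict(fit.rf, training)
conf_m <- confusionMatrix(pred.rf, training$class)
conf_m$overall["Accuracy"]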
Accuracy
0.8829787
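The accuracy above is measured on the same rows the model was trained on. As a complement, the sketch below holds out part of the data, evaluates on it, and then classifies one new car. The 80/20 split, the seed value 123, and the names `cars`, `testing`, `test_acc`, and `new_car` are assumptions made for illustration; the slides do not show how `training` was built. The randomForest package must be installed for `method = "rf"`.

```r
library(ggplot2)  # provides the mpg dataset
library(caret)    # createDataPartition(), train(), confusionMatrix()

cars <- as.data.frame(mpg[, c("displ", "cyl", "cty", "hwy", "class")])
cars$class <- factor(cars$class)

# Hypothetical seed and split ratio, chosen only for this sketch.
set.seed(123)
in_train <- createDataPartition(cars$class, p = 0.8, list = FALSE)
training <- cars[in_train, ]
testing  <- cars[-in_train, ]

# Same model specification as in the slide: random forest, 5-fold CV.
control <- trainControl(method = "cv", number = 5)
fit.rf <- train(class ~ ., data = training, method = "rf",
                metric = "Accuracy", trControl = control)

# Accuracy on rows the model has not seen during training.
pred.test <- predict(fit.rf, testing)
test_acc <- confusionMatrix(pred.test, testing$class)$overall["Accuracy"]
print(test_acc)

# Classify a single made-up car, as the app does for user input.
new_car <- data.frame(displ = 2.0, cyl = 4, cty = 21, hwy = 29)
new_class <- predict(fit.rf, new_car)
print(new_class)
```

Held-out accuracy is usually lower than in-sample accuracy, and the exact value depends on the seed and the split.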