Jesus Ramos
November 22, 2015
Can user ratings be used to predict business closure?
  review_count_change_year review_stars_change_year
1                       NA                       NA
2                        0                       -3
3                        2                        2
Note that we consider only the year-over-year changes in review counts and star ratings.
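As a minimal sketch of how such change columns could be derived, the base R snippet below takes the difference between consecutive years. The `yearly` data frame, its `year`, `review_count` and `review_stars` columns, and all values in it are hypothetical, chosen only so that the result has the same shape as the table above; the original feature-engineering code is not shown in this report.

```r
# Hypothetical yearly aggregates for a single business (made-up values).
yearly <- data.frame(
  year         = c(2012, 2013, 2014),
  review_count = c(5, 5, 7),
  review_stars = c(4, 1, 3)
)

# Year-over-year change: diff() returns n - 1 differences, so the first
# year, which has no predecessor, is padded with NA.
yearly$review_count_change_year <- c(NA, diff(yearly$review_count))
yearly$review_stars_change_year <- c(NA, diff(yearly$review_stars))

print(yearly[, c("review_count_change_year", "review_stars_change_year")])
```

With several businesses in one data frame, the same computation would have to be applied per business (for example with `ave()`), so that differences are never taken across two different businesses.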
The model does a decent job predicting the businesses that will remain open, but a poor job predicting those that have closed, as shown by the confusion matrix for the test set:
          Reference
Prediction FALSE  TRUE
     FALSE     1     0
     TRUE   2226 16008
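The summary statistics implied by this confusion matrix can be recomputed directly from its four counts. The sketch below uses base R only and assumes that TRUE (the business is open) is treated as the positive class; that choice is an assumption, and swapping the positive class swaps sensitivity and specificity.

```r
# Counts copied from the test-set confusion matrix above.
# Rows are predictions, columns are the reference (observed) labels.
# matrix() fills column by column: first the FALSE column, then TRUE.
cm <- matrix(c(1, 2226, 0, 16008), nrow = 2,
             dimnames = list(Prediction = c("FALSE", "TRUE"),
                             Reference  = c("FALSE", "TRUE")))

# Share of all test cases that were classified correctly.
accuracy <- sum(diag(cm)) / sum(cm)

# Assumption: TRUE (open) is the positive class.
sensitivity <- cm["TRUE", "TRUE"] / sum(cm[, "TRUE"])     # open found / all open
specificity <- cm["FALSE", "FALSE"] / sum(cm[, "FALSE"])  # closed found / all closed

# Accuracy of a trivial model that always predicts "open".
no_information_rate <- sum(cm[, "TRUE"]) / sum(cm)

print(round(c(accuracy = accuracy,
              sensitivity = sensitivity,
              specificity = specificity,
              no_information_rate = no_information_rate), 4))
```

Comparing `accuracy` with `no_information_rate` shows whether the model does any better than always predicting the majority class.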
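The ROC curve likewise shows no discriminative power, with an area under the curve of 0.5. Accuracy is about 88%, but this merely reflects the share of open businesses in the data: with open (TRUE) as the positive class, sensitivity is 100% while specificity is effectively 0% (only 1 of the 2,227 closed businesses in the test set is identified).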
Call:
plot.roc.default(x = rfModel$pred$obs[selectedIndices], predictor = rfModel$pred$mtry[selectedIndices])
Data: rfModel$pred$mtry[selectedIndices] in 5198 controls (rfModel$pred$obs[selectedIndices] FALSE) < 37352 cases (rfModel$pred$obs[selectedIndices] TRUE).
Area under the curve: 0.5
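Note that the call above passes the `mtry` column as the predictor. In caret's `pred` data frame that column holds the tuning-parameter value rather than a model score, and a predictor that is constant across the selected rows yields an AUC of exactly 0.5 regardless of how well the model ranks businesses. The base R sketch below illustrates this with a rank-based (Mann-Whitney) AUC; the function name `auc_rank`, the `prob_open` scores and all data values are hypothetical and are not taken from the original analysis.

```r
# Rank-based (Mann-Whitney) AUC: the probability that a randomly chosen
# positive case receives a higher score than a randomly chosen negative
# case, with ties counted as one half.
auc_rank <- function(obs, score) {
  r     <- rank(score)   # rank() assigns average ranks to tied scores
  n_pos <- sum(obs)
  n_neg <- sum(!obs)
  (sum(r[obs]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up observations and scores, for illustration only.
obs       <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE)
prob_open <- c(0.9, 0.8, 0.7, 0.6, 0.4, 0.2)  # hypothetical P(open)
constant  <- rep(2, length(obs))              # e.g. one fixed tuning value

auc_from_probs    <- auc_rank(obs, prob_open)
auc_from_constant <- auc_rank(obs, constant)

print(c(probabilities = auc_from_probs, constant = auc_from_constant))
```

Whether the model in this report would score above 0.5 when evaluated on its predicted class probabilities cannot be determined from the output shown here.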