Why is cross-validation better than a single train/test split?

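Cross-validation fits the model several times, each time holding out a different fold, so you get several estimates of out-of-sample error rather than just one. If all of your estimates give similar outputs, you can be more certain of the model’s accuracy. If they differ widely, the model does not perform consistently, which suggests a problem with it.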
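
To make the idea of "several estimates" concrete, here is a minimal base R sketch of k-fold cross-validation done by hand. It is not how caret implements it internally; the built-in mtcars data, the formula mpg ~ wt + hp, and the names fold_id and fold_rmse are assumptions chosen only for illustration.

```r
# Manual 5-fold cross-validation of a linear model in base R.
# mtcars and the formula mpg ~ wt + hp are illustrative choices only.
set.seed(42)
k <- 5
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

fold_rmse <- sapply(seq_len(k), function(i) {
  train_rows <- mtcars[fold_id != i, ]
  test_rows  <- mtcars[fold_id == i, ]
  fit  <- lm(mpg ~ wt + hp, data = train_rows)    # fit on k - 1 folds
  pred <- predict(fit, newdata = test_rows)       # predict the held-out fold
  sqrt(mean((test_rows$mpg - pred)^2))            # out-of-sample RMSE
})

fold_rmse        # one out-of-sample RMSE per fold
mean(fold_rmse)  # the cross-validated estimate
sd(fold_rmse)    # spread across folds: a small value means consistent performance
```

A single train/test split would give only one of these numbers, with no way to tell whether it was a lucky or unlucky draw.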

# Load caret for train() and ggplot2 for the diamonds dataset
library(caret)
library(ggplot2)

# Fit lm model using 10-fold CV: model
model <- train(
  price ~ ., diamonds,
  method = "lm",
  trControl = trainControl(
    method = "cv", number = 10,
    verboseIter = TRUE
  )
)
+ Fold01: parameter=none 
- Fold01: parameter=none 
+ Fold02: parameter=none 
- Fold02: parameter=none 
+ Fold03: parameter=none 
- Fold03: parameter=none 
+ Fold04: parameter=none 
- Fold04: parameter=none 
+ Fold05: parameter=none 
- Fold05: parameter=none 
+ Fold06: parameter=none 
- Fold06: parameter=none 
+ Fold07: parameter=none 
- Fold07: parameter=none 
+ Fold08: parameter=none 
- Fold08: parameter=none 
+ Fold09: parameter=none 
- Fold09: parameter=none 
+ Fold10: parameter=none 
- Fold10: parameter=none 
Aggregating results
Fitting final model on full training set
# Print model to console
print(model)
Linear Regression 

53940 samples
    9 predictor

No pre-processing
Resampling: Cross-Validated (10 fold) 
Summary of sample sizes: 48546, 48545, 48545, 48545, 48546, 48547, ... 
Resampling results:

  RMSE      Rsquared 
  1131.163  0.9197323

 
# Fit lm model using 5-fold CV: model
library(MASS)  # provides the Boston housing dataset
model <- train(
  medv ~ ., Boston,
  method = "lm",
  trControl = trainControl(
    method = "cv", number = 5,
    verboseIter = TRUE
  )
)

Why use repeated CV?

A single pass of k-fold cross-validation depends on how the rows happen to be assigned to folds. Repeated cross-validation runs the entire procedure several times, with a different random partition each time. This takes longer, but gives you many more out-of-sample datasets to look at and a more precise estimate of the test-set error.
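
As a rough illustration of what "repeating the entire procedure" means, the base R sketch below re-draws the fold assignment five times and collects every out-of-sample RMSE. The helper cv_rmse(), the mtcars data, and the formula are hypothetical stand-ins, not part of caret.

```r
# Repeated k-fold CV by hand: re-draw the fold assignment several times.
# mtcars, the formula, and cv_rmse() are illustrative, not from caret.
cv_rmse <- function(data, k) {
  # New random partition on every call
  fold_id <- sample(rep(seq_len(k), length.out = nrow(data)))
  sapply(seq_len(k), function(i) {
    fit  <- lm(mpg ~ wt + hp, data = data[fold_id != i, ])
    pred <- predict(fit, newdata = data[fold_id == i, ])
    sqrt(mean((data$mpg[fold_id == i] - pred)^2))
  })
}

set.seed(42)
# 5 repeats x 5 folds = 25 out-of-sample RMSE values
rmse_matrix <- replicate(5, cv_rmse(mtcars, k = 5))

dim(rmse_matrix)       # rows are folds, columns are repeats
colMeans(rmse_matrix)  # one cross-validated estimate per repeat
mean(rmse_matrix)      # overall repeated-CV estimate
```

Comparing the per-repeat means shows how much a single round of cross-validation can move simply because of the random partition.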

# Fit lm model using 5 x 5-fold CV: model
# repeats only takes effect with method = "repeatedcv"
library(MASS)  # provides the Boston housing dataset
model <- train(
  medv ~ ., Boston,
  method = "lm",
  trControl = trainControl(
    method = "repeatedcv", number = 5,
    repeats = 5, verboseIter = TRUE
  )
)

Predict on full dataset

# Predict on full Boston dataset
predict(model, Boston)
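
Because the final model is refit on the full training set, predictions on that same dataset are in-sample. The base R sketch below, which uses mtcars and an arbitrary formula as stand-ins, shows what such predictions look like and how to compute the in-sample RMSE from them.

```r
# In-sample predictions from a linear model fitted to the full dataset.
# mtcars and the formula are illustrative stand-ins.
fit  <- lm(mpg ~ wt + hp, data = mtcars)
pred <- predict(fit, newdata = mtcars)

# predict() returns one value per row of the data passed to it
length(pred) == nrow(mtcars)

# In-sample RMSE: error on the same rows that were used for fitting
in_sample_rmse <- sqrt(mean((mtcars$mpg - pred)^2))
in_sample_rmse
```

In-sample error is usually lower than the cross-validated estimate, so the cross-validated RMSE remains the better guide to performance on new data.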