How does the train/test ratio influence accuracy?

Course project

Anton Votinov
Coursera student

You can play with the App here.

The aim of the App is to show the relationship between train-to-test proportion and the accuracy of predictions.
The App allows you to choose the train-to-data proportion and a randomization seed.
The output of the App lets you examine the residuals of predictions made by a model fitted on the train data set.

[Figures: two drawings]

As you can see on the previous slide, the inputs are the following:

The outputs are:

If you increase the proportion to 0.875, sigma_sq increases too (to 22.92).
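To explore this kind of comparison outside the App, here is a minimal sketch that computes a test error for several proportions. The helper name testError, the seed value 1, and the use of the mean squared test residual as the accuracy measure are all assumptions made for illustration; the App's sigma_sq may be defined differently.

```r
# Hypothetical helper: mean squared test residual for a given split.
# This is an assumed accuracy measure, not necessarily the App's sigma_sq.
testError <- function(proportion, seed) {
  set.seed(seed)                                   # make the split reproducible
  n <- nrow(mtcars)
  trainRows <- sample(seq_len(n), round(proportion * n, 0))
  fit <- lm(mpg ~ disp, data = mtcars[trainRows, ])
  testData <- mtcars[-trainRows, ]
  mean((testData$mpg - predict(fit, newdata = testData))^2)
}

proportions <- c(0.25, 0.5, 0.75, 0.875)
errors <- sapply(proportions, testError, seed = 1)
data.frame(proportion = proportions, error = errors)
```

At a proportion of 0.875 only 4 of the 32 rows are left for testing, so the error depends heavily on which rows happen to land in the test set; trying several seeds gives a fairer picture.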

Here the proportion is set to 0.5. The code in server.R differs slightly: it uses input$proportion in place of proportion.

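proportion <- 1/2
# Number of rows that go into the train set (mtcars has 32 rows)
edge <- round(proportion * nrow(mtcars), 0)
# Randomly pick that many row indices
edge <- sample(seq_len(nrow(mtcars)), edge)
trainData <- mtcars[edge, ]   # sampled rows
testData <- mtcars[-edge, ]   # remaining rows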

Creating train and test data.
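The App also takes a randomization seed. Below is a minimal sketch of how a seed could make the split reproducible, assuming it is applied with set.seed() right before sampling. The helper name splitData and the seed value 42 are hypothetical and are not taken from server.R.

```r
# Hypothetical helper (not from server.R): reproducible train/test split of mtcars.
splitData <- function(proportion, seed) {
  set.seed(seed)                    # assumption: the seed is applied before sampling
  n <- nrow(mtcars)                 # 32 rows
  trainRows <- sample(seq_len(n), round(proportion * n, 0))
  list(train = mtcars[trainRows, ], test = mtcars[-trainRows, ])
}

first <- splitData(0.5, seed = 42)
second <- splitData(0.5, seed = 42)
identical(first$train, second$train)  # same seed, same split
nrow(first$train)                     # 16 rows for proportion 0.5
nrow(first$test)                      # the remaining 16 rows
```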

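# Linear model of mpg on disp, fitted on the train data only
fit <- lm(mpg ~ disp, data = trainData)
# Predictions for the rows held out as test data
predictedValues <- predict(fit, newdata = testData)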

Fitting the model and predicting on the test data to plot it later.
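The residuals that the App lets you examine can be sketched as the observed mpg minus the predicted mpg for each test row. The seed value, the name testResiduals, and the plot layout below are assumptions for illustration; the App's actual plotting code is not shown in the source.

```r
set.seed(1)  # hypothetical seed, only to make this sketch reproducible
proportion <- 1/2
trainRows <- sample(seq_len(nrow(mtcars)), round(proportion * nrow(mtcars), 0))
trainData <- mtcars[trainRows, ]
testData <- mtcars[-trainRows, ]

fit <- lm(mpg ~ disp, data = trainData)
predictedValues <- predict(fit, newdata = testData)

# Residual = observed mpg minus predicted mpg, one value per test row
testResiduals <- testData$mpg - predictedValues

# Residuals against the predictor, with a dashed zero reference line
plot(testData$disp, testResiduals, xlab = "disp", ylab = "residual (mpg)")
abline(h = 0, lty = 2)
```

Points far from the dashed line mark the test cars whose mpg the model predicts worst.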