Keras is the package that allows for creating deep neural learning models in R ggplot loaded for creating plots
library(keras)
library(ggplot2)
The boston housing tries to predict the house prices using various parameters like crime rate in the locality the test train split
dataset<-dataset_boston_housing()
c(c(train_data, train_targets), c(test_data, test_targets))%<-%dataset
str(train_data)
## num [1:404, 1:13] 1.2325 0.0218 4.8982 0.0396 3.6931 ...
str(test_data)
## num [1:102, 1:13] 18.0846 0.1233 0.055 1.2735 0.0715 ...
str(train_targets)
## num [1:404(1d)] 15.2 42.3 50 21.1 17.7 18.5 11.3 15.6 15.6 14.4 ...
data scaling is necessary for regression model beacuase various parameters can be measured in various units The training data mean and std deviation has to be used in test data as well. No assumptions can be made in the test data
mean<-apply(train_data, MARGIN = 2, mean)
std<-apply(train_data, 2, sd)
train_data<-scale(train_data, center = mean, scale=std)
test_data<-scale(test_data, center = mean, scale=std)
The following function uses Keras package to first create a dense layer (where all nodes in input layer are connected to all layers in the deeper layers) Smaller units allows for non-overfitting however under-fitting should be kept in mind
build_model<-function(){
model<-keras_model_sequential()%>%
layer_dense(units=64, activation='relu', input_shape = dim(train_data)[2])%>%
layer_dense(units=64, activation="relu")%>%
layer_dense(units = 1)
model%>%compile(
optimizer="rmsprop",
loss="mse",
metrics=c('mae')
)
}
The boston housing dataset is small hence. We use k fold cross validation -here 4 fold The following function genereates the MAE (mean absolute error) for each epoch (500) and stores in all_mae_history
k<-4
indices<-sample(1:nrow(train_data))
folds<-cut(indices, breaks = k, labels = F)
num_epochs <- 500
all_mae_histories <- NULL
for (i in 1:k) {
cat("processing fold #", i, "\n")
val_indices <- which(folds == i, arr.ind = TRUE)
val_data <- train_data[val_indices,]
val_targets <- train_targets[val_indices]
partial_train_data <- train_data[-val_indices,]
partial_train_targets <- train_targets[-val_indices]
model <- build_model()
history <- model %>% fit( partial_train_data, partial_train_targets,validation_data = list(val_data, val_targets), epochs = num_epochs, batch_size = 1, verbose = 0
)
mae_history <- history$metrics$val_mean_absolute_error
all_mae_histories <- rbind(all_mae_histories, mae_history)
}
## processing fold # 1
## processing fold # 2
## processing fold # 3
## processing fold # 4
The steep drop and relative pleauteaued phase of MAE is hypothesized the optimal epoch size
average_mae_history <- data.frame(
epoch = seq(1:ncol(all_mae_histories)),
validation_mae = apply(all_mae_histories, 2, mean)
)
ggplot(average_mae_history, aes(x = epoch, y = validation_mae)) + geom_line()
ggplot(average_mae_history, aes(x = epoch, y = validation_mae)) + geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
# Building a general model to fit on the test data Using the optimal epoch size for fitting onto the training data
model <- build_model()
model %>% fit(train_data, train_targets,
epochs = 80, batch_size = 16, verbose = 0)
result <- model %>% evaluate(test_data, test_targets)
result
## $loss
## [1] 19.47835
##
## $mean_absolute_error
## [1] 2.807683